Open Stack

Tue Nov 6 11:35:29 UTC 2012

Hi Adam,

(Breaking out acronyms for those following along...)

The problem with PKI and running an internal Certificate Authority (CA) type
service is that to do it effectively you really need a capable Registration
Authority (RA) backing the CA. The CA after all really just signs stuff, and
keeps track of what it signed so it can revoke it if required. The role of
the RA is to confirm that the submitter of a Certificate Signing Request
(CSR) has a right to request that the assertions made in the CSR (Common
Name, Extended Key Usage, etc) be cryptographically validated by the CA.
That is to say that the CA rubber stamps a decision made by an RA. 
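
A minimal sketch of that split, to make the division of labour concrete. The
class names, the policy table and the host names are all hypothetical, and no
real certificates are produced: the RA decides whether the submitter may make
the assertions in the CSR, and the CA only records what it signed so that it
can revoke it later.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CSR:
    submitter: str             # who is asking (hypothetical identifier)
    common_name: str           # Common Name the submitter wants asserted
    extended_key_usage: tuple  # e.g. ("serverAuth",)


@dataclass
class RegistrationAuthority:
    # Hypothetical policy: submitter -> (allowed common names, allowed usages)
    allowed: dict

    def approve(self, csr):
        policy = self.allowed.get(csr.submitter)
        if policy is None:
            return False
        names, usages = policy
        return (csr.common_name in names
                and set(csr.extended_key_usage) <= set(usages))


@dataclass
class CertificateAuthority:
    ra: RegistrationAuthority
    issued: dict = field(default_factory=dict)   # serial -> CSR
    revoked: set = field(default_factory=set)
    _serial: int = 0

    def sign(self, csr):
        # The CA rubber stamps the RA's decision; it makes none of its own.
        if not self.ra.approve(csr):
            raise PermissionError("RA rejected CSR from %s" % csr.submitter)
        self._serial += 1
        self.issued[self._serial] = csr
        return self._serial

    def revoke(self, serial):
        if serial not in self.issued:
            raise KeyError(serial)
        self.revoked.add(serial)

    def is_valid(self, serial):
        return serial in self.issued and serial not in self.revoked
```

Without the policy table the CA would sign whatever it was handed, which is
the failure mode described in the next paragraph.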

What we find is that this model often falls down terribly when used inside
networks. You tend to have all manner of devices using certificates that are
requested automagically from a CA, and care must be taken to ensure that this
is done in a valid way. This is OK if you trust your internal network, but
if you _really_ trusted your internal network you'd be unlikely to be
deploying SSL between components. In your email you describe that with
Keystone you'll be issuing an OTP that's used during the key-derivation
process, and that this is done on a per-user basis, not per-machine. That
makes sense, but of course what we have here is something quite different.

I believe there are two options here: role based signing and entity based
signing. Note that I'm not calling out any specific crypto here; there's a
bunch of stuff that could be used. In role based signing each role
(nova-host, nova-network, nova-etc) would share a signing key, and any system
receiving a signed message from a role can verify that the message
originated from a machine of that role. The alternative is entity based
signing, where each host is given its own signing key, and when a system
receives a message it can verify exactly which machine the message came
from. I'd be interested to know which method people feel is most
appropriate. The latter often appears to be more secure, but it's possible to
argue that this is outweighed by the extra overhead. Each machine in your
system now needs to know about the signing key of every other machine,
which makes key rolling/revocation painful, and it doesn't isolate
potential attacks. Most people who deploy at any scale use configuration
management to keep systems at the same patch level, which means that within
the datacentre you have a flat exploitation space: if I compromise one
nova-host I can compromise the rest with the same exploit. So protecting
the signing keys from individual compromise (by having entity based keys)
sometimes buys you very little.
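
To make the trade-off concrete, here is a rough sketch of the two schemes.
HMAC-SHA256 from the Python standard library is used purely as a stand-in
(I'm still not proposing any specific crypto), and the host names and key
tables are made up for illustration.

```python
import hashlib
import hmac
import os


def sign(key, message):
    # Stand-in primitive only: HMAC-SHA256 over the message body.
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical deployment: host name -> role
hosts = {
    "compute-01": "nova-host",
    "compute-02": "nova-host",
    "net-01": "nova-network",
}

# Role based: one key per role, shared by every host in that role.
role_keys = {role: os.urandom(32) for role in set(hosts.values())}

# Entity based: one key per host, so the key table grows with the fleet.
entity_keys = {host: os.urandom(32) for host in hosts}


def verify_role(role, message, tag):
    # Proves "some machine in this role sent it", nothing more specific.
    key = role_keys.get(role)
    return key is not None and verify(key, message, tag)


def verify_entity(host, message, tag):
    # Proves exactly which machine sent it.
    key = entity_keys.get(host)
    return key is not None and verify(key, message, tag)
```

With three hosts in two roles the receiver tracks two role keys or three
entity keys; at datacentre scale the second table is the one that makes
rolling and revocation painful.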

Your email raises a bunch of interesting points and personally I'd like to
see more discussion of the Keystone PKI stuff in another thread, perhaps
with a specific OSSG focus. 

> -----Original Message-----
> From: Adam Young [mailto:ayoung at redhat.com]
> I've been working on Certificate and Key Management as part of the Token
> API for a while now.  Here is my current design:
> 
>   Keystone is a trusted broker.  It maintains the identity of users in the
> OpenStack family of servers.  Services and Endpoints must be registered
> with Keystone.  Part of that registration process is either posting a
> certificate to use, or posting a CSR to be signed.
> 
> Certificates are owned by users, not services, not endpoints. When a PKI
> token is signed, the user's ID (UUID by default) is part of the signed
> document.  When a remote system validates the signature of a token, it
> fetches the certificate on demand from Keystone. There is the possibility
> that a user might be in the process of updating a certificate that is about
> to expire, and will have more than one certificate active.  It will likely
> be limited to two active at any point in time.
> 
> Keystone will have the ability to respond with a list of URLs for the
> certificates that can be fetched individually.  This allows the
> certificates themselves to be directly addressable resources, and to
> comply with the expected formats for X509 distribution over the web (PEM,
> etc).
> 
> There is also an X509 specific way to indicate what certificate signed a
> document without having to include the whole X509 certificate in the token
> itself.  We might use this as the primary method, but it does
> bypass the User association.   There is a second level of check that we
> need to provide, which is: was this user allowed to sign this token?
> With domains, it is possible that a user might try to sign a token for a
> resource in a different domain, and thus a signed token that is valid from
> an X509 perspective is not valid from an OpenStack perspective.
> 
> 
> I think all of these issues apply for signed RPC messages.
> 
> 
> 
> On 10/24/2012 08:48 AM, Daniel P. Berrange wrote:
> > On Wed, Oct 24, 2012 at 12:11:50PM +0000, Clark, Robert Graham wrote:
> >>> -----Original Message-----
> >> How many of the Nova API messages really need to be encrypted? If
> >> it's few then I'd suggest there are probably better approaches than
> >> encrypting everything.
> > Not many messages need to have encryption - minimally just those which
> > are transporting encryption keys, but once you add support for
> > encryption to the RPC layer the decision about whether to encrypt
> > everything or only some messages comes down to two core factors
> >
> >   - Whether it is a unicast or broadcast message
> >   - Whether the computational overhead is acceptable.
> >
> > Given that modern CPUs include AES support in hardware, the
> > computational overheads may not matter significantly. TBD based on
> > real world testing of course.
> >
> > For broadcast messages you have the problem of what key to encrypt with.
> > So it is probably simplest to just not encrypt broadcast messages.
> >
> >>            A big issue in nova is that various components have
> >> different attack surfaces, which makes them more or less likely to be
> >> compromised than other components. Having each component sign
> >> messages before transmission would deal with the most immediate
> >> threat: An attacker on your network can control _everything_ by
> >> watching the wire and injecting their own packets.
> > For the purposes of disk encryption, I've been working on the basis
> > that in Nova the compute nodes should be considered the least trusted
> > part of the system and their access / abilities must be tightly
> > controlled / limited as compared to the api / scheduler nodes.
> >
> >> The next step (that's really needed before encryption makes much
> >> sense) is to shoehorn some sort of RBAC into Nova so that when a
> >> component receives a signed message (signing proves it's from "X") it
> >> can check that "X" is in the right role to send that message. Once
> >> this is done we start having some pretty robust mechanisms for
> >> controlling data inside of Nova.
> > Yes, you absolutely need to have a way to identify what role a sender
> > is permitted to be in. For example, to be able to prevent a
> > compromised compute node from sending a 'boot VM' message - only the
> > scheduler can be allowed to send that particular message. And so on.
> >
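
The role check discussed above could look something like this sketch. It
assumes the signature has already been verified and the sender's role has
been derived from it; the role names, method names and the table itself are
hypothetical and not taken from Nova.

```python
# Hypothetical policy table: sender role -> RPC methods it may invoke.
ALLOWED_METHODS = {
    "scheduler": {"boot_vm"},
    "compute": {"report_state"},
}


def authorize(sender_role, method):
    # Unknown roles get an empty set, so they are denied by default.
    return method in ALLOWED_METHODS.get(sender_role, frozenset())


def dispatch(sender_role, method, handlers, **kwargs):
    # Refuse before any handler code runs if the role may not send this.
    if not authorize(sender_role, method):
        raise PermissionError(
            "role %r may not send %r" % (sender_role, method))
    return handlers[method](**kwargs)
```

A compromised compute node that can still sign valid messages is then
limited to the methods listed for its role.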
> >> Of the two things I've mentioned so far, signing and RBAC, signing
> >> probably seems to be the easiest but you need to decide how you're
> >> going to sign and what crypto primitives we're going to use (this is
> >> relevant to the encryption discussion too). At the design summit
> >> someone suggested sending an X.509 public certificate along with a
> >> signed call - this is kind of crazy, there's no way we can send
> >> around 1.5kb of extra data for each message. You could send the
> >> public key instead, but assuming we're using 2048-bit RSA that still
> >> means we're adding 256 bytes to every message. In either case you'd need a
> >> way to check the public component was valid and not revoked. I think
> >> there are probably some smart things that can be done by just sending
> >> an X.509 fingerprint or a keyID and having centrally distributed,
> >> cryptographically verifiable local lookup caches - (this is also a
> >> neat way of sidestepping any CRL verification issues : if the cert
> >> isn't in your trust cache, it's not trusted).
> > I would be interested to know the kind of data volumes / patterns seen
> > for the RPC service in the real world, before ruling out the use of
> > x509 certs. A supposed 1.5kb overhead per call needs to be put in
> > context of the overall traffic & capacity before we judge that it is
> > unacceptable overhead. Just sending an x509 fingerprint sounds like an
> > interesting idea, but I'm not sure what security implications that
> > would have. One of the things I try to remember is that I'm not Bruce
> > Schneier, so it is best to aim for tried-and-tested design patterns,
> > rather than try to invent some special/clear new security system ;-P
> >
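
For a sense of scale on the fingerprint idea: a SHA-256 fingerprint is 32
bytes per message, against roughly 1.5kb for the whole certificate or 256
bytes for the key. A rough sketch of the local lookup side follows. The class
and its methods are hypothetical, and the part that matters most (verifying
that the centrally distributed cache contents are themselves authentic) is
deliberately left out.

```python
import hashlib


def fingerprint(cert_der):
    # SHA-256 over the encoded certificate; always 32 bytes.
    return hashlib.sha256(cert_der).digest()


class TrustCache:
    """Local lookup table of trusted certificates, keyed by fingerprint."""

    def __init__(self, trusted_certs):
        self._by_fp = {fingerprint(c): c for c in trusted_certs}

    def lookup(self, fp):
        # Not in the cache means not trusted; there is no CRL to consult.
        return self._by_fp.get(fp)

    def remove(self, fp):
        # Revocation is just removal in the next cache distribution.
        self._by_fp.pop(fp, None)
```

The sender attaches only fingerprint(cert) to each message; the receiver
calls lookup() and rejects the message if it returns None.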
> > Personally I'm not so worried about the network traffic volumes.
> > The more important concerns to me are around the overall resilience &
> > computational scalability of the system. In particular making sure you
> > don't have to do something crazy like check keystone (or another
> > centralized auth service) to validate every single RPC call received.
> > Also trying to avoid putting any significant administrative setup or
> > ongoing burden on deploying openstack; retaining the flexibility to
> > quickly re-configure the location of nova services, and the speed with
> > which you can react to a hostile attack and isolate compromised
> > services.
> >
> >> At the summit there seemed to be lots of confusion over how this
> >> would actually work and a whole bunch of frankly quite worrying ideas
> >> were being thrown around. This is part of the reason that the
> >> OpenStack Security Group was formed. I can't wait to see your design
> >> document; I hope that it blows away all of the concerns I've listed
> >> above but if it doesn't perhaps you would be open to working with the
> >> OSSG to lock it down or even joining the OSSG if you're "security
> >> inclined" generally. The OSSG is a group of stackers who come from
> >> good security backgrounds and want to help developers get security
> >> right first time around by providing review, advice and consultation
> >> https://launchpad.net/~openstack-ossg
> > Yes, I'm already in the OSSG team.
> >
> > One of the reasons I've not raised any discussions thus far is that
> > I've not finished writing up my ideas in a form that's presentable for
> > discussion. I only mention it today since the topic was brought up
> > recently at the Summit (which I was unable to attend) and I want to
> > make sure all interested parties know about each other's ideas / work
> > to avoid duplication of effort.
> >
> > Regards,
> > Daniel
> 
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

[openstack-dev] Secure RPC