[openstack-dev] Volume Encryption

Caitlin Bestler caitlin.bestler at nexenta.com
Fri Feb 22 19:03:07 UTC 2013


On 2/21/2013 5:26 PM, Bhandaru, Malini K wrote:
>
> Hello Caitlin!
>
> I have been pondering your comment, and some issues pertaining to
> horizontal scaling for high availability. Please let me know if I am
> missing something in your suggestion of storage nodes saving
> encryption keys.
>
> Say on compute-node-1, we have a VM and a persistent volume encrypted
> with key K1; the user is done, the volume is detached, etc.
>
> Later the VM may land on compute-node-2 and key K1 needs to be accessed.
>
> A single copy of the key would mean danger of key loss. We would also
> have to search for the key should the VM and volume host differ from
> the prior run (in this case compute-node-2 instead of compute-node-1).
> Even snapshots may reside on different storage media, which raises the
> question of whether we use the same key-string-id or a copy of the
> key-string with a new id.
>
> If we elect to store the encryption keys on the storage servers
> themselves in secure lockboxes, we would need to replicate the keys
> on other peer storage servers to ensure access. Mirroring, like
> Swift replicas.
>
> This then becomes a Swift-like storage mechanism, with the keys
> encrypted using the storage server's master key.
>
> I can see a single master key being distributed to the members of the
> storage cluster, so that they can decrypt a key once they retrieve it
> from the key manager (thinking of it as a single entity even if the
> data is replicated in multiple places).
>
> With the service endpoint, aka storage node (Cinder/Glance/Swift),
> being responsible for decrypting the key-string, which happens
> locally, the communication between key-manager and service-endpoint
> becomes a less valuable target, and the data flying by is less
> vulnerable.
>
> A TPM could be used to verify that a storage endpoint is legitimate
> (identity and platform integrity) before it could access the master key.
>
> We could have separate master keys for Cinder/Glance/Swift.
>
> To get the protection of a double key, like a bank safe deposit locker,
> the actual encryption key could be a combination of the
> domain/account/user-specific key and the per-entity key. That key too
> would reside in the key manager. The service endpoint could access it
> through delegation (here I am having trouble with which key to use to
> encrypt it, since different services would be using the key). While
> stored it could be encrypted with the key manager's master key or
> Keystone's master key, but then it would have to be passed along to the
> service after decryption (vulnerable while in transit).
>
> Caching of keys is possible, to reduce network traffic and latency,
> with lifetime equal to the token lifetime and the usual cache space
> reclamation policies.
>

In my evaluation there are only two types of storage encryption that
make any sense: true end-to-end encryption and storage server encryption.

There are a lot of merits to true end-to-end encryption. It provides
complete protection for the data, protection that is fully under the
user's control. The Service Provider cannot be forced to divulge the
content of user data, for the simple reason that they literally do
not know how to decrypt any of it.

The challenge of true end-to-end encryption is that it is fully under
the user's control. If the user forgets the passphrase protecting
their stored keys, the data is lost. A solution is needed that
reliably stores the keys independently of the service provider, and
does so in a way that is almost invisible to the user while still
being under their control.

While I think end-to-end is a great solution, I think it can only be
enabled by the major OS vendors (Apple, Google, Linux, Microsoft).
OpenStack is not sufficiently in the true "ends" of "end-to-end".

The goals of storage server encryption are more modest -- to prevent
the theft of data drives, or even whole servers, from exposing any
data. The user is not concerned with how many keys are used, or which
keys are used, just that the data is secure.

The specific concern I have relates to the efficiency of replication,
local caching, taking snapshots and cloning from snapshots.

Storage-server-controlled lockboxes allow encryption/decryption to be
done efficiently by the storage servers (in hardware and/or native
kernel code, rather than from user-mode Python code) and prevent
encryption from slowing down basic object/volume manipulation
commands such as replication, snapshotting and cloning.

This would be done by having the storage server that creates an asset
(such as a Cinder volume or a Swift partition) create a Key ID and a
secret key. The secret key is not stored persistently anywhere other
than in a secure lockbox, preferably on a TPM but definitely on a
different device than the content being referenced.
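
As a minimal sketch of what that could look like (all names here are
hypothetical, not an existing OpenStack API), in Python:

    import os
    import uuid

    class Lockbox:
        """Hypothetical lockbox holding secret keys, indexed by Key ID.

        A real lockbox would live on a TPM, or at least on a different
        device than the encrypted content; a plain dict stands in here.
        """
        def __init__(self):
            self._keys = {}    # key_id -> secret key bytes (never exported)
            self._assets = {}  # asset_id -> key_id

        def create_asset_key(self, asset_id):
            """Mint a Key ID and secret key for a newly created asset."""
            key_id = str(uuid.uuid4())
            self._keys[key_id] = os.urandom(32)  # 256-bit data key
            self._assets[asset_id] = key_id
            return key_id  # only the Key ID ever leaves the lockbox

        def key_id_for(self, asset_id):
            return self._assets[asset_id]

        def bind(self, asset_id, key_id):
            """Associate an existing Key ID with another asset."""
            self._assets[asset_id] = key_id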

When a derived object is created (such as a snapshot or a clone, or
when the asset is replicated), the same Key ID is used for the new
asset. If the asset is being replicated, then a secure transfer of
the secret key must be arranged between the origin lockbox and the
destination lockbox.
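
Continuing the hypothetical Lockbox sketch above, Key ID reuse for a
derived object is just a new binding; no data blocks are re-encrypted:

    def derive_asset(lockbox, parent_asset_id, new_asset_id):
        # Snapshot, clone or replica: reuse the parent's Key ID so the
        # encrypted blocks can be copied or referenced verbatim.
        key_id = lockbox.key_id_for(parent_asset_id)
        lockbox.bind(new_asset_id, key_id)
        return key_id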

The role of OpenStack should be limited to orchestrating and
standardizing those transfers.

First, the transferred keys should be signed by the origin lockbox and
encrypted specifically for the destination lockbox. This needs to be
storage-grade encryption, not the flimsy encryption used for ephemeral
transfer protocols such as TLS/SSL/HTTPS/IPSEC. Encryption for
transport protocols does not have to worry about being cracked in a
month, because those protocols rekey more frequently than that. A
stolen disk drive that yields its secrets after a month is a major
problem. Using Keystone as a central repository for the keys does not
make them safe; it provides a one-stop convenience store for the
would-be attacker.
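
A sketch of such a transfer envelope, assuming each lockbox holds a
long-lived RSA key pair (the algorithm choices here are illustrative,
not a vetted design), using the pyca/cryptography package:

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding

    _PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                       salt_length=padding.PSS.MAX_LENGTH)
    _OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                         algorithm=hashes.SHA256(), label=None)

    def wrap_key(secret_key, origin_priv, dest_pub):
        # Sign so the destination can verify the origin lockbox...
        signature = origin_priv.sign(secret_key, _PSS, hashes.SHA256())
        # ...and encrypt so only the destination lockbox can read it.
        return dest_pub.encrypt(secret_key, _OAEP), signature

    def unwrap_key(ciphertext, signature, origin_pub, dest_priv):
        secret_key = dest_priv.decrypt(ciphertext, _OAEP)
        # Raises InvalidSignature if the envelope was forged.
        origin_pub.verify(signature, secret_key, _PSS, hashes.SHA256())
        return secret_key

A 32-byte key fits comfortably under OAEP's size limit for a 2048-bit
or larger RSA key; a production design would also bind the Key ID and
the asset identity into the signed payload.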

As for the number of replicas of the keys, the keys are stored in as 
many locations as the
asset itself is stored. The specific storage service (Swift or Cinder) 
should already have
a solution for ensuring the appropriate level of protection against loss 
of a single server.

One specific form of replication that this enables is partial caching
of assets near the compute node. Once the relevant secret key is
shared between the local storage server and the persistent central
server, updates are posted first to the local storage server, which
asynchronously forwards them to the central server, while the local
storage server keeps only the most recently referenced subset of the
asset(s). This can provide effectively local storage (in terms of
performance) with the capacity of a central server. A typical compute
node owns a lot more storage than it will reference in the next hour.
Using local SSD storage for data that will not be referenced today is
wasteful; a unified view of local and central storage is something
that storage vendors can provide.
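
In caching terms this is a write-back cache with LRU retention. A toy
sketch (hypothetical names; a real implementation would need the
crash-safety and ordering guarantees this ignores):

    import collections
    import queue
    import threading

    class WriteBackCache:
        def __init__(self, central_store, capacity_blocks):
            self._local = collections.OrderedDict()  # block_id -> data (LRU)
            self._central = central_store            # has get()/put()
            self._capacity = capacity_blocks
            self._dirty = queue.Queue()
            threading.Thread(target=self._flush_loop, daemon=True).start()

        def write(self, block_id, data):
            self._cache(block_id, data)
            self._dirty.put((block_id, data))  # forwarded asynchronously

        def read(self, block_id):
            if block_id not in self._local:
                self._cache(block_id, self._central.get(block_id))  # miss
            self._local.move_to_end(block_id)
            return self._local[block_id]

        def _cache(self, block_id, data):
            self._local[block_id] = data
            self._local.move_to_end(block_id)
            while len(self._local) > self._capacity:
                self._local.popitem(last=False)  # evict least recently used

        def _flush_loop(self):
            while True:  # write-back: drain updates to the central server
                block_id, data = self._dirty.get()
                self._central.put(block_id, data)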

Self-encrypting drives are also a wonderful solution for offloading
work from the CPUs, but they need vendor-independent control over
their keys in order to be a scalable enterprise solution. OpenStack
could provide the guidance that would enable self-encrypting drives
to be an acceptable option.

