[operators][cinder][nova][glance] possible data loss situation (bug 1852106)
Brian Rosmaita
rosmaita.fossdev at gmail.com
Tue Feb 18 22:32:38 UTC 2020
tl;dr: If you are running the OpenStack Train release, add the following
to the [DEFAULT]non_inheritable_image_properties configuration option
[0] in your nova.conf:
cinder_encryption_key_id,cinder_encryption_key_deletion_policy
About Your nova.conf
====================
- if you already have a value set for non_inheritable_image_properties,
add the above to what you currently have
- if non_inheritable_image_properties is *not* set in your current
nova.conf, you must set its value to include BOTH the above properties
AND the default values (which you can see when you generate a sample
nova configuration file [1])
NOTE: At a minimum, the non_inheritable_image_properties list should
contain:
- the properties used for image-signature-validation (these are not
transferable from image to image):
* img_signature_hash_method
* img_signature
* img_signature_key_type
* img_signature_certificate_uuid
- the properties used to manage the keys for images of cinder encrypted
volumes:
* cinder_encryption_key_id
* cinder_encryption_key_deletion_policy
- review the documentation to determine whether you want to include the
other default properties:
* cache_in_nova
* bittorrent
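
As a rough illustration of the two cases above (option already set vs.
option not set), the following Python sketch builds the value to put in
nova.conf. The helper name merged_setting and the constant names are
hypothetical, and the default list is simply copied from this message;
check it against a generated sample nova.conf before relying on it.

```python
# Defaults as listed in this message for Train (an assumption; verify against
# a generated sample nova.conf for your own deployment).
TRAIN_DEFAULTS = [
    "cache_in_nova",
    "bittorrent",
    "img_signature_hash_method",
    "img_signature",
    "img_signature_key_type",
    "img_signature_certificate_uuid",
]

# The two properties this message asks operators to add.
CINDER_KEY_PROPERTIES = [
    "cinder_encryption_key_id",
    "cinder_encryption_key_deletion_policy",
]


def merged_setting(current_value=None):
    """Return a comma-separated option value that includes the cinder key
    properties.

    current_value is the existing option value from nova.conf, or None when
    the option is not set (the defaults above are then used as the base).
    """
    if current_value is None:
        base = list(TRAIN_DEFAULTS)
    else:
        base = [p.strip() for p in current_value.split(",") if p.strip()]
    for prop in CINDER_KEY_PROPERTIES:
        if prop not in base:
            base.append(prop)
    return ",".join(base)


# Print a fragment for the case where the option is currently unset.
print("[DEFAULT]")
print("non_inheritable_image_properties = " + merged_setting())
```

Passing an existing value, e.g. merged_setting("cache_in_nova,bittorrent"),
keeps what is already there and appends only the properties that are
missing.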
Details
=======
This issue is being tracked as Launchpad bug 1852106 [2].
This is probably a rare situation, because the issue occurs only if
all of the following happen:
(0) the deployment is running the OpenStack Train release (or code from
master, i.e. Ussuri development)
(1) cinder_encryption_key_id and cinder_encryption_key_deletion_policy
are NOT included in the non_inheritable_image_properties setting in
nova.conf (which they aren't, by default)
(2) a user has created a volume of an encrypted volume-type in the Block
Storage service (cinder). Call this Volume-1
(3) using the Block Storage service, the user has uploaded the encrypted
volume as an image to the Image service (glance). Call this Image-1
(4) using the Compute service (nova), the user has attempted to directly
boot a server from the image. (Note: this is an unsupported action; the
supported workflow is to use the image to boot-from-volume.)
(5) although (4) is an unsupported action, it currently results in a
server that is in status ACTIVE but unusable, because the operating
system can't be found
(6) using the Compute service, the user requests the createImage action
on the unusable (yet ACTIVE) server, resulting in Image-2
(7) using the Image service, the user deletes Image-2 (which has
inherited the cinder_encryption_key_* properties from Image-1). Upon
this deletion the encryption key is deleted as well, which renders
Image-1 non-decryptable, so it can no longer be used in the normal
boot-from-volume workflow
NOTE 1: the cinder_encryption_key_deletion_policy image property was
introduced in Train. In pre-Train releases, deleting the useless
Image-2 in step (7) does NOT result in encryption key deletion.
NOTE 2: Volume-1 created in step (2) has a *different* encryption key ID
than the one associated with Image-1. Thus, even in the scenario where
Image-1 becomes non-decryptable, Volume-1 is not affected.
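
To make the sequence concrete, here is a small, self-contained Python
model of steps (3), (6) and (7). It is a simplified sketch, not Nova or
Glance code: the function names are hypothetical, the key store is just
a set, and the policy value 'on_image_deletion' is an assumption (it is
not stated in this message).

```python
# Property names are taken from this message. The policy value
# 'on_image_deletion' is an assumption and is not stated in the message.
SIGNATURE_PROPS = {
    "img_signature_hash_method",
    "img_signature",
    "img_signature_key_type",
    "img_signature_certificate_uuid",
}
TRAIN_DEFAULTS = {"cache_in_nova", "bittorrent"} | SIGNATURE_PROPS
CINDER_KEY_PROPS = {
    "cinder_encryption_key_id",
    "cinder_encryption_key_deletion_policy",
}


def snapshot_properties(source_props, non_inheritable):
    """Model createImage: copy properties, skipping non-inheritable ones."""
    return {k: v for k, v in source_props.items() if k not in non_inheritable}


def delete_image(image_props, key_store):
    """Model image deletion: the referenced key is removed from the key
    store when the image's deletion policy asks for it."""
    policy = image_props.get("cinder_encryption_key_deletion_policy")
    if policy == "on_image_deletion":
        key_store.discard(image_props.get("cinder_encryption_key_id"))


def run_scenario(non_inheritable):
    """Walk steps (3), (6) and (7); return True if Image-1's key survives."""
    key_store = {"key-for-image-1"}
    image_1 = {  # step (3): image uploaded from the encrypted volume
        "cinder_encryption_key_id": "key-for-image-1",
        "cinder_encryption_key_deletion_policy": "on_image_deletion",
    }
    image_2 = snapshot_properties(image_1, non_inheritable)  # step (6)
    delete_image(image_2, key_store)  # step (7)
    return "key-for-image-1" in key_store


print("Train defaults, key survives:", run_scenario(TRAIN_DEFAULTS))
print("With both properties added, key survives:",
      run_scenario(TRAIN_DEFAULTS | CINDER_KEY_PROPS))
```

In this model the only thing that changes the outcome is whether the two
cinder_encryption_key_* properties are in the non-inheritable set when
Image-2 is created.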
Workaround
==========
When cinder_encryption_key_id,cinder_encryption_key_deletion_policy are
added to the non_inheritable_image_properties setting in nova.conf, the
useless Image-2 created in step (6) above will not have the image
properties on it that enable Glance to delete the encryption key still
in use by Image-1. This does not, however, protect images for which
steps (4)-(6) have been performed before the deployment of this workaround.
The safest way to deal with images created before the workaround is
deployed is to remove the cinder_encryption_key_deletion_policy image
property from any image that has it (or to change its value to
'do_not_delete'). While it is possible to use other image properties to
identify images created by Nova as opposed to images created by Cinder,
this is not guaranteed to be reliable because image properties may have
been modified or removed by the image owner.
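
A minimal sketch of that cleanup pass, in Python, might look like the
following. It works on plain dictionaries standing in for image records
(with a hypothetical "id" field); how the records are fetched from and
written back to the Image service is deliberately left out, and the
function names are made up for illustration.

```python
POLICY = "cinder_encryption_key_deletion_policy"
SAFE_VALUE = "do_not_delete"


def images_needing_attention(images):
    """Return the IDs of image records that carry the deletion-policy
    property with any value other than 'do_not_delete'."""
    return [
        img["id"]
        for img in images
        if POLICY in img and img[POLICY] != SAFE_VALUE
    ]


def neutralized(image):
    """Return a copy of an image record with the policy set to
    'do_not_delete'; the input record is left unchanged."""
    fixed = dict(image)
    fixed[POLICY] = SAFE_VALUE
    return fixed


# Hypothetical records, for illustration only.
records = [
    {"id": "image-a", POLICY: "some_other_value"},
    {"id": "image-b", POLICY: SAFE_VALUE},
    {"id": "image-c"},
]
for image_id in images_needing_attention(records):
    print("needs review:", image_id)
```

The selection is intentionally based only on the presence and value of
the policy property, not on other properties that might hint at whether
Nova or Cinder created the image, since those may have been changed by
the image owner.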
Proposed Longer-term Fixes
==========================
- In the Ussuri release, the unsupported action in step (4) above will
result in an HTTP 400 response rather than an ACTIVE yet unusable
server. Hence it will no longer be possible to create the image of the
unusable server that causes the issue. [3]
- Additionally, given that the image properties associated with cinder
encrypted volumes and image signature validation are specific to a
single image and should not be inherited by server snapshots under any
circumstances, in Ussuri these "absolutely non-inheritable image
properties" will no longer be required to appear in the
non_inheritable_image_properties configuration setting in order to
prevent them from being propagated to server snapshots. [4]
References
==========
[0] https://docs.openstack.org/nova/train/configuration/config.html#DEFAULT.non_inheritable_image_properties
[1] https://docs.openstack.org/nova/train/configuration/sample-config.html
[2] https://bugs.launchpad.net/nova/+bug/1852106
[3] https://review.opendev.org/#/c/707738/
[4] https://review.opendev.org/#/c/708126/