[operators][cinder][nova][glance] possible data loss situation (bug 1852106)

Brian Rosmaita rosmaita.fossdev at gmail.com
Tue Feb 18 22:32:38 UTC 2020

tl;dr: If you are running the OpenStack Train release, add the following 
to the [DEFAULT]non_inheritable_image_properties configuration option 
[0] in your nova.conf:

    cinder_encryption_key_id,cinder_encryption_key_deletion_policy

About Your nova.conf
- if you already have a value set for non_inheritable_image_properties,
   add the above to what you currently have

- if non_inheritable_image_properties is *not* set in your current
   nova.conf, you must set its value to include BOTH the above properties
   AND the default values (which you can see when you generate a sample
   nova configuration file [1])

NOTE: At a minimum, the non_inheritable_image_properties list should 
contain:
- the properties used for image-signature-validation (these are not 
transferable from image to image):
   * img_signature_hash_method
   * img_signature
   * img_signature_key_type
   * img_signature_certificate_uuid
- the properties used to manage the keys for images of cinder encrypted 
   volumes:
   * cinder_encryption_key_id
   * cinder_encryption_key_deletion_policy
- optionally, the other default properties (review the documentation to 
   determine whether you want to include them):
   * cache_in_nova
   * bittorrent
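
The two cases described under "About Your nova.conf" can be sketched as a 
small helper.  This is only an illustration, not part of nova: the function 
name merged_value is made up here, and the default list is simply the set 
of properties named in the NOTE above, so check it against the sample 
configuration file [1] for your deployment before relying on it.

```python
# Properties this message asks Train operators to add.
CINDER_KEY_PROPS = [
    "cinder_encryption_key_id",
    "cinder_encryption_key_deletion_policy",
]

# Default properties as listed in the NOTE above (assumed to match the
# Train defaults; verify against the generated sample config).
LISTED_DEFAULTS = [
    "cache_in_nova",
    "bittorrent",
    "img_signature_hash_method",
    "img_signature",
    "img_signature_key_type",
    "img_signature_certificate_uuid",
]


def merged_value(current=None):
    """Return the comma-separated value for non_inheritable_image_properties.

    current is the value already set in nova.conf, or None if the option
    is not set (in which case the listed defaults are used as the base).
    """
    if current is None:
        names = list(LISTED_DEFAULTS)
    else:
        names = [n.strip() for n in current.split(",") if n.strip()]
    for prop in CINDER_KEY_PROPS:
        if prop not in names:
            names.append(prop)
    return ",".join(names)


# Option not set: defaults plus the two cinder properties.
print(merged_value())
# Option already set: existing entries are kept, missing ones appended.
print(merged_value("my_custom_prop, cinder_encryption_key_id"))
```

The resulting string is what would go after 
"non_inheritable_image_properties =" in the [DEFAULT] section.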

This issue is being tracked as Launchpad bug 1852106 [2].

This is probably a low-occurrence situation, because in order for the 
issue to occur, all of the following must happen:

(0) using the OpenStack Train release (or code from master (Ussuri 
development))

(1) cinder_encryption_key_id and cinder_encryption_key_deletion_policy 
are NOT included in the non_inheritable_image_properties setting in 
nova.conf (which they aren't, by default)

(2) a user has created a volume of an encrypted volume-type in the Block 
Storage service (cinder).  Call this Volume-1

(3) using the Block Storage service, the user has uploaded the encrypted 
volume as an image to the Image service (glance).  Call this Image-1

(4) using the Compute service (nova), the user has attempted to directly 
boot a server from the image.  (Note: this is an unsupported action; the 
supported workflow is to use the image to boot-from-volume.)

(5) although (4) is an unsupported action, it currently results in a 
server that is in status ACTIVE but is unusable because the operating 
system can't be found

(6) using the Compute service, the user requests the createImage action 
on the unusable (yet ACTIVE) server, resulting in Image-2

(7) using the Image service, the user deletes Image-2 (which has 
inherited the cinder_encryption_key_* properties from Image-1) upon 
which the encryption key is deleted, thereby rendering Image-1 
non-decryptable so that it can no longer be used in the normal 
boot-from-volume workflow

NOTE 1: the cinder_encryption_key_deletion_policy image property was 
introduced in Train.  In pre-Train releases, deleting the useless 
Image-2 in step (7) does NOT result in encryption key deletion.

NOTE 2: Volume-1 created in step (2) has a *different* encryption key ID 
than the one associated with Image-1.  Thus, even in the scenario where 
Image-1 becomes non-decryptable, Volume-1 is not affected.

When cinder_encryption_key_id and cinder_encryption_key_deletion_policy are 
added to the non_inheritable_image_properties setting in nova.conf, the 
useless Image-2 created in step (6) above will not have the image 
properties on it that enable Glance to delete the encryption key still 
in use by Image-1.  This does not, however, protect images for which 
steps (4)-(6) have been performed before the deployment of this workaround.
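
As a rough mental model of why the workaround helps, the snapshot in step 
(6) can be thought of as copying the source image's properties minus 
anything in the non-inheritable list.  The sketch below is a simplified 
model, not nova's actual code: snapshot_properties, the sample property 
values, and the 'on_image_deletion' policy value are assumptions made for 
illustration.

```python
def snapshot_properties(source_props, non_inheritable):
    """Model of inheritance: copy every property not in the blocked list."""
    blocked = set(non_inheritable)
    return {k: v for k, v in source_props.items() if k not in blocked}


# Hypothetical properties on Image-1 (values are illustrative only).
image_1 = {
    "cinder_encryption_key_id": "key-1",
    "cinder_encryption_key_deletion_policy": "on_image_deletion",  # assumed value
    "os_distro": "ubuntu",
}

# Setting without the two cinder properties, as in condition (1).
without_workaround = ["cache_in_nova", "bittorrent"]
# Setting with the workaround applied.
with_workaround = without_workaround + [
    "cinder_encryption_key_id",
    "cinder_encryption_key_deletion_policy",
]

image_2_before = snapshot_properties(image_1, without_workaround)
image_2_after = snapshot_properties(image_1, with_workaround)

# Before the workaround, Image-2 carries Image-1's key ID and policy.
print("cinder_encryption_key_id" in image_2_before)
# After the workaround, only unrelated properties are inherited.
print(image_2_after)
```

In this model Image-2 no longer references Image-1's key once the 
workaround is in place, so deleting Image-2 gives Glance nothing to act on.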

The safest way to deal with images created before the workaround is 
deployed is to remove the cinder_encryption_key_deletion_policy image 
property from any image that has it (or to change its value to 
'do_not_delete').  While it is possible to use other image properties to 
identify images created by Nova as opposed to images created by Cinder, 
this is not guaranteed to be reliable because image properties may have 
been modified or removed by the image owner.
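
A minimal sketch of that cleanup logic, working on plain dictionaries 
rather than a live cloud: the image record shape ({"id": ..., 
"properties": {...}}) and the function names are hypothetical, and fetching 
images from Glance and writing the changes back is left to whatever client 
tooling you already use.

```python
POLICY = "cinder_encryption_key_deletion_policy"
SAFE_VALUE = "do_not_delete"


def images_needing_attention(images):
    """Return IDs of images that have the policy set to anything but SAFE_VALUE."""
    flagged = []
    for image in images:
        props = image.get("properties", {})
        if POLICY in props and props[POLICY] != SAFE_VALUE:
            flagged.append(image["id"])
    return flagged


def neutralized(properties, remove=False):
    """Return a copy of properties with the policy removed or set to SAFE_VALUE."""
    result = dict(properties)
    if POLICY in result:
        if remove:
            del result[POLICY]
        else:
            result[POLICY] = SAFE_VALUE
    return result


# Hypothetical image records for illustration.
images = [
    {"id": "img-a", "properties": {POLICY: "on_image_deletion"}},  # assumed value
    {"id": "img-b", "properties": {POLICY: "do_not_delete"}},
    {"id": "img-c", "properties": {}},
]

print(images_needing_attention(images))
print(neutralized(images[0]["properties"]))
print(neutralized(images[0]["properties"], remove=True))
```

Note that this deliberately treats every image carrying the property the 
same way, since telling Nova-created images apart from Cinder-created ones 
is not reliable.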

Proposed Longer-term Fixes
- In the Ussuri release, the unsupported action in step (4) above will 
result in a 400 response rather than an active yet unusable server.  Hence it 
will no longer be possible to create the image of the unusable server 
that causes the issue. [3]

- Additionally, given that the image properties associated with cinder 
encrypted volumes and image signature validation are specific to a 
single image and should not be inherited by server snapshots under any 
circumstances, in Ussuri these "absolutely non-inheritable image 
properties" will no longer be required to appear in the 
non_inheritable_image_properties configuration setting in order to 
prevent them from being propagated to server snapshots. [4]

[1] https://docs.openstack.org/nova/train/configuration/sample-config.html
[2] https://bugs.launchpad.net/nova/+bug/1852106
[3] https://review.opendev.org/#/c/707738/
[4] https://review.opendev.org/#/c/708126/
