Re: [ptg] Image Encryption Session, Thursday 30th @ 16:00 UTC

30 Oct 2025

Dear Sean, thanks for sharing your thoughts and solution ideas on this.

Sean Mooney wrote:
...
On 28/10/2025 14:10, Markus Hentsch wrote:
...
...
        - problem: "qemu-img convert" cannot stream to stdout (see 
https://blogs.igalia.com/berto/2025/07/15/converting-qemu-qcow2-images-direc... 
); in the worst case we have to wait for the full 1TB to be 
decrypted, consuming almost twice the amount of disk space in the 
process, instead of being able to only read the payload header and abort
honestly i think inspecting the content of the image violates one of 
the primary use cases for this, which is confidential computing, where 
we do not trust the infra, hence why we are encrypting the image.
so to me the only thing that feels reasonable for the luks image type 
is to assert that the final image has a valid luks header but not to 
try and look inside for the presence of a GPT partition table, for 
example, since that would require decryption.
I do agree and this was also one of my thoughts but I did not explicitly 
mention it due to the following reason:

You have to consider that due to the interaction between Glance and 
Barbican in terms of secret deletion on image deletion (like already 
implemented for Cinder-originated LUKS images) and secret consumer 
registration, Glance already has the Barbican secret ID (part of image 
metadata) and access to it (by inheriting the user's token RBAC during 
the requests). Hence, even if Glance decides not to decrypt the image at 
all, it still has the means to. In other words: compromising a Glance 
node still gives you all you need.
If we decide not to decrypt it in Glance for image verification, at 
least it will not be lying around in unencrypted form on disk or in 
RAM at any given time, but that's just a small comfort given the 
bigger context. You still have to trust both the Glance nodes and the 
Nova compute nodes, even if it effectively only gets decrypted on the 
latter.
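
To illustrate the header-only check suggested above (asserting a valid 
LUKS header without touching the encrypted payload), here is a minimal 
Python sketch. It relies only on the 6-byte LUKS magic and the 16-bit 
big-endian version field that follow it in the on-disk header; the 
function name luks_version is hypothetical and this is not what Glance 
actually implements.

```python
import struct

# 6-byte magic at the start of a LUKS1 / LUKS2 primary header.
LUKS_MAGIC = b"LUKS\xba\xbe"


def luks_version(header: bytes):
    """Return 1 or 2 if `header` starts with a LUKS header, else None.

    Only the first 8 bytes are inspected; nothing is decrypted.
    (Hypothetical helper, for illustration only.)
    """
    if len(header) < 8 or header[:6] != LUKS_MAGIC:
        return None
    # The version is a big-endian uint16 directly after the magic.
    (version,) = struct.unpack(">H", header[6:8])
    return version if version in (1, 2) else None
```

A caller would read the first 8 bytes of the uploaded file and reject 
the image if luks_version() returns None. Note that this says nothing 
about what the payload contains, which is exactly the limitation 
discussed in this thread.
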
...
for luks in qcow with an embedded luks partition we should also 
assert the qcow headers and ensure that none of the problematic 
features like a data file or backing file are present (i.e. that the 
image passes the safety check), but again
we should not be decrypting the content of the image.
for qcow in luks, that should not be supported.
if it was supported, that would only be reasonable if the disk type was 
luks, but then qemu should really reject that because we should be 
explicitly telling qemu that the file is in raw format in this case,
not qcow. i don't think we can really protect from this by default but 
i think we can mitigate this in 2 ways.
one, glance has a list of supported image types; an admin can remove 
luks from that list.
nova could/should have a similar list of image types each compute node 
will allow to be used. that can similarly allow us to reject luks images.
finally we could also have a policy rule for luks images in nova/glance 
that defaults to member but could be restricted by an operator.
that would allow them to restrict it to the service role so that luks 
images could only be created and uploaded by nova or cinder, not by an 
end user.
this is my preferred way to lock down the ability to create a 
malicious image: allowing operators to restrict creating luks images 
to services or admins.
my less preferred option would be to default the new luks policy for 
nova/glance to require the manager or the admin role.
that would require the admin to use custom policy or grant end users 
more permissions than they likely should have in order to use this 
feature.
this to me feels like a feature that normal end users should be able to 
use out of the box, so the default should be `member` IMO
So, if I understand you correctly, we can sufficiently check the 
qcow2+LUKS format by using the qemu tools to inspect its metadata, 
but cannot do the same for the raw LUKS format (as it is currently 
produced by Cinder) because it could contain anything and LUKS isn't 
able to tell us any details?
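
As a rough illustration of such a metadata-only check for qcow2+LUKS, 
the sketch below parses the fixed qcow2 header fields directly (magic, 
version, backing file offset/size, crypt_method, and the "external data 
file" incompatible-feature bit) and reports anything problematic. The 
function name check_qcow2_header and the wording of the messages are 
hypothetical; this is not the actual Glance/Nova safety check, and it 
never looks at the encrypted payload.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
CRYPT_LUKS = 2                       # crypt_method value for LUKS
INCOMPAT_EXTERNAL_DATA_FILE = 1 << 2  # incompatible_features bit 2


def check_qcow2_header(header: bytes) -> list:
    """Return a list of problems found in a qcow2 header (empty = OK).

    Hypothetical helper for illustration; inspects header bytes only.
    """
    if len(header) < 72 or header[:4] != QCOW2_MAGIC:
        return ["not a qcow2 image (bad magic or short header)"]

    problems = []
    # Offsets 4..19: version (u32), backing_file_offset (u64),
    # backing_file_size (u32), all big-endian.
    version, backing_off, backing_size = struct.unpack(">IQI", header[4:20])
    (crypt_method,) = struct.unpack(">I", header[32:36])

    if version not in (2, 3):
        problems.append("unsupported qcow2 version %d" % version)
    if backing_off != 0 or backing_size != 0:
        problems.append("image references a backing file")
    if crypt_method != CRYPT_LUKS:
        problems.append("image is not LUKS-encrypted qcow2")
    if version >= 3:
        if len(header) < 80:
            problems.append("truncated version 3 header")
        else:
            (incompat,) = struct.unpack(">Q", header[72:80])
            if incompat & INCOMPAT_EXTERNAL_DATA_FILE:
                problems.append("image uses an external data file")
    return problems
```

For a raw LUKS image there is no comparable outer container to inspect: 
beyond the LUKS header itself, the payload is opaque without decryption.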

Another thought regarding qcow2+LUKS: just theoretically, would it be 
possible to craft something like placing another (inner) qcow2 as the 
LUKS payload of the qcow2+LUKS image? I think Dan had something in mind 
that once the image has been decrypted, we cannot say 100% that the 
decrypted form won't hit another qemu tooling in some workflow later on, 
which could again trigger nasty things if the decrypted form is again 
some qcow2 stuff.

Regarding the allowlist configuration and policy approaches: if the 
upload or usage of raw LUKS images was restricted to operator/admin 
users, we would render Cinder's upload-volume-to-image (i.e. "openstack 
image create --volume ...") unusable for regular users, because this 
exact format is currently produced by Cinder in such a case and consumed 
when you create a volume based on such a Cinder-created image.
...
...
2) Nova integration: how to approach future interoperability with 
ephemeral storage encryption?
    2a) can we handle this as separate future expansion to this 
feature, i.e., merging Glance+Cinder functionality short-term with 
only compatibility changes in Nova  for now and addressing full Nova 
integration as a next step with dedicated patchsets based on the 
current work?
this work is currently paused, so while i don't think we should do 
something intentionally incompatible, i don't think you should have
to go out of your way to explicitly support it, provided you do not 
break the existing lvm backend support.
Understood.

Best regards,

Markus Hentsch
