Dear Sean,

thanks for sharing your thoughts and solution ideas on this.

Sean Mooney wrote:
On 28/10/2025 14:10, Markus Hentsch wrote:
... - problem: "qemu-img convert" cannot stream to stdout (see https://blogs.igalia.com/berto/2025/07/15/converting-qemu-qcow2-images-direc... ); in the worst case we have to wait for the full 1TB to be decrypted, consuming almost twice the amount of disk space in the process, instead of being able to only read the payload header and abort
honestly I think inspecting the content of the image violates one of the primary use cases for this, which is confidential computing, where we do not trust the infra, hence why we are encrypting the image.
so to me the only thing that feels reasonable for the luks image type is to assert that the final image has a valid LUKS header, but not to try to look inside for the presence of a GPT partition table, for example, since that would require decryption.
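A header-only check of this kind could look roughly like the sketch below. It is only an illustration, not Glance code: the function name `luks_version` is made up for this example, and the sketch assumes that comparing the 6-byte primary header magic and the big-endian 16-bit version field that follows it is enough to say "this looks like a LUKS header". A real check would validate more of the header, but it would still never need the key or the payload.

```python
import struct

# Primary header magic shared by LUKS1 and LUKS2.
LUKS_MAGIC = b"LUKS\xba\xbe"


def luks_version(header: bytes) -> int:
    """Return the LUKS version (1 or 2) found in `header`, or raise ValueError.

    Hypothetical helper: it only looks at the first 8 bytes of the image
    and never touches key slots or the encrypted payload.
    """
    if len(header) < 8:
        raise ValueError("too short to contain a LUKS header")
    if header[:6] != LUKS_MAGIC:
        raise ValueError("LUKS magic not found")
    # The version is a big-endian 16-bit integer right after the magic.
    (version,) = struct.unpack(">H", header[6:8])
    if version not in (1, 2):
        raise ValueError(f"unsupported LUKS version {version}")
    return version
```

In practice the caller would read just the first few bytes of the uploaded file and pass them in, so nothing is decrypted and no extra disk space is needed.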
I do agree, and this was also one of my thoughts, but I did not explicitly mention it for the following reason: because of the interaction between Glance and Barbican around secret deletion on image deletion (as already implemented for Cinder-originated LUKS images) and secret consumer registration, Glance already has the Barbican secret ID (it is part of the image metadata) and access to the secret (by inheriting the user's token RBAC during the requests). Hence, even if Glance decides not to decrypt the image at all, it still has the means to. In other words: compromising a Glance node still gives you all you need. If we decide not to decrypt the image in Glance for verification, at least it will not be lying around in unencrypted form on disk or in RAM at any given time, but that is only a small comfort given the bigger context. You still have to trust both the Glance nodes and the Nova compute nodes, even if the image is effectively only decrypted on the latter.
for luks in qcow with an embedded luks partition we should also assert the qcow headers and ensure that problematic features like a data file or backing file are caught by the safety check, but again we should not be decrypting the content of the image.
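To make the qcow2 side of this concrete, here is a rough sketch of such a header assertion. It is not a substitute for the existing safety checks in Glance; the function name `check_qcow2_luks_header` is hypothetical, and the field offsets are taken from the published qcow2 format description (backing file offset at byte 8, `crypt_method` at byte 32 with value 2 meaning LUKS, incompatible-feature bit 2 meaning "external data file"). Only the unencrypted qcow2 header is read.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
CRYPT_LUKS = 2               # crypt_method value for LUKS
INCOMPAT_DATA_FILE = 1 << 2  # incompatible-feature bit 2: external data file


def check_qcow2_luks_header(header: bytes) -> None:
    """Raise ValueError unless `header` looks like a safe qcow2+LUKS header.

    Hypothetical helper: inspects only plaintext qcow2 header fields and
    never decrypts anything.
    """
    if len(header) < 72:
        raise ValueError("too short for a qcow2 header")
    magic, version, backing_offset, _backing_size = struct.unpack(
        ">4sIQI", header[:20])
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    if version not in (2, 3):
        raise ValueError(f"unsupported qcow2 version {version}")
    if backing_offset != 0:
        raise ValueError("backing file is not allowed")
    (crypt_method,) = struct.unpack(">I", header[32:36])
    if crypt_method != CRYPT_LUKS:
        raise ValueError("image is not LUKS-encrypted qcow2")
    if version == 3:
        # Feature bitmaps only exist in version 3 headers.
        if len(header) < 80:
            raise ValueError("truncated version 3 header")
        (incompat,) = struct.unpack(">Q", header[72:80])
        if incompat & INCOMPAT_DATA_FILE:
            raise ValueError("external data file is not allowed")
```

Rejecting on the first problematic field keeps the check cheap: it needs at most the first 80 bytes of the upload.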
for qcow in luks that should not be supported.
if it was supported, that would only be reasonable if the disk type was luks, but then qemu should really reject that because we should be explicitly telling qemu that the file is in raw format in this case, not qcow. I don't think we can really protect from this by default, but I think we can mitigate this in 2 ways.
one, glance has a list of supported image types; the admin can remove luks from that list.
nova could/should have a similar list of image types that each compute node will allow to be used. that can similarly allow us to reject luks images.
finally, we could also have a policy rule for luks images in nova/glance that defaults to member but could be restricted by an operator.
that would allow them to restrict it to the service role so that luks images could only be created and uploaded by nova or cinder, not an end user.
this is my preferred way to lock down the ability to create a malicious image: allowing the operator to restrict creating luks images to services or admin.
my less preferred option would be to default the new luks policy for nova/glance to require the manager or the admin role.
that would require the admin to use custom policy or grant end users more permissions than they likely should have in order to use this feature. this to me feels like a feature that normal end users should be able to use out of the box, so the default should be `member` IMO
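The two mitigations described above (a format allowlist plus a role-gated policy rule for luks) combine roughly as in the following sketch. Everything here is hypothetical and only illustrates the decision logic: the class `UploadPolicy`, its field names, and the idea of a dedicated "luks roles" setting are assumptions made for this example, not existing Glance or Nova options.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class UploadPolicy:
    """Hypothetical model of an operator's configuration."""

    # Formats the operator allows at all (the allowlist).
    allowed_disk_formats: Set[str]
    # Roles allowed to create luks images; defaults to "member" as proposed.
    luks_roles: Set[str] = field(default_factory=lambda: {"member"})

    def may_upload(self, disk_format: str, roles: Iterable[str]) -> bool:
        # Step 1: the allowlist applies to every format.
        if disk_format not in self.allowed_disk_formats:
            return False
        # Step 2: luks is additionally gated by the policy rule.
        if disk_format == "luks":
            return bool(self.luks_roles & set(roles))
        return True


# Default: end users with the member role can upload luks images.
default = UploadPolicy({"raw", "qcow2", "luks"})
# Locked down: only the service role (e.g. nova or cinder) may do so.
locked = UploadPolicy({"raw", "qcow2", "luks"}, luks_roles={"service"})
```

With `locked`, `may_upload("luks", ["member"])` is False while `may_upload("luks", ["service"])` is True; removing "luks" from the allowlist denies it for everyone regardless of role.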
So, if I understand you correctly, we can sufficiently check the qcow2+LUKS format by using the qemu tools to inspect its metadata, but cannot do the same for the raw LUKS format (as it is currently produced by Cinder) because it could contain anything and LUKS isn't able to tell us any details?

Another thought regarding qcow2+LUKS: just theoretically, would it be possible to craft something like placing another (inner) qcow2 as the LUKS payload of the qcow2+LUKS image? I think Dan had something in mind along these lines: once the image has been decrypted, we cannot say with 100% certainty that the decrypted form won't hit other qemu tooling in some later workflow, which could again trigger nasty things if the decrypted form is itself qcow2.

Regarding the allowlist configuration and policy approaches: if the upload or usage of raw LUKS images was restricted to operator/admin users, we would render Cinder's upload-volume-to-image (i.e. "openstack image create --volume ...") unusable for regular users, because this exact format is currently produced by Cinder in that case and consumed when you create a volume based on such a Cinder-created image.
2) Nova integration: how to approach future interoperability with ephemeral storage encryption? 2a) can we handle this as separate future expansion to this feature, i.e., merging Glance+Cinder functionality short-term with only compatibility changes in Nova for now and addressing full Nova integration as a next step with dedicated patchsets based on the current work?
this work is currently paused, so while I don't think we should do something intentionally incompatible, I don't think you should have to go out of your way to explicitly support it, provided you do not break the existing lvm backend support.
Understood.

Best regards,
Markus Hentsch

--
Markus Hentsch
DevOps Engineer
Cloud&Heat Technologies GmbH
Königsbrücker Straße 96 | 01099 Dresden
+49 351 479 367 00
markus.hentsch@cloudandheat.com | www.cloudandheat.com
Green, Open, Efficient. Your cloud service and cloud technology provider from Dresden.
https://www.cloudandheat.com/
Commercial Register: District Court Dresden
Register Number: HRB 30549
VAT ID No.: DE281093504
Managing Director: Nicolas Röhrs
Authorized signatory: Dr. Marius Feldmann