On 7/27/24 00:03, Dan Smith wrote:
There is some active work in this area to implement a secured vTPM, which allows measured boot without direct kernel boot, but this work is still under very active development.

Hmm, we should discuss this further, but I'm not sure if we should proceed with SEV-SNP enablement before that is completed.
With that said, if the request for SEV-SNP is done via a trait on the image, combined with the existing image property for memory encryption, it may be workable, given that the direct kernel boot functionality is also expressed on the image.
That would be a pretty big limitation for that feature, however.
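As a rough illustration of the image-driven request described above, here is a minimal Python sketch. It is not nova code. The trait name `CUSTOM_SEV_SNP` and the helper `sev_snp_request` are hypothetical; `hw_mem_encryption`, `kernel_id` and the `trait:<NAME>=required` syntax are the existing image property conventions it assumes.

```python
# Hypothetical sketch only; the trait name and helper are assumptions.
SNP_TRAIT = "CUSTOM_SEV_SNP"  # placeholder, not a real os-traits name


def sev_snp_request(image_props: dict) -> dict:
    """Derive what an image asks for from its properties alone."""
    # Trait requested on the image, e.g. trait:CUSTOM_SEV_SNP=required
    wants_snp = image_props.get(f"trait:{SNP_TRAIT}") == "required"
    # Existing memory encryption property on the image
    mem_enc = str(image_props.get("hw_mem_encryption", "")).lower() == "true"
    # Direct kernel boot is expressed by the kernel linkage on the image
    direct_kernel = bool(image_props.get("kernel_id"))

    sev_snp = wants_snp and mem_enc
    return {
        "sev_snp": sev_snp,
        # Measured boot via direct kernel boot is optional on top of SEV-SNP
        "measured_direct_boot": sev_snp and direct_kernel,
    }
```

With this shape, an image that carries the trait and `hw_mem_encryption=true` but no `kernel_id` would still get base SEV-SNP, just without the measured direct kernel boot.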
Agreed, and I certainly would be hesitant to add more stuff dependent on something I'm hoping to remove. The UX for a feature where the kernel and ramdisk updated in the guest by the distro aren't actually what gets used is not a very good workflow at all.
Direct kernel boot is not mandatory to use SEV-SNP. Measured boot via direct kernel boot is an optional feature for users who want very strict attestation of the software in their VM, so we should probably discuss it separately from the base SEV-SNP support.

However, we've had internal discussions about potential use cases of SEV-SNP and have learned that there are actually some use cases where strict attestation is required, even with the tradeoff in UX. I understand the limitations and tricky UX of the feature (I learned these while doing some PoC work), but these would still be acceptable for users who have very strict requirements to protect their data, including their applications or software, in the cloud.

As I said, there is some active development work in this area, but it is still in an early phase and it may take some time (at least one year, or even a few years) until it is implemented in the upstream kernel and a few others. Also, at this stage it's not clear how much additional work we need to actually integrate this work into the cloud use case. So I really hope that we can start with the not-the-best but working solution first to realize strict attestation, and then consider replacing it with better functionality in the future.
That said, I guess honoring the kernel/ramdisk linkage from whatever image is selected is perhaps something we could do with less complication than we currently have. Right now, we have places where booting from an AMI changes various behaviors and that's definitely the primary thing I want to remove. Just honoring the kernel/ramdisk linkage without other special image behaviors is maybe (*maybe*) less concerning although I still think it would be better to eliminate that if we can.
Disclaimer: I'm not a security expert or a QEMU expert, so it'd be nice to hear opinions from someone more familiar with these.

My understanding is that the image format CVE we are mainly discussing here is caused by the way QEMU (specifically qemu-img) handles image formats. However, kernel and ramdisk images are treated as pure "raw" images, without any conversion or parsing, in both the nova layer and the QEMU layer (when they are associated via kernel/ramdisk_id, not via AMI), so keeping the current handling of these images may carry relatively lower risk.

My current view, after a quick dig into the QEMU implementation, is that the direct kernel boot implementation in QEMU extracts the kernel data and ramdisk data into the guest memory area without any format parsing, so I expect the impact of malformed images to be limited, basically, to the instances launched with those images.
I'm proxying some of the conversation I have had with Dan on this topic, but for example the root disk image cirros-0.6.2-x86_64-blank.img should really be disk-format=GPT, i.e. declaring to glance that this image contains a GPT partition table.
I definitely don't want to add cpio (and etc) to glance as disk_format options just because the kernel or ramdisk image may be encoded that way. However, if we can get to the point where nova will boot a disk_format=gpt but *not* a disk_format=raw, then raw can become "a non-bootable binary blob used for other purposes" ... which could be a kernel, ramdisk, or anything else.
--Dan
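
For readers following along, the disk_format idea above (nova boots disk_format=gpt but not disk_format=raw, and raw becomes a non-bootable blob) could be sketched roughly as follows. This is an illustration, not nova or glance code: the function names are hypothetical, `gpt` is the format proposed in this thread rather than an existing glance value, and including `qcow2` in the bootable set is an assumption.

```python
# Hypothetical policy sketch; names and the bootable set are assumptions.
BOOTABLE_DISK_FORMATS = frozenset({"gpt", "qcow2"})


def can_boot_as_root_disk(image: dict) -> bool:
    """Only images that declare a bootable disk layout may be a root disk."""
    return image.get("disk_format") in BOOTABLE_DISK_FORMATS


def can_use_as_kernel_or_ramdisk(image: dict) -> bool:
    """A raw image is an opaque blob: fine as a kernel or ramdisk,
    handed over without any format parsing."""
    return image.get("disk_format") == "raw"
```

Under this split, a kernel or ramdisk image never needs its own disk_format (cpio and so on), because raw simply means "do not interpret, do not boot as a disk".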