[nova][ptg] 2025.1 Epoxy PTG summary
(Resending this email, as the previous one was blocked: the attached etherpad backup text file was larger than the maximum size.)

Hey all,

First, thanks for joining us at the vPTG. We had 15-20 people every day in our nova sessions, and I was definitely happy to see new folks :-)

If you want to see our PTG etherpad, please look at https://etherpad.opendev.org/p/r.4f297ee4698e02c16c4007f7ee76b7c1 (a saved snapshot) rather than the main nova etherpad, as I don't want to point to a live page whose content could be accidentally modified or have paragraphs removed.

As I say every cycle, grab a coffee (or a tea) now, as this summary is long.

### Dalmatian retrospective and Epoxy planning ###

6 of the 15 approved blueprints were eventually implemented. We also merged more than 31 bugfixes during Dalmatian. We agreed to announce on the IRC channel when we hold meetings to discuss a feature series (like the weekly one we held for the manila/virtiofs series) and to send public invitations; we may do this again this cycle for other features, we'll see. We will also try to have a periodic integration-compute job that pulls OSC and SDK from master.

Our Epoxy deadlines will be: two spec review days (R-16 and R-2), a soft spec approval freeze at R-16, and then a hard spec approval freeze at R-12. That means contributors really need to propose their specs before mid-December. bauzas (me) will add these deadlines to the Epoxy schedule: https://releases.openstack.org/epoxy/schedule.html

### vTPM live migration ###

We agreed that a vTPM live-migration feature is a priority for Epoxy, given Windows 11. artom will write a spec proposing an image metadata property meaning 'do I want to share my secret with the nova service user?', along with a new `nova-manage image_property set migratable_something` command so operators could migrate existing instances to get at the Barbican secrets, if they really want to.

### Unified limits wrap-up ###

Two changes still need to be merged before we can switch the default quota driver to unified limits. We agreed to review both patches (one treating unset limits as unlimited, the other adding a nova-manage command that automatically creates nova limits), and we also discussed a later patch that would define which nova resources must always be set (so that we *have to* enforce them in any case). melwitt agreed to work on that later patch.

### per-process health checks ###

A series already existed and we discussed it again. gibi agreed to take it over and will re-propose the existing spec as-is. We also discussed the first checks we would want, like RPC failures and DB connection issues; we'll review those once they are in Gerrit.
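To make the idea a bit more concrete, here is a minimal, purely illustrative Python sketch (nothing below is the actual proposed interface; all names are made up): each worker process keeps a registry of named checks and reports their latest status.

    import time

    class HealthCheckRegistry:
        """Hypothetical per-process registry of named health checks."""

        def __init__(self):
            self._checks = {}  # name -> callable returning (bool, detail)

        def register(self, name, check):
            self._checks[name] = check

        def run(self):
            # Run every check, never letting one failure hide the others.
            results = {}
            for name, check in self._checks.items():
                try:
                    healthy, detail = check()
                except Exception as exc:
                    healthy, detail = False, str(exc)
                results[name] = {'healthy': healthy, 'detail': detail,
                                 'checked_at': time.time()}
            return results

    registry = HealthCheckRegistry()
    # The first checks we discussed: DB connectivity and RPC failures.
    registry.register('db', lambda: (True, 'DB connection pool OK'))
    registry.register('rpc', lambda: (False, 'no MQ heartbeat for 60s'))
    print(registry.run())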
### sustainable computing (a.k.a. power mgmt) ###

When someone (I won't say who [1]) implemented power management in Antelope, it was nice, but we eventually found (and fixed) a long list of bugs. Since we don't really want to repeat that experience, we held a kind of post-mortem where we agreed on two things that could prevent it from happening again: a weekly periodic job running the whitebox tempest plugin [2], and coverage of nova-compute restarts by that same plugin. Nobody has committed to those two actions yet, but we hope to identify someone soon.

As a side note, gibi mentioned RAPL MSR support [3], notifying us that we will have to support it in a later release (the libvirt implementation is not merged yet).

### nvidia's vGPU vfio-pci variant driver support ###

Long story short: the Linux kernel removed a feature in release 5.18 (IOMMU backend support for vfio-mdev), which impacted the nvidia driver; it now detects this and creates vfio-pci devices instead of vfio-mdev devices (as vGPUs). This has a dramatic impact on Nova, as we relied on the vfio-mdev framework to abstract virtual GPUs. By the next release, Nova will need to inventory the GPUs by instead looking at SR-IOV virtual functions that are specific to the nvidia driver (we call them vfio-pci variant driver resources). The nova PTG session focused on the efforts required to do so.

We agreed that it will require operators to propose different flavors for vGPU, using distinct resource classes (anything but VGPU). Fortunately, we'll reuse the existing device_spec PCI config options [4]: the operator will define custom resource classes matching the PCI addresses of the nvidia-generated virtual functions (don't freak out, we'll also write documentation). We'll create another device type (something like type-VF-migratable) for describing such specific nvidia VFs, and the generated domain XML will then describe the device correctly (amending the "managed=no" flag for that device). There will be an upgrade impact: existing instances will need to be resized to the new flavors (or shelved, updated to change the embedded flavor, and unshelved). To be on par with the existing vGPU features, we'll also need to implement vfio-pci live migration by detecting the VF type in the existing SR-IOV live-migration path. Since this effort is quite large, bauzas will assemble a subteam of interested parties to help him implement all of those bits in the short timeframe of one upstream cycle.
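To give a rough idea of what this could look like for operators (the exact syntax will be settled during the cycle; the PCI address and resource class below are made up), the existing [pci]device_spec option [4] already lets you tag matched devices with a custom resource class:

    [pci]
    report_in_placement = True
    # Expose the nvidia-created VFs at these addresses under a custom
    # resource class instead of VGPU (illustrative values only):
    device_spec = { "address": "0000:25:00.*", "resource_class": "CUSTOM_NVIDIA_VGPU_VF" }

A vGPU flavor would then request that custom resource class rather than resources:VGPU=1.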
### Graceful shutdowns ###

A common pitfall, raised by tobias-urdin, is stopping nova-compute services. In general, before stopping the service, we should make sure that all RPC calls are done: we would stop accepting new RPC calls once nova-compute is asked to stop, and just wait for the in-flight calls to finish before stopping the service. For that, we need a backlog spec to discuss the design, and we would also need to modify oslo.service to unsubscribe from the RPC topics. Unfortunately, we won't have a contributor working on it this cycle, but gibi could try to at least document it.
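For illustration, oslo.messaging's RPC server already exposes the two primitives this design needs: stop() makes the server stop accepting new messages, and wait() blocks until in-flight handlers have completed. The oslo.service work would be about driving them (and unsubscribing from the topic) at the right moment. A rough Python sketch, with made-up wiring:

    import signal

    from oslo_config import cfg
    import oslo_messaging as messaging

    class ComputeEndpoint(object):
        # Stand-in for the real nova-compute RPC endpoints.
        def noop(self, ctxt):
            return None

    transport = messaging.get_rpc_transport(cfg.CONF)
    target = messaging.Target(topic='compute', server='host1')
    server = messaging.get_rpc_server(transport, target, [ComputeEndpoint()])

    def graceful_shutdown(signum, frame):
        server.stop()  # stop consuming new messages from the topic
        server.wait()  # block until in-flight calls have completed

    signal.signal(signal.SIGTERM, graceful_shutdown)
    server.start()
    signal.pause()  # keep the main thread alive until a signal arrives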
### horizon-nova x-p session ###

We mostly discussed the Horizon feature gaps [5]. The first priority would be for Horizon to use the OpenStack SDK instead of novaclient, and then to support all of the new Nova API microversions. Unfortunately, we are not sure we'll have Horizon contributors able to fix those, but if you're a contributor and want to help make Horizon better, maybe you could do this? If so, please ping me.

### Ironic-nova x-p session ###

We didn't really have topics for this x-p session, so we just quickly discussed a few points, like graphical console support. Nothing really worth noting, except maybe that it would be nice to have a read-only graphical console. We were just happy to say that the ironic driver now works better thanks to features merged over the last cycles. Kudos to those who did them.

### HPC/AI optimized hypervisor "slices" ###

A large topic to explain; I'll try to keep it short. Basically, the way Nova slices NUMA affinity between guests is nice, but it is hard for HPC use cases, where you sometimes need finer control over how the NUMA-dependent devices are sliced depending on the various PCI topologies. Eventually, we agreed on a POC that johnthetubaguy could work on, trying to implement a specific virt driver that would do something different from the existing NUMA affinities.

### Cinder-nova x-p session ###

Multiple topics were discussed there. First, abishop wants to enhance cinder's retyping of in-use boot volumes, which requires the Nova os-attachments API to get a new parameter. We said he needs to write a new spec, and we agreed that the cinder contributors need to discuss with the QEMU folks to understand the qemu writes. We also discussed a new nova spec about adding burst length support to Cinder QoS [6]; we said that both nova and cinder need to review it. About the residues left behind when detaching a volume, we agreed that this is not a security flaw and that os-brick should delete them, not nova (even if nova needs to ask os-brick to look at them, either periodically or when attaching/detaching). whoami-rajat will provide a spec for it.

### Python 3.13 support ###

We discussed a specific py3.13 issue: the crypt module is no longer in the stdlib in py3.13, which impacts nova because the nova.virt.disk.api module uses it to pass an admin password for file injection. Given that file injection is deprecated, we have three options: remove admin password file injection (or even file injection as a whole), add the new separate crypt package to upper-constraints, or use the oslo_utils.secretutils module. bauzas (me) will send an email to openstack-discuss asking operators whether they are OK with deprecating file injection, or just admin password injection, and then we'll pick a direction. bauzas or sean-k-mooney will also try to add py3.13 non-voting jobs for unit and functional tests.

### Eventlet removal steps in Nova ###

I won't explain why we need to remove eventlet; you already know, right? We rather discussed the details for each nova component, including nova-api, nova-compute and the other nova services. We agreed to: remove direct eventlet imports where possible; move nova entrypoints that don't use eventlet to separate modules that don't monkey-patch the stdlib; look at what we can do with our scatter_gather methods, which asynchronously call the cell DBs, to use threads instead (see the sketch below); and check whether those calls block on the DB side (and not on the MQ side). gibi will shepherd that effort and provide an audit of our eventlet usage in order to avoid unexpected and unfortunate late discoveries.
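As a purely illustrative sketch of the thread-based direction (this is not nova's actual code; the real scatter_gather_cells uses dedicated sentinels and per-cell database contexts), the stdlib ThreadPoolExecutor can fan calls out to every cell and collect whatever answers within the timeout:

    import concurrent.futures

    def scatter_gather_cells(context, cell_mappings, timeout, fn, *args):
        # One worker per cell, mirroring the one-greenthread-per-cell
        # behaviour of the eventlet version.
        executor = concurrent.futures.ThreadPoolExecutor(
            max_workers=len(cell_mappings))
        futures = {executor.submit(fn, context, cell, *args): cell
                   for cell in cell_mappings}
        results = {}
        done, not_done = concurrent.futures.wait(futures, timeout=timeout)
        for future in done:
            cell = futures[future]
            try:
                results[cell.uuid] = future.result()
            except Exception as exc:
                results[cell.uuid] = exc  # nova uses a sentinel here
        for future in not_done:
            results[futures[future].uuid] = 'did-not-respond'  # sentinel
        # Don't block on laggard cells; let their threads finish in the
        # background (real code would need a cleanup/cancellation policy).
        executor.shutdown(wait=False)
        return results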
### Libvirt image backend refactor ###

If you like spaghetti, you should look at the libvirt image backend code. Lots of assumptions and conditionals make any change to that module hard to write and hard to review, leading to error-prone situations like the ones we had when fixing some recent CVEs. We all agreed on the quite urgent need to refactor that code, and melwitt proposed a multi-stage effort for it. We agreed on the proposal's first two steps, with some comments that will lead to future revisions of the proposal's patches. The crucial part of the refactor is test coverage.

### IOThreads tuning for libvirt instances ###

An old spec already proposed defining iothreads for guests. We agreed to revive that spec: a config option would define either no iothread or one iothread per instance (with a potential later option value of "one iothread per disk"). Depending on whether emulator_thread_policy is provided in the flavor/image, we would either pin the iothread following that policy or let it float over the shared CPU set. If no shared CPUs are configured but the operator wants iothreads, nova-compute would refuse to start. lajoskatona will work on an implementation, designed in a specless blueprint.

### OpenAPI schemas progress ###

Nothing specific to say here; bauzas and gmann will review the series this cycle.

That's it. I'm gone, I'm dead [7] (a cyclist metaphor), but I eventually skimmed the very large nova etherpad. Of course, there's a 99% chance I wrote some of these notes incorrectly, so please correct me if I'm wrong; I won't feel offended, just tired.

Thanks all (and I hope your coffee or tea was good).

-Sylvain

[1] https://geek-and-poke.com/geekandpoke/2013/11/24/simply-explained
[2] https://opendev.org/openstack/whitebox-tempest-plugin
[3] https://www.qemu.org/docs/master/specs/rapl-msr.html
[4] https://docs.openstack.org/nova/latest/configuration/config.html#pci.device_...
[5] https://etherpad.opendev.org/p/horizon-feature-gap#L69
[6] https://review.opendev.org/c/openstack/nova-specs/+/932653
[7] https://www.youtube.com/watch?v=HILcYXf8yqc