[nova] Summary of discussions at Vancouver June 2023 (PTG + Forum meet&greet)
Thanks to everyone who joined our PTG sessions and our meet&greet Forum session!

It was again a productive week for us, as we were able to talk with a lot of operators over around 4 hours (3h30 for the PTG and 30 minutes for the Forum session).

Below you'll find a summary of all the topics we discussed during those hours, but you can also look at the etherpads directly:

* Nova meet&greet https://etherpad.opendev.org/p/nova-vancouver2023-meet-and-greet
* Nova PTG sessions https://etherpad.opendev.org/p/vancouver-june2023-nova

While there were unfortunately very few Nova maintainers and developers in both rooms, we had a packed room for the meet&greet, and at every hour of the PTG at least one operator came to discuss with us. I can't tell you how much I love this; hopefully we'll continue these discussions between us afterwards.

Now, here is the summary:

=== Short survey at the meet&greet ===

* most of the operators (~21) in the meet&greet were running either Train or Yoga. Some of them were already running 2023.1 Antelope. Wow.
* all of them were running libvirt, except one running the ironic driver.
* using virtual GPUs/mdevs and setting flavors to define CPU usage were the two most used features (the next one is SR-IOV)
* accordingly, the most needed missing quotas are for PCI devices, obviously.

=== Pain points at the meet&greet ===

* Availability zones : Basically, some operators use AZs heavily and they want to move instances off some AZ (even if the instance is pinned) for maintenance reasons.
  We continued to discuss this use case at the PTG (see below).
* Update of userdata : we said this was coming with a proposed spec
* Filtering flavors by a resource : this would require a nova spec
* Hard affinity problems : we continued to discuss this at the PTG (see below)
* Ceph backend for Nova compute and need for local ephemeral storage (case of GPUs) : this should be discussed in a spec
* attach/detach issues with BDMs : we said this is a known issue, a bug report has to be filed for proper triaging
* state of CVEs and the fact that old releases are still impacted : this is known and relates to the state of Extended Maintenance releases, which was addressed by another Forum session.

=== The PTG ===

* Affinity/Anti-affinity migration problems with a hard policy : we explained the consensus we had at the Bobcat vPTG and a Bloomberg operator kindly volunteered to capture this agreement in a backlog spec (tl;dr: allow operators to violate the policy and let the servergroup API show the violation). More to come, hopefully soon.
* Exposing metrics : An operator explained how his current use of a Prometheus exporter is super slow since it's using the Nova APIs for gathering usage metrics. We suggested he look into using the Placement APIs for this case; he'll test and come back to us if the existing Placement APIs aren't viable for him.
* memfd-backed memory backing per instance : there are multiple concerns that need to be addressed in a spec. For example, we need to ensure the flavor extra spec is driver-agnostic, we need to take care of the defaults and the potential upgrade concerns (like changing from anonymous to memfd), and we need to define how this interacts with the existing hugepages feature in Nova.
* using virtio9p instead of virtiofs for Manila shares : we discussed it at length and eventually came to the conclusion that it wasn't worth the effort to implement.
  Let's revisit this decision by the next PTG if no progress has been made on the virtiofs migration feature gaps.
* Manila/Nova cross-project PTG discussion on Manila share support : we discussed the upcoming Manila features (lock API and use of service roles). We agreed that the Nova configuration for service roles is a mandatory requirement for using Manila shares. We also agreed on Manila storing the instance UUID as the semaphore for the lock API. FWIW, there was a packed table with 7 operators (all public clouds) *REALLY* interested in this feature [1].

That's it for Nova-related bits at Vancouver, but other topics were worth discussing. You can find the list of all Forum sessions here : https://wiki.openstack.org/wiki/Forum/Vanvouver2023

I found the discussion on potentially abandoning Extended Maintenance particularly interesting, along with the OSC and SDK sessions, the live-migration use cases and problems, and a couple of others. I'd urge you to glance at the contents of all the etherpads.

[1] https://photos.app.goo.gl/t1tSyk67GG6TRRW59

HTH,
-Sylvain
Sylvain Bauza