- The Cinder team agreed on using its volume metadata API for tracking the hardware models of Cinder volumes.
### Horizon/Nova ###
Horizon wanted to call the Placement API. We agreed on using openstacksdk for this, as Placement doesn't provide its own Python bindings. Some new methods may be added to the SDK for calling specific Placement APIs if we think we need them, instead of calling Placement directly with raw HTTP verbs and the API URIs.
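As a hedged sketch of what that could look like (the cloud name is a placeholder and the exact SDK proxy call is my assumption, not something we committed to):

```python
# Sketch only: listing Placement resource providers through openstacksdk's
# placement proxy instead of hand-crafting HTTP calls against the API URIs.

def list_resource_provider_names(conn):
    # Delegates to the SDK's Placement proxy, which wraps
    # GET /resource_providers on the Placement endpoint.
    return [rp.name for rp in conn.placement.resource_providers()]

if __name__ == "__main__":
    import openstack
    # "mycloud" is a placeholder clouds.yaml entry, not a real deployment.
    conn = openstack.connect(cloud="mycloud")
    for name in list_resource_provider_names(conn):
        print(name)
```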
### Nova-specific topics ###
# Antelope retrospective and process-related discussions
Antelope was a shorter cycle than Zed: 6 features landed (same as Zed), 46 contributors (down by 8 compared to Zed), 27 bugfixes (vs. 54 for Zed).
We were happy not to need an RC2 for the first time in Nova's history. The bug triage backlog stayed small.
- We want to discuss a better review cadence so we don't end up looking at all the implementations only in the last weeks of the cycle.
- We want to get rid of Storyboard stories in Placement. We found and agreed on a way forward.
- We will set bug reports that are five years old or more to Incomplete, giving bug reporters 90 days to confirm the bug is still present on master.
- We will add more contributor documentation explaining how one-off reviewers can find Gerrit changes and how to create changes.
- We agreed to no longer automatically re-approve a spec approved in a previous cycle if no code exists yet.
- All efforts on porting our client calls to the SDK should be tagged in Launchpad with the 'sdk' tag.
- A Bobcat schedule will be proposed at next Tuesday's weekly meeting, with feature and spec deadlines plus some review days.
- We also discussed what we should discuss in the Vancouver PTG.
# CI failures continuous improvement
- We want to test an alternative image to CirrOS (Alpine is a potential candidate) with a small footprint in some specific Nova jobs.
- We need to continue investigating the volume detach/attach failures. We may create a specific canary job that runs serialized volume checks, forcing the job to fail more frequently so we can dig down further.
- (related) Sylvain will propose a Vancouver PTG cross-project session for a CI debugging war room experience.
# The new release cadence
We argued at length about whether we should hold deprecations and removals this cycle, given the release notes tooling isn't in place yet. As a reminder, since Bobcat is a non-SLURP release, operators can skip it and only read the C release notes, so we want to make sure we forward-port upgrade notes for them. For the moment there is no consensus; we deferred the outcome to a subsequent Nova meeting and, in the meantime, raised the point with the TC for guidance.
# The problem with hard affinity group policies
- We agreed that hard affinity/anti-affinity policies aren't ideal (operators, if you read me, please prefer soft affinity/anti-affinity policies, for various reasons I won't explain here). But since we support those policies and the use case is quite understandable, we need to find a way to unblock instances that can't easily be migrated out of such groups (e.g. try to move your instances off a host when they have hard affinity between them; good luck).
- We reached a consensus to propose a new parameter for live-migrate, evacuate, and migrate that would skip only the hard affinity checks. A backlog spec will be written capturing this consensus.
- Since a group policy may then be violated, we'll also propose a server group API change that shows whether a group's policy is violated.
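To make the idea concrete, here is a hedged sketch of what such a request body might look like. The parameter name (`skip_affinity_checks`) is entirely hypothetical since the backlog spec isn't written yet; only the `os-migrateLive` action shape comes from the existing API.

```python
# Hypothetical sketch: building a live-migration request body with the
# proposed opt-out of hard affinity checks. The key name and API shape
# are not decided; this only illustrates the consensus.

def build_live_migrate_body(host=None, skip_affinity_checks=False):
    body = {"os-migrateLive": {"host": host, "block_migration": "auto"}}
    if skip_affinity_checks:
        # Hypothetical new parameter from the future backlog spec.
        body["os-migrateLive"]["skip_affinity_checks"] = True
    return body
```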
# Unified limits next steps
- We agreed that we should move on and enable unified limits by default in the near future, but we first need to provide a bit of tooling to help the migration.
- The tool could be a nova-manage command that dumps the existing quota limits as a YAML file so they can be injected into Keystone (to be further defined during Bobcat).
- A nova-status upgrade check would be nice to yell at operators who forgot to define their limits in Keystone before upgrading to the release (potentially C) that defaults to unified limits.
- Testing the migration in the grenade job would also be worthwhile.
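For illustration only, here is a hedged sketch of what such a nova-manage helper might emit: a flat YAML mapping of legacy quota limits that could then be fed into Keystone registered limits. The limit names and values are placeholders, and the real tool's output format is still to be defined.

```python
# Sketch: dump a flat mapping of legacy quota limits as minimal YAML
# (one "key: value" line per limit), hand-rolled to avoid dependencies.

def quota_limits_to_yaml(limits):
    # Sorted for deterministic output; flat scalars need no quoting here.
    return "\n".join(
        f"{name}: {value}" for name, value in sorted(limits.items())
    ) + "\n"

if __name__ == "__main__":
    # Example legacy quota values (made up for illustration).
    print(quota_limits_to_yaml(
        {"servers": 10, "class:VCPU": 20, "class:MEMORY_MB": 51200}))
```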
# Frequently written objects may lead to exhaustion of DB primary keys
- We agreed that all new primary keys will use BigInteger as their data type. Existing keys should move to BigInteger too, but there is a huge upgrade impact to mitigate.
- Operators will be solicited about their preference between a costly one-shot data migration and a rolling online data migration for existing tables.
- We will document which DB tables are safe to reindex (since their primary keys aren't referenced as foreign keys elsewhere).
- A backlog spec will capture all of the above
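For illustration, a hedged SQLAlchemy sketch of the agreed direction for new tables; the table and column names are invented, not actual Nova schema:

```python
# Sketch: declare a new table whose primary key is BigInteger from the
# start, postponing key exhaustion on frequently written tables.
import sqlalchemy as sa

metadata = sa.MetaData()

events = sa.Table(
    "instance_events",  # hypothetical frequently-written table
    metadata,
    # BigInteger gives a 64-bit key space instead of Integer's 32 bits.
    # The SQLite variant keeps autoincrement working in tests, since
    # SQLite only treats plain INTEGER primary keys as rowid aliases.
    sa.Column("id", sa.BigInteger().with_variant(sa.Integer, "sqlite"),
              primary_key=True, autoincrement=True),
    sa.Column("instance_uuid", sa.String(36), nullable=False),
)

if __name__ == "__main__":
    engine = sa.create_engine("sqlite://")
    metadata.create_all(engine)
    print("created:", list(metadata.tables))
```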
# Leftover volume records when you call Cinder directly instead of Nova for detaching
- Yup, this is a known issue and we should discuss it with the Cinder team (unfortunately, we didn't get to it this week).
- We should maybe have a nova-manage script that cleans up the BDMs, but we're afraid of leaking some volume residues on the compute host OS.
- Anyway, we need to understand the main scope and discuss the preconditions (when should we call this tool, and what should it be doing?). Probably a spec.
# Misc but not the leasc (heh)
- We will file an RFE bug for lazy-loading the instance name from system metadata, or updating the nested field in the instance object (no additional DB write).
- We will add a new policy specifically for cold-migrate when a host is passed as a parameter (admin-only by default).
- We agreed to continue the effort on compute node hostname robustification (the Bobcat spec doesn't seem controversial).
- We agreed on a few things about the server show command and the use of the SDK. Some existing OSC patches may require a bit of rework, but they would cover most of the concerns we discussed.
- We should fix the post-copy issue when live-migrating a paused instance by removing the post-copy migration flag in that case.
- We could enable virtio-blk trim support by default. It's both a bugfix (we want to remove an exception) and a new feature (we want to enable it by default), so we'll discuss at an upcoming meeting whether we need a specless blueprint.
- We also discussed a generic vDPA support feature for Nova, and we asked the owner to provide a spec explaining the use case.
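The post-copy fix mentioned above can be sketched roughly as follows. The flag constant mirrors libvirt's VIR_MIGRATE_POSTCOPY but is defined locally for illustration; the actual fix in Nova's libvirt driver will look different.

```python
# Sketch of the agreed fix: post-copy cannot complete for a paused
# guest (the switchover relies on a running guest faulting memory
# pages in), so strip the flag when the instance is paused.

VIR_MIGRATE_LIVE = 1          # mirrors libvirt constants, for illustration
VIR_MIGRATE_POSTCOPY = 1 << 15

def adjust_migration_flags(flags, power_state):
    if power_state == "paused":
        # Fall back to plain pre-copy live migration for paused guests.
        flags &= ~VIR_MIGRATE_POSTCOPY
    return flags
```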
### That's it. ###
Yet again, I'm impressed. If you read this sentence, you're done reading. I just hope your coffee was good and that the time you took reading this email was worth it. If you have questions or want to reach me directly on IRC, that's simple: I'm bauzas on #openstack-nova.
HTH,
-Sylvain (on behalf of the whole Nova community)
[2] Unfortunately, I'm very good at creating bugs as I discovered that based on my 10-year unfortunate contributions to OpenStack.
[3] Or a tea, or whatever you prefer.