[kolla][operators] Critical regression in Wallaby
radoslaw.piliszek at gmail.com
Sat Aug 28 17:07:55 UTC 2021
Dear Operators of Kolla-based deployments,
There is a critical regression in the current Kolla Ansible Wallaby code.
On distros that do not use cgroups v2 (so CentOS, Ubuntu and Debian
Buster, but not Debian Bullseye), it results in an environment that
shuts down VMs on each libvirtd container stop or restart.
The fix is already available. 
Please apply it to your Kolla Ansible installation if you are using Wallaby.
Do note that the fix only takes effect after redeploying, which means
the redeployment itself will still trigger the buggy behaviour one last time!
What to do if you have already deployed Wallaby?
First of all, make sure you don't accidentally take an action that
stops nova_libvirt (including restarts: both manual and those applied
by Kolla Ansible due to user-requested changes).
Please apply the patch above but don't rush with redeploying!
Redeploy each compute node separately (or in batches if you prefer) -
using the --limit command-line parameter - and always make sure you have
first migrated the relevant VMs out of the nodes that are going to get
redeployed.
This way you can safely fix an existing deployment.
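As a rough illustration of the node-by-node procedure, here is a minimal
Python sketch that only prints a redeployment plan; it does not run
anything. The inventory name "multinode", the host names and the use of
the "deploy" action are assumptions made for the example. Adjust them to
whatever you normally run, and migrate the VMs with your usual tooling
before executing each printed command.

```python
import shlex
from typing import Iterator, List, Sequence


def batches(hosts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield list(hosts[start:start + size])


def redeploy_command(inventory: str, batch: Sequence[str]) -> List[str]:
    """Build the kolla-ansible invocation restricted to one batch.

    Assumption: "deploy" is the action used for redeploying; the hosts
    are passed to --limit as a comma-separated Ansible pattern.
    """
    return ["kolla-ansible", "-i", inventory, "deploy",
            "--limit", ",".join(batch)]


def plan(inventory: str, hosts: Sequence[str], size: int = 1) -> List[str]:
    """Return the plan as text lines: a migration reminder, then the command."""
    lines = []
    for batch in batches(hosts, size):
        lines.append("# first migrate all VMs away from: " + ", ".join(batch))
        lines.append(shlex.join(redeploy_command(inventory, batch)))
    return lines


if __name__ == "__main__":
    # Hypothetical inventory and compute node names.
    for line in plan("multinode", ["compute01", "compute02", "compute03"], size=2):
        print(line)
```

Printing the plan instead of executing it keeps the operator in control:
the restart of nova_libvirt on a node should only happen once that node
no longer hosts any VMs.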
We will be working on improving the testing to avoid such issues in the future.
Thanks to Ignazio Cassano for noticing and reporting the issue.
I have triaged and analysed it, proposing a fix afterwards.