Dear all
When we upgraded our cloud from Rocky to Train, we followed this procedure:
1) Shut down all services on the controller and compute nodes
2) Update the controller from Rocky to Stein (just to run the db syncs)
3) Update the controller from Stein to Train
4) Update the compute nodes from Rocky to Train
We are trying to do the same to upgrade from Train to Xena, but now there is a problem: the nova services on the controller node refuse to start because they detect compute nodes whose service version is too old (this is indeed a new behaviour, properly documented in the release notes). As a workaround we had to manually modify the "version" field of the compute node records in the nova "services" table.
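For reference, a minimal sketch of that kind of change, assuming a MySQL backend and direct access to the nova database (host, credentials and the target service version below are placeholders, not values taken from our deployment):

    # Bump the service version recorded for the nova-compute services so
    # that the controller services agree to start.  Placeholder values:
    # adjust host/credentials and use the SERVICE_VERSION of the release
    # you are upgrading to.
    import pymysql

    TARGET_SERVICE_VERSION = 57  # placeholder value

    conn = pymysql.connect(host="controller", user="nova",
                           password="NOVA_DBPASS", database="nova")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE services SET version = %s "
                "WHERE `binary` = 'nova-compute' AND deleted = 0",
                (TARGET_SERVICE_VERSION,),
            )
        conn.commit()
    finally:
        conn.close()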
Is this OK, or is there a cleaner way to manage the issue?
On Mon, 2022-01-10 at 18:00 +0100, Massimo Sgaravatto wrote:

The check is mainly implemented by https://github.com/openstack/nova/blob/0e0196d979cf1b8e63b9656358116a36f1f09...

I believe the intent was that this should only be an issue if the service reports as up, so you should be able to do the following:
1) stop nova-compute on all nodes
2) wait for the compute services to be reported as down, then stop the controllers
3) upgrade the controller directly to Xena, skipping all intermediary releases (the db syncs have never needed to be run at every release; we keep the migrations around for many releases. There are also no db changes between Train and Wallaby, and I don't think there are any in Xena either)
4) upgrade nova-compute on all the compute nodes

Looking at the code, however, I don't think we are checking the status of the services at all, so it is an absolute check. As a result you can no longer do FFU, which I'm surprised nobody has complained about before. This was implemented by https://github.com/openstack/nova/commit/aa7c6f87699ec1340bd446a7d47e1453847... in Wallaby.

Just to be clear: we have never actually supported running active nova services where the version mix is greater than N+1; we just started enforcing that in Wallaby.
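If it helps, here is a minimal sketch (using the openstacksdk) of how you could wait for every nova-compute service to report as down before stopping and upgrading the controllers; the cloud name is just a placeholder for an entry in your clouds.yaml:

    # Poll the compute service list until every nova-compute service
    # reports as down; only then proceed with the controller upgrade.
    import time
    import openstack

    conn = openstack.connect(cloud="mycloud")  # placeholder cloud name

    def all_computes_down(connection):
        return all(
            svc.state == "down"
            for svc in connection.compute.services()
            if svc.binary == "nova-compute"
        )

    while not all_computes_down(conn):
        print("some nova-compute services still report as up, waiting...")
        time.sleep(10)
    print("all nova-compute services report as down")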
Thanks, Massimo