[ops][nova] How to update from N to N+3 (and manage nova services that don't start because they find too old compute nodes...)

Massimo Sgaravatto massimo.sgaravatto at gmail.com
Fri Feb 25 16:57:26 UTC 2022


Thanks.
The "disable_compute_service_check_for_ffu" option is not available in
Xena, correct?
Cheers, Massimo

On Fri, Feb 25, 2022 at 5:15 PM Sean Mooney <smooney at redhat.com> wrote:

> On Fri, 2022-02-25 at 16:50 +0100, Massimo Sgaravatto wrote:
> > I had the chance to repeat this test
> > So the scenario is:
> >
> > 1) controller and compute nodes running train
> > 2) all services stopped in compute nodes
> > 3) controller updated: train-->ussuri-->victoria--> wallaby
> >
> > After that, nova-conductor and nova-scheduler refuse to start [*]
>
> Yes, nova does not officially support N to N+3 upgrades;
> we started enforcing that a few releases ago.
> There is a workaround config option that we recently added that turns
> the error into a warning:
> https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.disable_compute_service_check_for_ffu
> That is one option; or, before you upgrade the controller,
> you can force-down all the compute nodes.
>
>
>
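For reference, the workaround option linked above lives in the [workarounds] section of nova.conf. A minimal sketch of switching it on with Python's standard configparser follows. The sample file contents and the helper name enable_ffu_workaround are illustrative assumptions, and whether a given release ships the option has to be checked against that release's configuration reference.

```python
import configparser
import io

# Hypothetical minimal nova.conf contents, for illustration only.
SAMPLE_NOVA_CONF = """\
[DEFAULT]
debug = false

[workarounds]
"""


def enable_ffu_workaround(conf_text: str) -> str:
    """Return conf_text with the FFU compute-service check workaround enabled."""
    # interpolation=None leaves any '%' characters in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("workarounds"):
        parser.add_section("workarounds")
    # Section and option name are taken from the configuration reference URL.
    parser.set("workarounds", "disable_compute_service_check_for_ffu", "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(enable_ffu_workaround(SAMPLE_NOVA_CONF))
```

Note that configparser drops comments and rejects repeated keys, both of which can appear in a real nova.conf, so on a production controller it is probably safer to add the line by hand or through the deployment tooling.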
> >
> > At that moment nova-compute services were not running on the compute nodes.
> > And this was the status of the services table:
> >
> > mysql> select * from services where topic="compute";
> >
> > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
> > | created_at          | updated_at          | deleted_at | id | host                        | binary       | topic   | report_count | disabled | deleted | disabled_reason                     | last_seen_up        | forced_down | version | uuid                                 |
> > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
> > | 2018-01-11 17:20:34 | 2022-02-25 09:09:17 | NULL       | 17 | compute-01.cloud.pd.infn.it | nova-compute | compute |     10250811 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:13 |           0 |      40 | 2f56b8cf-1190-4999-af79-6bcee695c653 |
> > | 2018-01-11 17:26:39 | 2022-02-25 09:09:49 | NULL       | 23 | compute-02.cloud.pd.infn.it | nova-compute | compute |     10439622 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:49 |           0 |      40 | fbe37dfd-4a6c-4da1-96e0-407f7f98c4c4 |
> > | 2018-01-11 17:27:12 | 2022-02-25 09:10:02 | NULL       | 24 | compute-03.cloud.pd.infn.it | nova-compute | compute |     10361295 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 |           0 |      40 | 3675f324-81dd-445a-b4eb-510726104be3 |
> > | 2021-04-06 12:54:42 | 2022-02-25 09:10:02 | NULL       | 25 | compute-04.cloud.pd.infn.it | nova-compute | compute |      1790955 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 |           0 |      40 | e3e7af4d-b25b-410c-983e-8128a5e97219 |
> > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
> > 4 rows in set (0.00 sec)
> >
> >
> >
> > Only after manually setting the version field of these entries to '54',
> > nova-conductor and nova-scheduler were able to start
> >
> > Regards, Massimo
> >
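Before touching the version column, it can help to list which compute services are below the required level with a read-only query. The sketch below runs that query against an in-memory SQLite stand-in rather than the real MySQL nova database. The reduced schema, the helper names, and the ROWS fixture (values copied from the table above) are assumptions for illustration; the threshold of 54 is the one nova reported when it refused to start.

```python
import sqlite3

# Oldest service level accepted by the upgraded controller in this thread.
OLDEST_SUPPORTED_VERSION = 54

# (host, version, deleted, forced_down), copied from the services table above.
ROWS = [
    ("compute-01.cloud.pd.infn.it", 40, 0, 0),
    ("compute-02.cloud.pd.infn.it", 40, 0, 0),
    ("compute-03.cloud.pd.infn.it", 40, 0, 0),
    ("compute-04.cloud.pd.infn.it", 40, 0, 0),
]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table holding a subset of the services columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE services ("
        "host TEXT, topic TEXT, version INTEGER, "
        "deleted INTEGER, forced_down INTEGER)"
    )
    conn.executemany(
        "INSERT INTO services (host, topic, version, deleted, forced_down) "
        "VALUES (?, 'compute', ?, ?, ?)",
        ROWS,
    )
    return conn


def too_old_computes(conn: sqlite3.Connection, oldest_supported: int):
    """Return (host, version) for non-deleted computes below the threshold."""
    cur = conn.execute(
        "SELECT host, version FROM services "
        "WHERE topic = 'compute' AND deleted = 0 AND version < ? "
        "ORDER BY host",
        (oldest_supported,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    for host, version in too_old_computes(build_demo_db(), OLDEST_SUPPORTED_VERSION):
        print(f"{host}: service version {version}")
```

The same SELECT should work unchanged in the mysql client, since it only reads the table.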
> >
> > [*]
> > 2022-02-25 15:06:03.992 591600 CRITICAL nova [req-cc20f294-cced-434b-98cd-5bdf228a2a22 - - - - -] Unhandled error: nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
> > 2022-02-25 15:06:03.992 591600 ERROR nova Traceback (most recent call last):
> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/bin/nova-conductor", line 10, in <module>
> > 2022-02-25 15:06:03.992 591600 ERROR nova     sys.exit(main())
> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/cmd/conductor.py", line 46, in main
> > 2022-02-25 15:06:03.992 591600 ERROR nova     topic=rpcapi.RPC_TOPIC)
> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in create
> > 2022-02-25 15:06:03.992 591600 ERROR nova     utils.raise_if_old_compute()
> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1098, in raise_if_old_compute
> > 2022-02-25 15:06:03.992 591600 ERROR nova     oldest_supported_service=oldest_supported_service_level)
> > 2022-02-25 15:06:03.992 591600 ERROR nova nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
> > 2022-02-25 15:06:03.992 591600 ERROR nova
> >
> >
> >
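The traceback above boils down to comparing the minimum recorded compute service level (40) with the oldest level the new controller accepts (54). A rough sketch of that kind of startup check is below. This is not nova's code: the names ServiceRecord, check_compute_versions and warn_only are hypothetical, warn_only stands in for the workaround option that downgrades the error to a warning, and skipping forced-down services is an assumption inferred from the force-down suggestion in this thread.

```python
import warnings
from dataclasses import dataclass


class TooOldComputeService(Exception):
    """Stand-in for the exception named in the traceback."""


@dataclass
class ServiceRecord:
    host: str
    version: int
    forced_down: bool = False


def check_compute_versions(services, oldest_supported, warn_only=False):
    """Return the minimum considered version, or None if nothing is considered.

    Raises TooOldComputeService when the minimum is below oldest_supported,
    unless warn_only is set, in which case a warning is emitted instead.
    """
    # Assumption: services that were forced down are left out of the check.
    considered = [s.version for s in services if not s.forced_down]
    if not considered:
        return None
    minimum = min(considered)
    if minimum < oldest_supported:
        message = (
            f"minimum compute service level is {minimum}, "
            f"oldest supported service level is {oldest_supported}"
        )
        if warn_only:
            warnings.warn(message)
        else:
            raise TooOldComputeService(message)
    return minimum


if __name__ == "__main__":
    computes = [ServiceRecord(f"compute-0{i}", 40) for i in range(1, 5)]
    # With warn_only=True the check only warns and startup can continue.
    print(check_compute_versions(computes, 54, warn_only=True))
```

Under these assumptions, both routes mentioned in the thread (the workaround option and forcing the computes down) let the controller services start without editing the version column by hand.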
> > On Mon, Jan 10, 2022 at 6:00 PM Massimo Sgaravatto <
> > massimo.sgaravatto at gmail.com> wrote:
> >
> > > Dear all
> > >
> > > When we upgraded our Cloud from Rocky to Train we followed the
> > > following procedure:
> > >
> > > 1) Shutdown of all services on the controller and compute nodes
> > > 2) Update from Rocky to Stein of controller (just to do the dbsyncs)
> > > 3) Update from Stein to Train of controller
> > > 4) Update from Rocky to Train of compute nodes
> > >
> > > We are trying to do the same to update from Train to Xena, but now there
> > > is a problem: the nova services on the controller node refuse to start
> > > because they find compute nodes that are too old (this is indeed a new
> > > feature, properly documented in the release notes).
> > > As a workaround we had to manually modify the "version" field of the
> > > compute nodes in the nova.services table.
> > >
> > > Is this OK, or is there a cleaner way to handle the issue?
> > >
> > > Thanks, Massimo
> > >
>
>