[ops][nova] How to update from N to N+3 (and manage nova services that don't start because they find too old compute nodes...)
Balazs Gibizer
balazs.gibizer at est.tech
Mon Feb 28 11:14:54 UTC 2022
On Fri, Feb 25 2022 at 05:57:26 PM +0100, Massimo Sgaravatto
<massimo.sgaravatto at gmail.com> wrote:
> Thanks.
> This "disable_compute_service_check_for_ffu" option is not available
> in xena, correct?
Not yet, but I have now proposed a backport of that fix to
stable/xena [1].
Cheers,
gibi
[1] https://review.opendev.org/c/openstack/nova/+/831174
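
For reference, a minimal sketch of how the workaround discussed in this
thread might look in nova.conf. The option name and the [workarounds]
group are taken from the documentation link quoted below; that it goes in
the nova.conf read by the controller services, and that "true" is the
value wanted during the upgrade, are assumptions rather than something
confirmed in this thread.

```ini
[workarounds]
# Assumption: set in the nova.conf used by nova-conductor and
# nova-scheduler, the services that refuse to start in the report below.
# Per the quoted reply, this turns the TooOldComputeService startup
# error into a warning.
disable_compute_service_check_for_ffu = true
```

The option only exists in releases that carry the fix, so it would not be
recognized on stable/xena until the backport above is merged.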
> Cheers, Massimo
>
> On Fri, Feb 25, 2022 at 5:15 PM Sean Mooney <smooney at redhat.com>
> wrote:
>> On Fri, 2022-02-25 at 16:50 +0100, Massimo Sgaravatto wrote:
>> > I had the chance to repeat this test
>> > So the scenario is:
>> >
>> > 1) controller and compute nodes running train
>> > 2) all services stopped in compute nodes
>> > 3) controller updated: train-->ussuri-->victoria--> wallaby
>> >
>> > After that, nova-conductor and nova-scheduler refuse to start [*]
>>
>> Yes, nova does not officially support N to N+3 upgrades;
>> we started enforcing that a few releases ago.
>> There is a workaround config option that we recently added that
>> turns the error into a warning:
>> https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.disable_compute_service_check_for_ffu
>> That is one option. Alternatively, before you upgrade the
>> controller, you can force-down all the compute nodes.
>>
>>
>>
>> >
>> > At that moment nova-compute services were not running on the compute nodes,
>> > and this was the status of the services table:
>> >
>> > mysql> select * from services where topic="compute";
>> >
>> > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
>> > | created_at          | updated_at          | deleted_at | id | host                        | binary       | topic   | report_count | disabled | deleted | disabled_reason                     | last_seen_up        | forced_down | version | uuid                                 |
>> > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
>> > | 2018-01-11 17:20:34 | 2022-02-25 09:09:17 | NULL       | 17 | compute-01.cloud.pd.infn.it | nova-compute | compute |     10250811 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:13 |           0 |      40 | 2f56b8cf-1190-4999-af79-6bcee695c653 |
>> > | 2018-01-11 17:26:39 | 2022-02-25 09:09:49 | NULL       | 23 | compute-02.cloud.pd.infn.it | nova-compute | compute |     10439622 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:49 |           0 |      40 | fbe37dfd-4a6c-4da1-96e0-407f7f98c4c4 |
>> > | 2018-01-11 17:27:12 | 2022-02-25 09:10:02 | NULL       | 24 | compute-03.cloud.pd.infn.it | nova-compute | compute |     10361295 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 |           0 |      40 | 3675f324-81dd-445a-b4eb-510726104be3 |
>> > | 2021-04-06 12:54:42 | 2022-02-25 09:10:02 | NULL       | 25 | compute-04.cloud.pd.infn.it | nova-compute | compute |      1790955 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 |           0 |      40 | e3e7af4d-b25b-410c-983e-8128a5e97219 |
>> > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
>> > 4 rows in set (0.00 sec)
>> >
>> >
>> >
>> > Only after manually setting the version field of these entries to '54'
>> > were nova-conductor and nova-scheduler able to start.
>> >
>> > Regards, Massimo
>> >
>> >
>> > [*]
>> > 2022-02-25 15:06:03.992 591600 CRITICAL nova [req-cc20f294-cced-434b-98cd-5bdf228a2a22 - - - - -] Unhandled error: nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
>> > 2022-02-25 15:06:03.992 591600 ERROR nova Traceback (most recent call last):
>> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/bin/nova-conductor", line 10, in <module>
>> > 2022-02-25 15:06:03.992 591600 ERROR nova     sys.exit(main())
>> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/cmd/conductor.py", line 46, in main
>> > 2022-02-25 15:06:03.992 591600 ERROR nova     topic=rpcapi.RPC_TOPIC)
>> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in create
>> > 2022-02-25 15:06:03.992 591600 ERROR nova     utils.raise_if_old_compute()
>> > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1098, in raise_if_old_compute
>> > 2022-02-25 15:06:03.992 591600 ERROR nova     oldest_supported_service=oldest_supported_service_level)
>> > 2022-02-25 15:06:03.992 591600 ERROR nova nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
>> > 2022-02-25 15:06:03.992 591600 ERROR nova
>> >
>> >
>> >
>> > On Mon, Jan 10, 2022 at 6:00 PM Massimo Sgaravatto <
>> > massimo.sgaravatto at gmail.com> wrote:
>> >
>> > > Dear all
>> > >
>> > > When we upgraded our Cloud from Rocky to Train we followed this
>> > > procedure:
>> > >
>> > > 1) Shutdown of all services on the controller and compute nodes
>> > > 2) Update from Rocky to Stein of controller (just to do the dbsyncs)
>> > > 3) Update from Stein to Train of controller
>> > > 4) Update from Rocky to Train of compute nodes
>> > >
>> > > We are trying to do the same to update from Train to Xena, but now
>> > > there is a problem because nova services on the controller node
>> > > refuse to start since they find too old compute nodes (this is
>> > > indeed a new feature, properly documented in the release notes).
>> > > As a workaround we had to manually modify the "version" field of
>> > > the compute nodes in the nova.services table.
>> > >
>> > > Is it ok, or is there a cleaner way to manage the issue?
>> > >
>> > > Thanks, Massimo
>> > >
>>