[ops][nova] How to update from N to N+3 (and manage nova services that don't start because they find too old compute nodes...)

Balazs Gibizer balazs.gibizer at est.tech
Mon Feb 28 11:14:54 UTC 2022



On Fri, Feb 25 2022 at 05:57:26 PM +0100, Massimo Sgaravatto 
<massimo.sgaravatto at gmail.com> wrote:
> Thanks
> This "disable_compute_service_check_for_ffu" option is not available 
> in xena, correct ?

Not yet, but I have now proposed a backport of that fix to 
stable/xena [1].

Cheers,
gibi

[1] https://review.opendev.org/c/openstack/nova/+/831174
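
For context, the startup check that this option relaxes can be pictured 
roughly as below. This is only a simplified sketch, not nova's actual 
code: the function name, its parameters and the stand-in exception class 
are hypothetical, and the numbers 40 and 54 are taken from the error 
message quoted further down in this thread.

```python
import logging

LOG = logging.getLogger(__name__)

# Value taken from the error message quoted in this thread:
# "the oldest supported service level is 54" (Wallaby).
OLDEST_SUPPORTED_SERVICE_LEVEL = 54
OLDEST_SUPPORTED_RELEASE = "Wallaby"


class TooOldComputeService(Exception):
    """Illustrative stand-in for nova.exception.TooOldComputeService."""


def check_compute_service_level(min_service_level,
                                disable_check_for_ffu=False):
    """Hypothetical, simplified model of the startup check.

    min_service_level is the lowest compute service version found in the
    deployment (None if there are no compute services at all).
    Returns True when the service may start; raises otherwise.
    """
    if (min_service_level is None
            or min_service_level >= OLDEST_SUPPORTED_SERVICE_LEVEL):
        return True

    msg = (
        "Current Nova version does not support computes older than %s "
        "but the minimum compute service level in your system is %d and "
        "the oldest supported service level is %d."
        % (OLDEST_SUPPORTED_RELEASE, min_service_level,
           OLDEST_SUPPORTED_SERVICE_LEVEL)
    )
    if disable_check_for_ffu:
        # Assumed behaviour of the workaround: the error becomes a warning.
        LOG.warning(msg)
        return True
    raise TooOldComputeService(msg)
```

With the values from this thread, check_compute_service_level(40) raises, 
while check_compute_service_level(40, disable_check_for_ffu=True) only 
logs a warning and lets the service start.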

> Cheers, Massimo
> 
> On Fri, Feb 25, 2022 at 5:15 PM Sean Mooney <smooney at redhat.com> 
> wrote:
>> On Fri, 2022-02-25 at 16:50 +0100, Massimo Sgaravatto wrote:
>>  > I had the chance to repeat this test
>>  > So the scenario is:
>>  >
>>  > 1) controller and compute nodes running train
>>  > 2) all services stopped in compute nodes
>>  > 3) controller updated: train-->ussuri-->victoria--> wallaby
>>  >
>>  > After that nova-conductor and nova-scheduler refuse to start [*]
>> 
>>  Yes, nova does not officially support N to N+3 upgrades;
>>  we started enforcing that a few releases ago.
>>  There is a workaround config option that we recently added that
>>  turns the error into a warning:
>>  https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.disable_compute_service_check_for_ffu
>>  That is one option. Alternatively, before you upgrade the
>>  controller you can force-down all the compute nodes.
>> 
>> 
>> 
>>  >
>>  > At that moment the nova-compute services were not running on the
>>  > compute nodes, and this was the status of the services table:
>>  >
>>  > mysql> select * from services where topic="compute";
>>  > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
>>  > | created_at          | updated_at          | deleted_at | id | host                        | binary       | topic   | report_count | disabled | deleted | disabled_reason                     | last_seen_up        | forced_down | version | uuid                                 |
>>  > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
>>  > | 2018-01-11 17:20:34 | 2022-02-25 09:09:17 | NULL       | 17 | compute-01.cloud.pd.infn.it | nova-compute | compute |     10250811 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:13 |           0 |      40 | 2f56b8cf-1190-4999-af79-6bcee695c653 |
>>  > | 2018-01-11 17:26:39 | 2022-02-25 09:09:49 | NULL       | 23 | compute-02.cloud.pd.infn.it | nova-compute | compute |     10439622 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:49 |           0 |      40 | fbe37dfd-4a6c-4da1-96e0-407f7f98c4c4 |
>>  > | 2018-01-11 17:27:12 | 2022-02-25 09:10:02 | NULL       | 24 | compute-03.cloud.pd.infn.it | nova-compute | compute |     10361295 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 |           0 |      40 | 3675f324-81dd-445a-b4eb-510726104be3 |
>>  > | 2021-04-06 12:54:42 | 2022-02-25 09:10:02 | NULL       | 25 | compute-04.cloud.pd.infn.it | nova-compute | compute |      1790955 |        1 |       0 | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 |           0 |      40 | e3e7af4d-b25b-410c-983e-8128a5e97219 |
>>  > +---------------------+---------------------+------------+----+-----------------------------+--------------+---------+--------------+----------+---------+-------------------------------------+---------------------+-------------+---------+--------------------------------------+
>>  > 4 rows in set (0.00 sec)
>>  >
>>  >
>>  >
>>  > Only after manually setting the version field of these entries to '54'
>>  > were nova-conductor and nova-scheduler able to start.
>>  >
>>  > Regards, Massimo
>>  >
>>  >
>>  > [*]
>>  > 2022-02-25 15:06:03.992 591600 CRITICAL nova [req-cc20f294-cced-434b-98cd-5bdf228a2a22 - - - - -] Unhandled error: nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova Traceback (most recent call last):
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/bin/nova-conductor", line 10, in <module>
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova     sys.exit(main())
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/cmd/conductor.py", line 46, in main
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova     topic=rpcapi.RPC_TOPIC)
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in create
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova     utils.raise_if_old_compute()
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1098, in raise_if_old_compute
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova     oldest_supported_service=oldest_supported_service_level)
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
>>  > 2022-02-25 15:06:03.992 591600 ERROR nova
>>  >
>>  >
>>  >
>>  > On Mon, Jan 10, 2022 at 6:00 PM Massimo Sgaravatto <
>>  > massimo.sgaravatto at gmail.com> wrote:
>>  >
>>  > > Dear all
>>  > >
>>  > > When we upgraded our Cloud from Rocky to Train we used the
>>  > > following procedure:
>>  > >
>>  > > 1) Shutdown of all services on the controller and compute nodes
>>  > > 2) Update from Rocky to Stein of controller (just to do the dbsyncs)
>>  > > 3) Update from Stein to Train of controller
>>  > > 4) Update from Rocky to Train of compute nodes
>>  > >
>>  > > We are trying to do the same to update from Train to Xena, but now
>>  > > there is a problem: the nova services on the controller node refuse
>>  > > to start because they find compute nodes that are too old (this is
>>  > > indeed a new feature, properly documented in the release notes).
>>  > > As a workaround we had to manually modify the "version" field of
>>  > > the compute nodes in the nova.services table.
>>  > >
>>  > > Is it ok, or is there a cleaner way to manage the issue ?
>>  > >
>>  > > Thanks, Massimo
>>  > >
>> 
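
As a read-only complement to the manual edit of the services table 
discussed above, the compute services that would trip the check can be 
listed with a query along these lines. This is a minimal sketch under 
stated assumptions: the helper names are hypothetical, an in-memory 
SQLite table with only a few of the columns stands in for the real 
nova.services table (which lives in MySQL here), and the threshold 54 
and the sample rows are the ones shown in this thread.

```python
import sqlite3

# Threshold taken from the error message quoted in this thread.
OLDEST_SUPPORTED_SERVICE_LEVEL = 54


def find_old_computes(conn, oldest_supported=OLDEST_SUPPORTED_SERVICE_LEVEL):
    """Return (host, version, forced_down) for compute services whose
    version is below the given level. Read-only: nothing is modified."""
    cur = conn.execute(
        "SELECT host, version, forced_down FROM services "
        "WHERE topic = 'compute' AND deleted = 0 AND version < ? "
        "ORDER BY host",
        (oldest_supported,),
    )
    return cur.fetchall()


def demo_connection():
    """Build a throwaway in-memory table mimicking a subset of the
    columns shown in the thread (hypothetical stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE services ("
        "host TEXT, topic TEXT, deleted INTEGER, "
        "forced_down INTEGER, version INTEGER)"
    )
    hosts = ["compute-0%d.cloud.pd.infn.it" % i for i in range(1, 5)]
    conn.executemany(
        "INSERT INTO services VALUES (?, 'compute', 0, 0, 40)",
        [(h,) for h in hosts],
    )
    return conn


if __name__ == "__main__":
    for host, version, forced_down in find_old_computes(demo_connection()):
        print("%s version=%d forced_down=%d" % (host, version, forced_down))
```

Against a real deployment the same SELECT would be run through whatever 
MySQL client is in use; the sketch deliberately avoids any UPDATE.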




