[ops][nova] How to update from N to N+3 (and manage nova services that don't start because they find too old compute nodes...)
Dear all
When we upgraded our Cloud from Rocky to Train we used the following procedure:
- Shutdown of all services on the controller and compute nodes
- Update from Rocky to Stein of controller (just to do the dbsyncs)
- Update from Stein to Train of controller
- Update from Rocky to Train of compute nodes
We are trying to do the same to update from Train to Xena, but now there is a problem because the nova services on the controller node refuse to start since they find compute nodes that are too old (this is indeed a new feature, properly documented in the release notes). As a workaround we had to manually modify the "version" field of the compute nodes in the nova.services table.
Is it ok, or is there a cleaner way to manage the issue?
Thanks, Massimo
We are trying to do the same to update from Train to Xena, but now there is a problem because the nova services on the controller node refuse to start since they find compute nodes that are too old (this is indeed a new feature, properly documented in the release notes). As a workaround we had to manually modify the "version" field of the compute nodes in the nova.services table.
Is it ok, or is there a cleaner way to manage the issue?
I think this is an unintended consequence of the new check. Can you file a bug against nova and report the number here? We probably need to do something here...
Thanks!
--Dan
On Mon, 2022-01-10 at 18:00 +0100, Massimo Sgaravatto wrote:
Dear all
When we upgraded our Cloud from Rocky to Train we used the following procedure:
- Shutdown of all services on the controller and compute nodes
- Update from Rocky to Stein of controller (just to do the dbsyncs)
- Update from Stein to Train of controller
- Update from Rocky to Train of compute nodes
We are trying to do the same to update from Train to Xena, but now there is a problem because the nova services on the controller node refuse to start since they find compute nodes that are too old (this is indeed a new feature, properly documented in the release notes). As a workaround we had to manually modify the "version" field of the compute nodes in the nova.services table.
Is it ok, or is there a cleaner way to manage the issue?
The check is mainly implemented by https://github.com/openstack/nova/blob/0e0196d979cf1b8e63b9656358116a36f1f09...
I believe the intent was that this should only be an issue if the service reports as up, so you should be able to do the following:
1. Stop nova-compute on all nodes.
2. Wait for the compute services to be down, then stop the controllers.
3. Upgrade the controller directly to Xena, skipping all intermediary releases. (The db syncs have never needed to be done every release; we keep the migrations around for many releases. There are also no db changes between Train and Wallaby, and I don't think there are any in Xena either.)
4. Upgrade nova-compute on all compute nodes.
Looking at the code, however, I don't think we are checking the status of the services at all, so it is an absolute check. As a result you can no longer do FFU, which I'm surprised no one has complained about before.
This was implemented by https://github.com/openstack/nova/commit/aa7c6f87699ec1340bd446a7d47e1453847... in Wallaby.
Just to be clear: we have never actually supported having active nova services where the version mix is greater than N+1; we just started enforcing that in Wallaby.
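Spelled out as a rough shell sketch of that procedure (the systemd unit names, package names and use of dnf below are assumptions about an RPM-based deployment, not anything nova itself requires):

  # 1) On every compute node: stop nova-compute.
  systemctl stop openstack-nova-compute

  # 2) On the controller: wait until every nova-compute reports State = down,
  #    then stop the controller services.
  openstack compute service list --service nova-compute
  systemctl stop openstack-nova-api openstack-nova-scheduler openstack-nova-conductor

  # 3) Upgrade the controller packages straight to Xena and run the db syncs once.
  dnf upgrade -y 'openstack-nova*'
  nova-manage api_db sync
  nova-manage db sync
  nova-manage db online_data_migrations
  systemctl start openstack-nova-api openstack-nova-scheduler openstack-nova-conductor

  # 4) On every compute node: upgrade and restart nova-compute.
  dnf upgrade -y openstack-nova-compute
  systemctl start openstack-nova-compute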
Good to know that it is not necessary for nova to go through ALL intermediate releases and perform db-sync. The question is whether this is true for ALL OpenStack services (in our deployment the controller node is used for all services, not only nova).
Thanks, Massimo
Good to know that it is not necessary for nova to go through ALL intermediate releases and perform db-sync. The question is whether this is true for ALL OpenStack services (in our deployment the controller node is used for all services, not only nova).
Actually, Sean is wrong here - we do expect you to go through each release on the controller, it's just that it's rare that it's actually a problem. We have had blocker migrations at times in the past where we have had to ensure that data is migrated before changing or dropping items of schema. We also recently did a schema compaction, which wouldn't tolerate moving across the releases without the (correct) intermediate step.
We definitely should fix the problem related to compute records being old and causing the controllers to refuse to start. However, at the moment, you should still assume that each intermediate release needs to be db-sync'd unless you've tested that a particular source and target release works. I expect the same requirement for most other projects.
--Dan
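In practice, "db-sync each intermediate release" on the controller amounts to a loop along these lines; the centos-release-openstack-* repo packages and dnf are assumptions about an RDO-style install, so substitute whatever switches release repositories in your deployment:

  for release in ussuri victoria wallaby xena; do
      dnf install -y "centos-release-openstack-${release}"   # assumed repo-switch package
      dnf upgrade -y 'openstack-nova*'
      nova-manage api_db sync                  # API database schema
      nova-manage db sync                      # main/cell database schema
      nova-manage db online_data_migrations    # data migrations before the next hop
  done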
On Mon, 2022-01-10 at 10:50 -0800, Dan Smith wrote:
Good to know that it is not necessary for nova to go through ALL intermediate releases and perform db-sync. The question is whether this is true for ALL OpenStack services (in our deployment the controller node is used for all services, not only nova).
Actually, Sean is wrong here - we do expect you to go through each release on the controller, it's just that it's rare that it's actually a problem. We have had blocker migrations at times in the past where we have had to ensure that data is migrated before changing or dropping items of schema. We also recently did a schema compaction, which wouldn't tolerate moving across the releases without the (correct) intermediate step.
Dan is correct. You should run each one on the controller back to back. Between Train and Wallaby specifically we are in a special case where we just happen to not change the db in those releases. In Xena we started doing the db compaction, yes, and moving to alembic instead of sqlalchemy-migrate.
From a CLI point of view that is transparent at the nova-manage level, but it is still best to do it each release on the controller to ensure that transition happens correctly.
We definitely should fix the problem related to compute records being old and causing the controllers to refuse to start. However, at the moment, you should still assume that each intermediate release needs to be db-sync'd unless you've tested that a particular source and target release works. I expect the same requirement for most other projects.
We have not tested skipping them on the controllers, but I believe in this case it would work ok to go directly from Train to the Wallaby code base and do the db sync. Train to Xena may not work. If the start and end versions were different there is no guarantee that it would work, due to the blocker migrations, online migrations and eventual dropping of migration code that Dan mentioned. But yeah, unless you have tested it, it is better to assume you can't skip.
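As an optional sanity check between steps (not something mentioned in the thread, and assuming these subcommands exist in the installed release), nova-manage will report the current schema versions without exposing which migration engine is underneath:

  nova-manage api_db version   # current API database schema version
  nova-manage db version       # current main/cell database schema version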
On 10/01, Dan Smith wrote:
Good to know that it is not necessary for nova to go through ALL intermediate releases and perform db-sync. The question is whether this is true for ALL OpenStack services (in our deployment the controller node is used for all services, not only nova).
Actually, Sean is wrong here - we do expect you to go through each release on the controller, it's just that it's rare that it's actually a problem. We have had blocker migrations at times in the past where we have had to ensure that data is migrated before changing or dropping items of schema. We also recently did a schema compaction, which wouldn't tolerate moving across the releases without the (correct) intermediate step.
We definitely should fix the problem related to compute records being old and causing the controllers to refuse to start. However, at the moment, you should still assume that each intermediate release needs to be db-sync'd unless you've tested that a particular source and target release works. I expect the same requirement for most other projects.
--Dan
Hi,
Unrelated to this Nova issue, but relevant to why intermediate releases cannot be skipped in OpenStack: the Cinder project requires that the db sync and the online data migrations are run on each intermediate release.
You may be lucky and everything may run fine, but it could just as easily blow up in your face and lose database data.
Cheers, Gorka.
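For Cinder that per-release step would look roughly like the sketch below; the service unit names are assumptions about the deployment, and the package upgrade itself is left as a placeholder:

  systemctl stop openstack-cinder-api openstack-cinder-scheduler openstack-cinder-volume
  for release in ussuri victoria wallaby xena; do
      # ... upgrade the cinder packages to ${release} here ...
      cinder-manage db sync
      cinder-manage db online_data_migrations
  done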
I had the chance to repeat this test. So the scenario is:
- controller and compute nodes running train
- all services stopped in compute nodes
- controller updated: train --> ussuri --> victoria --> wallaby
After that nova-conductor and nova-scheduler refuse to start [*]
At that moment the nova-compute services were not running on the compute nodes, and this was the status of the services table:
mysql> select * from services where topic="compute";

created_at          | updated_at          | deleted_at | id | host                        | binary       | topic   | report_count | disabled | deleted | disabled_reason                     | last_seen_up        | forced_down | version | uuid
2018-01-11 17:20:34 | 2022-02-25 09:09:17 | NULL       | 17 | compute-01.cloud.pd.infn.it | nova-compute | compute | 10250811     | 1        | 0       | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:13 | 0           | 40      | 2f56b8cf-1190-4999-af79-6bcee695c653
2018-01-11 17:26:39 | 2022-02-25 09:09:49 | NULL       | 23 | compute-02.cloud.pd.infn.it | nova-compute | compute | 10439622     | 1        | 0       | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:09:49 | 0           | 40      | fbe37dfd-4a6c-4da1-96e0-407f7f98c4c4
2018-01-11 17:27:12 | 2022-02-25 09:10:02 | NULL       | 24 | compute-03.cloud.pd.infn.it | nova-compute | compute | 10361295     | 1        | 0       | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 | 0           | 40      | 3675f324-81dd-445a-b4eb-510726104be3
2021-04-06 12:54:42 | 2022-02-25 09:10:02 | NULL       | 25 | compute-04.cloud.pd.infn.it | nova-compute | compute | 1790955      | 1        | 0       | AUTO: Connection to libvirt lost: 1 | 2022-02-25 09:10:02 | 0           | 40      | e3e7af4d-b25b-410c-983e-8128a5e97219

4 rows in set (0.00 sec)
Only after manually setting the version field of these entries to '54', nova-conductor and nova-scheduler were able to start
Regards, Massimo
[*]
2022-02-25 15:06:03.992 591600 CRITICAL nova [req-cc20f294-cced-434b-98cd-5bdf228a2a22 - - - - -] Unhandled error: nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
2022-02-25 15:06:03.992 591600 ERROR nova Traceback (most recent call last):
2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/bin/nova-conductor", line 10, in <module>
2022-02-25 15:06:03.992 591600 ERROR nova     sys.exit(main())
2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/cmd/conductor.py", line 46, in main
2022-02-25 15:06:03.992 591600 ERROR nova     topic=rpcapi.RPC_TOPIC)
2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in create
2022-02-25 15:06:03.992 591600 ERROR nova     utils.raise_if_old_compute()
2022-02-25 15:06:03.992 591600 ERROR nova   File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1098, in raise_if_old_compute
2022-02-25 15:06:03.992 591600 ERROR nova     oldest_supported_service=oldest_supported_service_level)
2022-02-25 15:06:03.992 591600 ERROR nova nova.exception.TooOldComputeService: Current Nova version does not support computes older than Wallaby but the minimum compute service level in your system is 40 and the oldest supported service level is 54.
2022-02-25 15:06:03.992 591600 ERROR nova
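For reference, the manual workaround described above boils down to something like the following; 54 is the Wallaby service version quoted in the error, while the database name and access method are assumptions about the deployment, and this should be treated as a last-resort hack rather than a supported step:

  # Bump the recorded service version of the (stopped) nova-compute services so the
  # Wallaby controller services will start; 54 is the value from the error above.
  mysql nova -e "UPDATE services SET version = 54 WHERE topic = 'compute' AND deleted = 0;"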
On Fri, 2022-02-25 at 16:50 +0100, Massimo Sgaravatto wrote:
I had the chance to repeat this test. So the scenario is:
- controller and compute nodes running train
- all services stopped in compute nodes
- controller updated: train-->ussuri-->victoria--> wallaby
After that nova-conductor and nova-scheduler refuse to start [*]
Yes, nova does not officially support N to N+3 upgrades; we started enforcing that a few releases ago. There is a workaround config option that we recently added that turns the error into a warning: https://docs.openstack.org/nova/latest/configuration/config.html#workarounds... That is one option, or, before you upgrade the controller, you can force-down all the compute nodes.
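Sketched as commands, those two options would look something like this; the option name is the one given later in the thread, crudini is just one assumed way to edit nova.conf, and the host name is taken from the services table above:

  # Option 1: turn the TooOldComputeService startup error into a warning.
  crudini --set /etc/nova/nova.conf workarounds disable_compute_service_check_for_ffu true

  # Option 2: force each compute service down before upgrading the controller.
  openstack compute service set --down compute-01.cloud.pd.infn.it nova-compute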
Thanks. This "disable_compute_service_check_for_ffu" option is not available in Xena, correct? Cheers, Massimo
On Fri, Feb 25 2022 at 05:57:26 PM +0100, Massimo Sgaravatto massimo.sgaravatto@gmail.com wrote:
Thanks. This "disable_compute_service_check_for_ffu" option is not available in Xena, correct?
Not yet. But now I've proposed the backport of that fix to stable/xena[1]
Cheers, gibi
[1] https://review.opendev.org/c/openstack/nova/+/831174
participants (5)
- Balazs Gibizer
- Dan Smith
- Gorka Eguileor
- Massimo Sgaravatto
- Sean Mooney