Any update on this bug https://bugs.launchpad.net/nova/+bug/2076614 ?

On Tue, 22 Apr 2025, 19:45 Eugen Block, <eblock@nde.ag> wrote:
Found one for the nova-manage issue:
https://bugs.launchpad.net/nova/+bug/2076614
But it's triaged without any details about a recommended workaround, apart from the ideas mentioned in the last comment.
Quoting Eugen Block <eblock@nde.ag>:
Thanks, yeah we did debug something similar months ago when we upgraded from V to W or something. In our production cloud, there was no such instance (00000000-0000-0000-0000-000000000000), so we created that entry to satisfy the nova db requirements. This is a virtual lab environment where the instance was present (that's why we didn't face that issue during the test upgrade).
I guess I could update the instances table with a valid compute_id since the error from nova-manage is:
2025-04-22 13:18:08.322 19087 ERROR nova.objects.instance [None req-542f4e62-e738-49aa-aebd-4803b61585ac - - - - - -] [instance: 00000000-0000-0000-0000-000000000000] Unable to migrate instance because host None with node None not found: nova.exception.ComputeHostNotFound: Compute host None could not be found.
I'll check for existing bug reports.
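For reference, this is roughly how I'd inspect that row before touching anything; the column names are assumed from the nova schema and not verified here, so treat it as a sketch rather than a recipe:

# Read-only look at the placeholder instance and the compute_nodes it could map to
# (assumes the usual MySQL/MariaDB backend and the default 'nova' database)
mysql -e "SELECT uuid, host, node, compute_id, deleted FROM nova.instances WHERE uuid='00000000-0000-0000-0000-000000000000';"
mysql -e "SELECT id, host, hypervisor_hostname FROM nova.compute_nodes;"

If host and node really are NULL, there is nothing for populate_instance_compute_id to map the row to, so it would probably have to be archived or marked deleted rather than given a compute_id; I'd want the bug report to confirm that before doing anything in production.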
Quoting engineer2024 <engineerlinux2024@gmail.com>:
I faced a similar issue from A to C when running nova db online migrations. Strangely, some process in nova is automatically creating a row in the nova.instances table with the uuid '0000-0000-00000' and all other column values for this entry as NULL.
On Tue, 22 Apr 2025, 19:22 Eugen Block, <eblock@nde.ag> wrote:
Hi Allison,
I was able to test the upgrade twice today (2 control nodes, 1 compute node). The first attempt was from A to C (SLURP). I had several issues along the way. I haven't been able to look closely for root causes yet, and I haven't checked for existing bugs either, but I wanted to share anyway:
1. nova-manage db online_data_migrations: one entry doesn't migrate (1 rows matched query populate_instance_compute_id, 0 migrated)
2. cinder-manage db online_data_migrations: Running batches of 50 until complete.
2025-04-22 11:01:30.558 837433 WARNING py.warnings [None req-daa04f7f-7daf-4b46-ac78-eff68794cae5 - - - - - -] /usr/lib/python3/dist-packages/cinder/db/sqlalchemy/api.py:8620: SAWarning: Coercing Subquery object into a select() for use in IN(); please pass a select() construct explicitly
filter(admin_meta_table.id.in_(ids_query)).\
+------------------------------------------------+----------------+-------------+
| Migration | Total Needed | Completed |
|------------------------------------------------+----------------+-------------|
| remove_temporary_admin_metadata_data_migration | 0 | 0 |
+------------------------------------------------+----------------+-------------+
3. Some neutron agents (dhcp, metadata) don't properly start until all control nodes are upgraded; I needed to stop and start them again (a restart sketch follows this list). I still need to look for more details in the logs.
4. Horizon has stopped working entirely. I only get a "400 Bad Request", no matter what I do. When the packages were upgraded, I saw a warning "No local_settings file found", but it is there:
root@controller02:~# ll /usr/lib/python3/dist-packages/openstack_dashboard/local/local_settings.py
lrwxrwxrwx 1 root root 42 Jun 7 2024 /usr/lib/python3/dist-packages/openstack_dashboard/local/local_settings.py -> /etc/openstack-dashboard/local_settings.py
The dashboard_error.log also contains the warning that there's no local_settings file. I suspect it has to do with the Django change mentioned in the Caracal release notes (a quick ALLOWED_HOSTS check is sketched after this list):
Django 3.2 support was dropped. Django 3.2 ends its extended support in April 2024. Considering this horizon dropped Django 3.2 support and uses Django 4.2 as default.
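Regarding issue 3, my stopgap was simply to bounce the affected agents on the already-upgraded node; the unit names below are assumed from the Ubuntu packaging:

# restart the agents that came up unhealthy (Ubuntu/Debian unit names assumed)
systemctl restart neutron-dhcp-agent neutron-metadata-agent
# then verify they report as alive again
openstack network agent list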
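Regarding issue 4, one thing I still want to rule out (just a sketch, not verified on this setup): with DEBUG disabled, Django returns a bare 400 Bad Request whenever the request's Host header isn't covered by ALLOWED_HOSTS, so the dashboard config is the first place I'd look:

# check what the dashboard currently allows (path taken from the symlink above)
grep -n ALLOWED_HOSTS /etc/openstack-dashboard/local_settings.py
# after adjusting it if needed (e.g. to the controller's FQDN), reload the web server
systemctl reload apache2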
I decided to roll back the VMs to a previous snapshot (back to Antelope) and tried the upgrade to Bobcat. The result is better: I didn't see the neutron agents fail, and the dashboard is still usable. The only issues I still see are with 'nova-manage db online_data_migrations' and 'cinder-manage db online_data_migrations'.
If some or all of these issues are known, could anyone point me to the relevant bug reports? If these are new bugs, I can create reports for them. But as I mentioned, I don't have much information yet.
Thanks, Eugen
Quoting Allison Price <allison@openinfra.dev>:
Hi Eugen,
Excellent! Please keep me posted on how testing goes and then we can take the next steps in talking about your production environment.
Cheers, Allison
On Apr 21, 2025, at 3:10 PM, Eugen Block <eblock@nde.ag> wrote:
Hi and thank you for the links. We just upgraded to Antelope last week, and we plan to do the next upgrade directly to Caracal since we need one of the fixes. I’m planning to test the SLURP upgrade in a lab environment quite soon, maybe even this week. And if all goes well, we’ll upgrade our production quickly after that. I’d be happy to share my experience in this thread. :-)
Thanks! Eugen
Quoting Allison Price <allison@openinfra.dev>:
Hi everyone,

We have now published two case studies from OpenStack users who have implemented and are benefiting from the SLURP upgrade process[1]: Cleura[2] and Indiana University[3].

I wanted to follow up to see if there are other organizations who have implemented the SLURP upgrade process who would like to tell your story? If we can get a few more, I would love to schedule an OpenInfra Live to deep dive into the improvements made around OpenStack upgrades and what users stuck on older releases should know.

If you are interested, please let me know. And even if you’re not but you’re still using OpenStack, please remember to take the OpenStack User Survey[4] so we can learn more!

Thanks!
Allison

[1] https://docs.openstack.org/project-team-guide/release-cadence-adjustment.htm...
[2] https://superuser.openinfra.org/articles/streamlining-openstack-upgrades-man...
[3] https://superuser.openinfra.org/articles/streamlining-openstack-upgrades-a-c...