Dear all

I have been running multiple OpenStack production clusters for years. Currently, our clusters host approx. 600 instances, mainly for educational purposes.

We deploy the environment using Kolla-Ansible. The host OS is Ubuntu Jammy, and we use the Jammy-based Kolla containers.

At the moment we're running Bobcat (2023.2) and are testing the upgrade to Caracal (2024.1).

To make the problem reproducible, I set up a single-node test environment.

Host OS: Ubuntu Jammy running on KVM.
Kolla-Ansible Version: 18.1.1.dev1
Ansible-Version: 9.8.0 (ansible-core 2.16.9)
Kolla config: 

Deployment runs smoothly. 
Post deploy tests:

To set up networks, flavors, etc., I run an adapted init_runonce script, which creates an external (flat) network, an internal (geneve) network, and a router.

Spinning up an instance leads to a fatal error in Nova:

--- Dashboard ---
Error: Failed to perform requested operation on instance "test", the instance has an error status: Please try again later [Error: Build of instance 38488604-3d83-4676-b0c2-a4218696eba3 aborted: Failed to allocate the network(s), not rescheduling.].
---

--- /var/log/kolla/nova/nova-compute.log ---
...
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [None req-14f0d15f-5892-406b-a750-698ec54d9b70 ec93e00940164f30b5076a93c0a24722 499b452a194d4cbeb401302eaef1304a - - default default] [instance: 38488604-3d83-4676-b0c2-a4218696eba3] Instance failed to spawn: nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] Traceback (most recent call last):
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 8002, in _create_guest_with_network
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     with self.virtapi.wait_for_instance_event(
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/usr/lib/python3.10/contextlib.py", line 142, in __exit__
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     next(self.gen)
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 559, in wait_for_instance_event
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     self._wait_for_instance_events(
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 471, in _wait_for_instance_events
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     actual_event = event.wait()
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 436, in wait
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     instance_event = self.event.wait()
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/eventlet/event.py", line 124, in wait
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     result = hub.switch()
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/eventlet/hubs/hub.py", line 310, in switch
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     return self.greenlet.switch()
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] eventlet.timeout.Timeout: 300 seconds
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] 
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] During handling of the above exception, another exception occurred:
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] 
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] Traceback (most recent call last):
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 2885, in _build_resources
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     yield resources
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 2632, in _build_and_run_instance
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     self.driver.spawn(context, instance, image_meta,
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 4642, in spawn
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     self._create_guest_with_network(
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 8028, in _create_guest_with_network
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3]     raise exception.VirtualInterfaceCreateException()
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
2024-08-07 15:33:02.742 7 ERROR nova.compute.manager [instance: 38488604-3d83-4676-b0c2-a4218696eba3] 
2024-08-07 15:33:02.753 7 INFO nova.compute.manager [None req-14f0d15f-5892-406b-a750-698ec54d9b70 ec93e00940164f30b5076a93c0a24722 499b452a194d4cbeb401302eaef1304a - - default default] [instance: 38488604-3d83-4676-b0c2-a4218696eba3] Terminating instance
...
---
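
In case it helps others compare their own logs against this one: the traceback above shows an eventlet timeout after 300 seconds inside wait_for_instance_event, followed by the VirtualInterfaceCreateException. Below is a minimal Python sketch, assuming the log line format shown above, that reports which instances hit that combination. The function name summarize_vif_failures and the SAMPLE lines are only illustrative; they are not part of Nova or Kolla-Ansible.

```python
import re

# Matches the "[instance: <uuid>]" tag on per-instance nova log lines.
INSTANCE_RE = re.compile(r"\[instance: ([0-9a-f-]{36})\]")
# Matches the eventlet timeout line, e.g. "eventlet.timeout.Timeout: 300 seconds".
TIMEOUT_RE = re.compile(r"eventlet\.timeout\.Timeout: (\d+) seconds")
# Exception name as it appears in the traceback above.
VIF_ERROR = "nova.exception.VirtualInterfaceCreateException"


def summarize_vif_failures(lines):
    """Map instance UUID -> {"timeout_seconds": int or None, "vif_error": bool}.

    Only instances with at least one timeout or VIF error line are reported.
    """
    summary = {}
    for line in lines:
        instance = INSTANCE_RE.search(line)
        if instance is None:
            continue
        timeout = TIMEOUT_RE.search(line)
        has_vif_error = VIF_ERROR in line
        if timeout is None and not has_vif_error:
            continue
        entry = summary.setdefault(
            instance.group(1), {"timeout_seconds": None, "vif_error": False}
        )
        if timeout is not None:
            entry["timeout_seconds"] = int(timeout.group(1))
        if has_vif_error:
            entry["vif_error"] = True
    return summary


# Two lines copied from the log excerpt above, used as sample input.
SAMPLE = [
    "2024-08-07 15:33:02.742 7 ERROR nova.compute.manager "
    "[instance: 38488604-3d83-4676-b0c2-a4218696eba3] "
    "eventlet.timeout.Timeout: 300 seconds",
    "2024-08-07 15:33:02.742 7 ERROR nova.compute.manager "
    "[instance: 38488604-3d83-4676-b0c2-a4218696eba3] "
    "nova.exception.VirtualInterfaceCreateException: "
    "Virtual Interface creation failed",
]

for uuid, info in summarize_vif_failures(SAMPLE).items():
    print(uuid, info)
```

The function accepts any iterable of lines, so an open file handle on /var/log/kolla/nova/nova-compute.log can be passed in directly instead of SAMPLE.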

The same configuration works without errors under Bobcat (kolla-ansible: 17.4.0, ansible: 8.4.0).

When I upgrade a running Bobcat deployment to Caracal, I run into the same issue. When I run the upgrade with "--skip-tags openvswitch", the upgraded environment runs perfectly (with the Bobcat openvswitch containers). As soon as I upgrade those containers as well, the deployment is broken again.

I couldn't find a bug on Launchpad that points to this issue, nor anything in the Caracal release notes that solves my problem.

Is anyone else struggling with the same issue? Do you have any hints?

I'd appreciate any comments or help.

Thanks,
Remo Maurer