On 06/06/2025 14:43, Thomas Goirand wrote:
On 6/4/25 09:38, Arnaud Morin wrote:
Here is the snippet of config that I use to enable quorums:
[oslo_messaging_rabbit]
amqp_durable_queues=True
rabbit_transient_queues_ttl=0
rabbit_quorum_queue=True
use_queue_manager=True
rabbit_stream_fanout=True
rabbit_transient_quorum_queue=True
rabbit_qos_prefetch_count=100
Thanks for the above. Looks like I had nearly all of them already.
I was trying to switch to quorum queues because RabbitMQ 4.x in Trixie doesn't support HA queues. It seems to work now, but I'm being hit hard by the fact that Nova is currently completely broken with Python 3.13 (due to Eventlet [1]). Indeed, because of it, a compute node cannot even fetch a token from Keystone to talk to Neutron anymore.
This increasingly worries me, as I'm not seeing any solution to it. :/ Disabling the garbage collector of course makes it work (which proves the author of the bug report is right in his diagnosis), but then everything in OpenStack leaks memory (obviously, this is more trouble than a solution).
The only solution from the OpenStack side that I can think of right now is to finish the initial series to support threading mode. The problem with that is that nova-compute will be the last part we do, because it is the hardest part to move, as it is the most reliant on eventlet to function. Our ETA to start on that is next cycle, so sometime in Q4 this year, probably around the time of the next PTG.
Because of how eventlet works, we can't simply freeze and unfreeze the garbage collector when we construct the REST clients, nor can we just disable the garbage collector and periodically run it when there are no greenthreads in flight. So unless we can fix eventlet, it's going to be a problem.
There is one thing you could try, which is to see if this also happens if you use the asyncio event loop in eventlet. I think that is still broken by an issue in oslo.logging, but I would suggest exporting EVENTLET_HUB=asyncio when running nova-compute and seeing if that helps.
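For reference, trying the asyncio hub could look roughly like the following. This is only a minimal sketch, assuming nova-compute is started by hand from the same shell; the command line and config path in the comment are illustrative assumptions, not taken from this thread, and will differ per deployment.

```shell
# Select eventlet's asyncio-based hub for processes started from this shell.
export EVENTLET_HUB=asyncio

# Confirm the variable is exported to child processes before starting the service.
env | grep '^EVENTLET_HUB='

# Then start nova-compute from this same shell the way you normally do.
# The command below is an assumed example, adjust it to your setup:
#   nova-compute --config-file /etc/nova/nova.conf
```

If nova-compute is run by a service manager rather than from a shell, the exported variable will not reach it; it would have to be set in the service's own environment instead.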
Cheers,
Thomas Goirand (zigo)