The default RabbitMQ config is broken. You're on the right track with setting amqp_durable_queues, but there's more to do. I'm running kolla Train with mirrored and durable queues, and my clusters work fine with a controller down.

One issue we faced after enabling durable queues was that we weren't running redis. When we then tried to run it, the network was blocking the port, but eventually we got it working.

Some have recommended not mirroring queues; I haven't tried that. If anyone has successfully set up HA without mirrored queues, I'd be interested to hear how you did it.

Here are some helpful links:

https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
https://lists.openstack.org/pipermail/openstack-discuss/2021-November/026074...
https://lists.openstack.org/pipermail/openstack-discuss/2020-August/016362.h...
https://lists.openstack.org/pipermail/openstack-discuss/2020-August/016524.h...
https://review.opendev.org/c/openstack/kolla-ansible/+/822191
https://review.opendev.org/c/openstack/kolla-ansible/+/824994

On Thursday, July 21, 2022, 02:42:42 PM EDT, Tan Tran Trong <gk.coltech@gmail.com> wrote:

Hello,

I'm trying to figure out how to configure RabbitMQ to make it highly available. I have 3 controller nodes and 2 compute nodes, deployed with kolla with a mostly default configuration. RabbitMQ is set to ha-all for all queues on all nodes, with amqp_durable_queues = True.

My problem is that when I shut down 1 controller node (or 1 RabbitMQ container), whether master or slave, the whole cluster becomes unstable. Some instances cannot be created and get stuck in Scheduling or Block Device Mapping, volumes are not shown or are stuck in creating, compute nodes are randomly reported dead, and so on.

I'm looking for documentation on how OpenStack uses RabbitMQ, how OpenStack behaves when a RabbitMQ node is down, and how to make RabbitMQ HA in a stable way. Do you have any recommendations?

TIA,
Tan