[kolla] RabbitMQ High Availability
Tan Tran Trong
gk.coltech at gmail.com
Mon Jul 25 03:43:05 UTC 2022
My RMQ version is: 3.8.32
I deployed the xena version using kolla-ansible on Ubuntu 20.04.
Right now my cluster is running with no HA policy and amqp_durable_queues = False. When I
shut down 1 controller and create an instance, I get this error on nova-scheduler:
2022-07-25 10:36:41.496 688 ERROR root
oslo_messaging.exceptions.MessageDeliveryFailure: Unable to connect to AMQP
server on x.x.x.x:5672 after inf tries: Queue.declare: (404) NOT_FOUND -
home node 'rabbit@control02' of durable queue
'scheduler' in vhost '/' is down or inaccessible
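For anyone checking which queues are pinned to the dead node, here is a minimal sketch that pulls the node, queue and vhost out of this kind of message. The regex and the function name are my own, modelled only on the error text above (with the node name written as rabbit@control02, the form RabbitMQ prints), so treat it as an assumption rather than anything oslo.messaging provides.

```python
import re

# Hypothetical pattern, derived only from the error text quoted in this thread.
NOT_FOUND_RE = re.compile(
    r"home node '(?P<node>[^']+)' of durable queue\s+'(?P<queue>[^']+)' "
    r"in vhost '(?P<vhost>[^']+)' is down or inaccessible"
)


def parse_home_node_error(message):
    """Return node, queue and vhost from a 'home node ... is down' error, or None."""
    # Collapse line wraps so a message split across log lines still matches.
    flat = " ".join(message.split())
    match = NOT_FOUND_RE.search(flat)
    return match.groupdict() if match else None


sample = (
    "Queue.declare: (404) NOT_FOUND - home node 'rabbit@control02' of durable queue\n"
    "'scheduler' in vhost '/' is down or inaccessible"
)
print(parse_home_node_error(sample))
```

Running this over the scheduler log should show whether every failing queue has its home on the controller that was shut down.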
On Sun, Jul 24, 2022 at 1:20 AM Satish Patel <satish.txt at gmail.com> wrote:
> Something is wrong with your OpenStack version or RabbitMQ version. Make sure you
> are not dealing with a bug. I have a 3-node cluster and it always survives if I
> shut down one of the controller nodes. It works perfectly fine without issue,
> with either the HA or the non-HA config.
> What version of OpenStack and RabbitMQ are you running?
> On Jul 23, 2022, at 1:29 PM, Tan Tran Trong <gk.coltech at gmail.com> wrote:
> Thank you guys for your links. Actually I moved from no durable queues +
> no HA policy to durable queues + ha-all policy. The result is still the
> same. I tried tuning using
> https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit but
> I am still missing something, I guess.
> @Albert: Have you tested the case where you shut down 1 controller -> things
> work -> power it on -> shut down another controller? In my case the cluster
> is not stable after that.
> And by "work fine" you mean you don't have to do anything (restart
> rabbitmq, restart openstack services) when 1 controller is down, do you? I
> know it sounds silly, but we ended up using the internal keepalived VIP only for
> all transport settings, which removes load balancing but keeps my cluster
> stable when 1 node is down. I really don't know if it will cause trouble later
> when the cluster grows.
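To make the two transport settings concrete, here is a minimal sketch that builds an oslo.messaging-style transport_url either for all controllers or for a single VIP. The addresses, user name and password are placeholders, and the helper name is hypothetical; the URL layout (comma-separated user:password@host:port entries, then the vhost) is the one I understand oslo.messaging to accept, so please verify against your generated config.

```python
from urllib.parse import quote


def build_transport_url(hosts, user, password, port=5672, vhost="/"):
    """Build a rabbit:// transport_url listing one or more RabbitMQ hosts."""
    # Percent-encode credentials so characters such as '@' do not break the URL.
    creds = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = ",".join(f"{creds}@{host}:{port}" for host in hosts)
    # With the default vhost "/", the URL ends in "//".
    return f"rabbit://{netloc}/{vhost}"


# Placeholder addresses (documentation range), not taken from any real cluster.
controllers = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
vip = "192.0.2.10"

print(build_transport_url(controllers, "openstack", "secret"))  # one entry per controller
print(build_transport_url([vip], "openstack", "secret"))        # single VIP entry
```

With the multi-host form the client picks among the listed brokers itself; with the VIP form every service talks to whichever node currently holds the VIP, which is the trade-off described above.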
> On Fri, Jul 22, 2022 at 10:53 PM Albert Braden <ozzzo at yahoo.com> wrote:
>> The default RMQ config is broken. You're on the right track with setting
>> durable_queues, but there's more to do. I'm running kolla Train with
>> mirrored/durable queues and my clusters work fine with a controller down.
>> One issue that we faced after setting durable was that we weren't running
>> redis, and then when we tried to run it the network was blocking the port,
>> but eventually we got it working.
>> Some have recommended not mirroring queues; I haven't tried that. If
>> anyone has successfully set up HA without mirrored queues, I'd be interested
>> to hear how you did it.
>> Here are some helpful links:
>> On Thursday, July 21, 2022, 02:42:42 PM EDT, Tan Tran Trong <
>> gk.coltech at gmail.com> wrote:
>> I'm trying to figure out how to configure RabbitMQ to make it highly
>> available. I have 3 controller nodes and 2 compute nodes, deployed with
>> kolla with a mostly default configuration. RabbitMQ is set to ha-all for all
>> queues on all nodes, with amqp_durable_queues = True.
>> My problem is that when I shut down 1 controller node (or 1 RabbitMQ container),
>> whether master or slave, the whole cluster becomes unstable. Some instances
>> cannot be created and are stuck on Scheduling or Block Device Mapping,
>> volumes are not shown or are stuck on creating, and the compute node is reported dead.
>> I'm looking for documentation on how OpenStack uses RabbitMQ, how
>> OpenStack behaves when a RabbitMQ node is down, and a way to make RabbitMQ HA in a
>> stable way. Do you have any recommendations?
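For reference, the "ha-all for all queues" setting mentioned above corresponds to a classic mirrored-queue policy. Here is a minimal sketch that assembles the matching rabbitmqctl set_policy call; the policy name, the ".*" pattern and the helper name are assumptions for illustration, and kolla may generate a different pattern in its own definitions.

```python
import json
import shlex


def ha_all_policy_command(vhost="/", name="ha-all", pattern=".*"):
    """Return the rabbitmqctl argv that mirrors every matching queue to all nodes."""
    # "ha-mode": "all" asks RabbitMQ to mirror matching queues on every cluster node.
    definition = {"ha-mode": "all"}
    return [
        "rabbitmqctl", "set_policy",
        "-p", vhost,
        "--apply-to", "queues",
        name, pattern, json.dumps(definition),
    ]


argv = ha_all_policy_command()
# Print a shell-quoted command line to run inside the RabbitMQ container.
print(shlex.join(argv))
```

The policy only controls mirroring; whether the queues themselves are declared durable still comes from amqp_durable_queues on the OpenStack side, so the two settings need to be changed together.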