Hi,

Are you using SSL connections? Could it be this issue?
https://bugs.launchpad.net/ubuntu/+source/oslo.messaging/+bug/1800957

________________________________
From: Laurent Dumont <laurentfdumont@gmail.com>
Sent: Wednesday, July 31, 2019 4:20 PM
To: Grant Morley
Cc: openstack-operators@lists.openstack.org
Subject: Re: Slow instance launch times due to RabbitMQ

That is a bit strange; list_queues should return something.

A couple of ideas:

* Are the Rabbit connection failure logs on the compute nodes pointing to a specific controller?
* Are there any logs within Rabbit on the controller that would point to a transient issue?
* cluster_status is a snapshot of the cluster at the time you ran the command. If the alarms have cleared, you won't see anything.
* If you have the RabbitMQ management plugin activated, I would recommend a quick look at the historical metrics and overall status.

On Wed, Jul 31, 2019 at 9:35 AM Grant Morley <grant@civo.com> wrote:

Hi guys,

We are using Ubuntu 16 and OpenStack-Ansible for our setup.

rabbitmqctl list_queues
Listing queues

(There don't appear to be any queues.)

rabbitmqctl cluster_status
Cluster status of node 'rabbit@management-1-rabbit-mq-container-b4d7791f'
[{nodes,[{disc,['rabbit@management-1-rabbit-mq-container-b4d7791f',
                'rabbit@management-2-rabbit-mq-container-b455e77d',
                'rabbit@management-3-rabbit-mq-container-1d6ae377']}]},
 {running_nodes,['rabbit@management-3-rabbit-mq-container-1d6ae377',
                 'rabbit@management-2-rabbit-mq-container-b455e77d',
                 'rabbit@management-1-rabbit-mq-container-b4d7791f']},
 {cluster_name,<<"openstack">>},
 {partitions,[]},
 {alarms,[{'rabbit@management-3-rabbit-mq-container-1d6ae377',[]},
          {'rabbit@management-2-rabbit-mq-container-b455e77d',[]},
          {'rabbit@management-1-rabbit-mq-container-b4d7791f',[]}]}]

Regards,

On 31/07/2019 11:49, Laurent Dumont wrote:

Could you forward the output of the following commands on a controller node?
rabbitmqctl cluster_status
rabbitmqctl list_queues

You won't necessarily see a high load on a Rabbit cluster that is in a bad state.

On Wed, Jul 31, 2019 at 5:19 AM Grant Morley <grant@civo.com> wrote:

Hi all,

We are randomly seeing slow instance launch and deletion times, and it appears to be because of RabbitMQ. We are seeing a lot of these messages in the Nova and Neutron logs:

ERROR oslo.messaging._drivers.impl_rabbit [-] [f4ab3ca0-b837-4962-95ef-dfd7d60686b6] AMQP server on 10.6.2.212:5671 is unreachable: Too many heartbeats missed. Trying again in 1 seconds. Client port: 37098: ConnectionForced: Too many heartbeats missed

The RabbitMQ cluster isn't under high load, and I am not seeing any packets dropped on the network when I do some tracing.

We are only running 15 compute nodes currently and have >1000 instances so it isn't a large deployment.

Are there any good configuration tweaks for RabbitMQ running on OpenStack Queens?

Many Thanks,

--
Grant Morley
Cloud Lead, Civo Ltd
www.civo.com
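
One of the ideas raised in this thread is to check whether the connection failures on the compute nodes point to a specific controller. The following is only a rough sketch of how that could be tallied from the Nova or Neutron logs. It assumes the log lines contain the same "AMQP server on <host>:<port> is unreachable" text as the error quoted above; the function name `count_unreachable` is made up for illustration and is not part of oslo.messaging.

```python
import re
from collections import Counter

# Assumption: matches the "AMQP server on <host>:<port> is unreachable"
# fragment from the error quoted in this thread. The \S* tolerates
# mail-client residue such as "<http://...>" glued onto the port.
UNREACHABLE = re.compile(
    r"AMQP server on (?P<host>[\w.\-]+):(?P<port>\d+)\S* is unreachable"
)


def count_unreachable(lines):
    """Count 'is unreachable' errors per AMQP endpoint (host:port).

    `lines` is any iterable of log lines, e.g. an open log file.
    """
    counts = Counter()
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            counts[f"{match['host']}:{match['port']}"] += 1
    return counts
```

Passing an open log file to `count_unreachable` and printing `.most_common()` on the result would show whether one controller address dominates or whether the errors are spread evenly across the cluster.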