Slow instance launch times due to RabbitMQ

Gabriele Santomaggio g.santomaggio at gmail.com
Wed Jul 31 15:48:57 UTC 2019


Hi,
Are you using SSL connections?

Could it be this issue?
https://bugs.launchpad.net/ubuntu/+source/oslo.messaging/+bug/1800957



________________________________
From: Laurent Dumont <laurentfdumont at gmail.com>
Sent: Wednesday, July 31, 2019 4:20 PM
To: Grant Morley
Cc: openstack-operators at lists.openstack.org
Subject: Re: Slow instance launch times due to RabbitMQ

That is a bit strange; list_queues should return something. A couple of ideas:

  *   Are the Rabbit connection failure logs on the compute pointing to a specific controller?
  *   Are there any logs within Rabbit on the controller that would point to a transient issue?
  *   cluster_status is a snapshot of the cluster at the time you ran the command. If the alarms have cleared, you won't see anything.
  *   If you have the RabbitMQ management plugin activated, I would recommend a quick look to see the historical metrics and overall status.
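The first question above can be answered by tallying the "is unreachable" errors per broker address. Below is a minimal Python sketch, assuming the Nova/Neutron log lines follow the oslo.messaging format quoted later in this thread; the function name count_unreachable and the regular expression are illustrative and not part of any OpenStack tooling.

```python
import re
from collections import Counter

# Assumed log format, based on the error line quoted in this thread:
#   "... AMQP server on <host>:<port> is unreachable: <reason>. Trying again ..."
UNREACHABLE = re.compile(
    r"AMQP server on (?P<host>[\w.\-]+):(?P<port>\d+) "
    r"is unreachable: (?P<reason>[^.]+)\."
)


def count_unreachable(lines):
    """Count 'unreachable' errors per (broker host, reason) pair."""
    counts = Counter()
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            key = (match.group("host"), match.group("reason").strip())
            counts[key] += 1
    return counts


# Example with the line from this thread plus an unrelated line.
sample = [
    "ERROR oslo.messaging._drivers.impl_rabbit [-] "
    "[f4ab3ca0-b837-4962-95ef-dfd7d60686b6] AMQP server on 10.6.2.212:5671 "
    "is unreachable: Too many heartbeats missed. Trying again in 1 seconds. "
    "Client port: 37098: ConnectionForced: Too many heartbeats missed",
    "INFO nova.compute.manager [-] unrelated message",
]

for (host, reason), n in count_unreachable(sample).most_common():
    print(n, host, reason)
```

If one controller address dominates the tally, that points at a single Rabbit node (or the path to it) rather than the cluster as a whole. To run it against a real log, pass an open file object instead of the sample list.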

On Wed, Jul 31, 2019 at 9:35 AM Grant Morley <grant at civo.com> wrote:

Hi guys,

We are using Ubuntu 16 and OpenStack-Ansible for our setup.

rabbitmqctl list_queues
Listing queues

(There don't appear to be any queues.)

rabbitmqctl cluster_status

Cluster status of node 'rabbit at management-1-rabbit-mq-container-b4d7791f'
[{nodes,[{disc,['rabbit at management-1-rabbit-mq-container-b4d7791f',
                'rabbit at management-2-rabbit-mq-container-b455e77d',
                'rabbit at management-3-rabbit-mq-container-1d6ae377']}]},
 {running_nodes,['rabbit at management-3-rabbit-mq-container-1d6ae377',
                 'rabbit at management-2-rabbit-mq-container-b455e77d',
                 'rabbit at management-1-rabbit-mq-container-b4d7791f']},
 {cluster_name,<<"openstack">>},
 {partitions,[]},
 {alarms,[{'rabbit at management-3-rabbit-mq-container-1d6ae377',[]},
          {'rabbit at management-2-rabbit-mq-container-b455e77d',[]},
          {'rabbit at management-1-rabbit-mq-container-b4d7791f',[]}]}]

Regards,

On 31/07/2019 11:49, Laurent Dumont wrote:
Could you forward the output of the following commands, run on a controller node?

rabbitmqctl cluster_status
rabbitmqctl list_queues

You won't necessarily see a high load on a Rabbit cluster that is in a bad state.

On Wed, Jul 31, 2019 at 5:19 AM Grant Morley <grant at civo.com> wrote:

Hi all,

We are randomly seeing slow instance launch / deletion times and it appears to be because of RabbitMQ. We are seeing a lot of these messages in the logs for Nova and Neutron:

ERROR oslo.messaging._drivers.impl_rabbit [-] [f4ab3ca0-b837-4962-95ef-dfd7d60686b6] AMQP server on 10.6.2.212:5671 is unreachable: Too many heartbeats missed. Trying again in 1 seconds. Client port: 37098: ConnectionForced: Too many heartbeats missed

The RabbitMQ cluster isn't under high load, and I am not seeing any dropped packets on the network when I do some tracing.

We are only running 15 compute nodes currently and have >1000 instances so it isn't a large deployment.

Are there any good configuration tweaks for RabbitMQ running on OpenStack Queens?

Many Thanks,

--

Grant Morley
Cloud Lead, Civo Ltd
www.civo.com | Sign up for an account: https://www.civo.com/signup