[Openstack-operators] ha queues Juno periodic rabbitmq errors

Kevin Bringard (kevinbri) kevinbri at cisco.com
Thu May 14 15:49:03 UTC 2015



On 5/14/15, 9:45 AM, "Pedro Sousa" <pgsousa at gmail.com> wrote:

>Hi Kevin,
>
>
>thank you for the reply. I'm using rabbitmqctl set_policy HA '^(?!amq\.).*'
>'{"ha-mode": "all"}'
>
>
>I will test with "ha-sync-mode":"automatic" and net.ipv4.tcp_retries2=5

I don't know that you need to set ha-sync-mode to automatic (I was just
using an example I found quickly on the internets), but I do think the
tcp_retries2 thing will help. I think we may have even set ours to 3...
I'd have to check. But fiddle with it until it times out dead connections
quickly without producing false positives.

There's a doc RH wrote about it. It's specific to Oracle, but should be
portable.

https://www.redhat.com/promo/summit/2010/presentations/summit/decoding-the-code/fri/scott-945-tuning/summit_jbw_2010_presentation.pdf
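
For a rough feel for what a given tcp_retries2 value means in wall-clock
time, here's a back-of-the-envelope sketch in shell. It is an estimate and
not what the kernel literally runs: it assumes the retransmission timeout
starts at the 200 ms floor, doubles on every retry, and is capped at 120
seconds. The function name is made up for illustration, and real timeouts
will be longer on links with a higher round-trip time.

```shell
#!/bin/sh
# Estimate how long Linux keeps retransmitting on a dead TCP connection
# for a given net.ipv4.tcp_retries2 value.
# Assumptions (not measured): initial RTO = 200 ms, doubling per retry,
# capped at 120 s. Treat the result as a lower bound.

tcp_retries2_timeout_ms() {   # hypothetical helper name
    retries=$1
    rto_ms=200                # assumed minimum retransmission timeout
    rto_max_ms=120000         # assumed maximum retransmission timeout
    total_ms=0
    i=0
    # The connection is given up once the timer after the last retry
    # expires, so retries + 1 intervals are summed.
    while [ "$i" -le "$retries" ]; do
        total_ms=$((total_ms + rto_ms))
        rto_ms=$((rto_ms * 2))
        if [ "$rto_ms" -gt "$rto_max_ms" ]; then
            rto_ms=$rto_max_ms
        fi
        i=$((i + 1))
    done
    echo "$total_ms"
}

for n in 3 5 15; do
    echo "tcp_retries2=$n -> about $(tcp_retries2_timeout_ms "$n") ms"
done
```

Under those assumptions, 3 retries is about 3 seconds, 5 is about 12.6
seconds, and the default of 15 is about 924.6 seconds (roughly 15 minutes),
which matches the figure in the kernel's ip-sysctl documentation. To try a
value, "sysctl -w net.ipv4.tcp_retries2=5" changes it on the running
system; add the setting to /etc/sysctl.conf to keep it across reboots.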

>
>
>Regards,
>Pedro Sousa
>
>On Thu, May 14, 2015 at 4:29 PM, Kevin Bringard (kevinbri)
><kevinbri at cisco.com> wrote:
>
>If you're using Rabbit 3.x you need to enable HA queues via policy on the
>rabbit server side.
>
>Something like this:
>
>rabbitmqctl set_policy ha-all ""
>'{"ha-mode":"all","ha-sync-mode":"automatic"}'
>
>
>Obviously, tailor it to your own needs :-)
>
>We've also found that tcp_retries2 needs to be turned way down: when a
>rabbit node reboots, it takes quite some time for the remote hosts to
>realize it's gone and tear down their connections.
>
>On 5/14/15, 9:23 AM, "Pedro Sousa" <pgsousa at gmail.com> wrote:
>
>>Hi all,
>>
>>
>>I'm using Juno and occasionally see this kind of error when I reboot one
>>of my rabbit nodes:
>>
>>
>>"MessagingTimeout: Timed out waiting for a reply to message ID
>>e95d4245da064c779be2648afca8cdc0"
>>
>>
>>I use ha queues in my openstack services:
>>
>>
>>rabbit_hosts=192.168.113.206:5672,192.168.113.207:5672,192.168.113.208:5672
>>
>>rabbit_ha_queues=True
>>
>>
>>
>>Has anyone experienced this issue? Is this an oslo bug or related?
>>
>>
>>Regards,
>>Pedro Sousa



