[nova][neutron][oslo][ops] rabbit bindings issue

Massimo Sgaravatto massimo.sgaravatto at gmail.com
Sat Aug 8 07:36:40 UTC 2020


We also see the issue. When it happens, stopping and restarting the
RabbitMQ cluster usually helps.

I thought the problem was caused by a wrong setting in the OpenStack
services' conf files: I was missing these settings (which I am now going
to add):

[oslo_messaging_rabbit]
rabbit_ha_queues = true
amqp_durable_queues = true
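
A minimal sketch, in Python, of how one could check whether a service conf
file already has both options set. The helper name missing_rabbit_options
is hypothetical and not part of any OpenStack tool, and the check only
parses the file text with the standard library; it does not verify what
the running service or the RabbitMQ side actually uses.

```python
import configparser

# The two options from the snippet above, with the value we expect for each.
REQUIRED = {"rabbit_ha_queues": "true", "amqp_durable_queues": "true"}


def missing_rabbit_options(conf_text):
    """Return the names of required options that are absent or not 'true'.

    conf_text is the content of a service conf file (e.g. nova.conf).
    """
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)

    missing = []
    for name, wanted in REQUIRED.items():
        # fallback=None also covers a missing [oslo_messaging_rabbit] section.
        value = parser.get("oslo_messaging_rabbit", name, fallback=None)
        if value is None or value.strip().lower() != wanted:
            missing.append(name)
    return missing
```

It could be run over each conf file in turn, for example
missing_rabbit_options(open(path).read()), where path is whatever file the
service in question reads.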

Cheers, Massimo


On Sat, Aug 8, 2020 at 6:34 AM Fabian Zimmermann <dev.faz at gmail.com> wrote:

> Hi,
>
> we also have this issue.
>
> Our solution was (up to now) to delete the queues with a script or even
> reset the complete cluster.
>
> We just upgraded rabbitmq to the latest version - without luck.
>
> Anyone else seeing this issue?
>
>  Fabian
>
>
>
> Arnaud Morin <arnaud.morin at gmail.com> wrote on Thu, Aug 6, 2020, 16:47:
>
>> Hey all,
>>
>> I would like to ask the community about a rabbit issue we have from time
>> to time.
>>
>> In our current architecture, we have a cluster of rabbits (3 nodes) for
>> all our OpenStack services (mostly nova and neutron).
>>
>> When one node of this cluster is down, the cluster continues working (we
>> use the pause_minority strategy).
>> But sometimes the third server is not able to recover automatically
>> and needs manual intervention.
>> After this intervention, we restart the rabbitmq-server process, which
>> is then able to rejoin the cluster.
>>
>> At this time, the cluster looks ok, everything is fine.
>> BUT, nothing works.
>> Neutron and nova agents are not able to report back to servers.
>> They appear dead.
>> Servers seem unable to consume messages.
>> The exchanges, queues and bindings look fine in rabbit.
>>
>> What we see is that removing bindings (using rabbitmqadmin delete
>> binding or the web interface) and recreating them (using the same
>> routing key) brings the service back up and running.
>>
>> Doing this for all queues is really painful. Our next plan is to
>> automate it, but has anyone in the community already seen this kind
>> of issue?
>>
>> Our bug looks like the one described in [1].
>> Someone there recommends creating an Alternate Exchange.
>> Has anyone already tried that?
>>
>> FYI, we are running rabbit 3.8.2 (with OpenStack Stein).
>> We had the same kind of issues using older versions of rabbit.
>>
>> Thanks for your help.
>>
>> [1]
>> https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk
>>
>> --
>> Arnaud Morin
>>
>>
>>
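
The manual fix described in the quoted message (delete a binding, then
recreate it with the same routing key) could be scripted roughly as below.
This is only a sketch under assumptions: it talks to the RabbitMQ
management HTTP API (the plugin behind the web interface) using the Python
standard library, the helper names binding_urls and rebind are
hypothetical, and each binding is assumed to be a dict shaped like one
entry returned by GET /api/bindings (vhost, source, destination,
routing_key, properties_key, arguments). It handles exchange-to-queue
bindings only and has not been tested against a broken cluster.

```python
import base64
import json
import urllib.parse
import urllib.request


def binding_urls(base_url, binding):
    """Build (delete_url, create_url) for one exchange-to-queue binding."""
    def esc(part):
        # Encode every reserved character, so vhost "/" becomes "%2F".
        return urllib.parse.quote(part, safe="")

    create_url = "{}/api/bindings/{}/e/{}/q/{}".format(
        base_url.rstrip("/"),
        esc(binding["vhost"]),
        esc(binding["source"]),
        esc(binding["destination"]),
    )
    delete_url = create_url + "/" + esc(binding["properties_key"])
    return delete_url, create_url


def rebind(base_url, user, password, binding, opener=urllib.request.urlopen):
    """Delete one binding, then recreate it with the same routing key."""
    delete_url, create_url = binding_urls(base_url, binding)
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    headers = {
        "Authorization": "Basic " + token,
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "routing_key": binding["routing_key"],
        "arguments": binding.get("arguments", {}),
    }).encode()

    # Order matters: remove the old binding first, then create the new one.
    opener(urllib.request.Request(delete_url, headers=headers, method="DELETE"))
    opener(urllib.request.Request(create_url, data=body, headers=headers,
                                  method="POST"))
```

The opener parameter exists only so the request sequence can be exercised
without a live broker. Bindings on the default exchange (empty source)
cannot be deleted and would have to be skipped by the caller.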