[openstack-dev] [nova] Rabbit-mq 3.4 crashing (anyone else seen this?)

Alexey Lebedev alebedev at mirantis.com
Tue Jul 5 17:50:02 UTC 2016


Hi Joshua,

Does this happen with `rates_mode` set to `none` and a tuned
`collect_statistics_interval`? For example, as in
https://bugs.launchpad.net/fuel/+bug/1510835
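
For reference, a minimal sketch of the corresponding rabbitmq.config stanza
(classic Erlang-term format; the 30-second interval is only an illustrative
value, not a recommendation from that bug):

```
[
  {rabbit, [
    %% How often (in ms) statistics events are emitted; the default is 5000.
    {collect_statistics_interval, 30000}
  ]},
  {rabbitmq_management, [
    %% Skip message-rate calculations in the management database entirely.
    {rates_mode, none}
  ]}
].
```

As far as I know both are only read at startup, so a broker restart is needed
for them to take effect.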

High connection/channel churn during upgrade can cause such issues.
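
If you want a rough picture of the churn, a quick loop like this (just a
sketch; run it on the broker node and adjust the interval to taste) shows how
fast connection and channel counts move while the upgrade is going on:

```
# Sample connection and channel counts every 10s to spot churn.
while true; do
  conns=$(rabbitmqctl -q list_connections | wc -l)
  chans=$(rabbitmqctl -q list_channels | wc -l)
  echo "$(date -u +%FT%TZ) connections=$conns channels=$chans"
  sleep 10
done
```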

BTW, the soon-to-be-released RabbitMQ 3.6.3 contains several improvements
related to management plugin statistics handling, and almost every version
before it has also contained related fixes. I expect the upstream devs'
response will mention upgrading as well =)
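
Also, to double-check that it really is the management database eating the
memory (and not, say, binaries or queue processes), the memory breakdown from
rabbitmqctl is usually enough. A rough sketch (the exact key names in the
breakdown, e.g. mgmt_db, can differ slightly between versions):

```
# Per-category memory breakdown; look for the mgmt_db entry (in bytes).
rabbitmqctl status | grep -A 20 '{memory,'

# Total memory used by the Erlang VM, for comparison.
rabbitmqctl eval 'erlang:memory(total).'
```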

Best,
Alexey

On Tue, Jul 5, 2016 at 8:02 PM, Joshua Harlow <harlowja at fastmail.com> wrote:

> Hi ops and dev-folks,
>
> We over at GoDaddy (running RabbitMQ with OpenStack) have been hitting an
> issue that causes `rabbit_mgmt_db` to consume nearly all of the broker
> process's memory (after a given amount of time).
>
> We've been thinking that this bug (or bugs?) may have existed for a while,
> and that our dual-version path (where we upgrade the control plane and then
> slowly/eventually upgrade the compute nodes to the same version) has somehow
> triggered this memory leak. It has happened most prominently on the cloud
> that was running nova-compute at kilo with the other services at liberty
> (and thus exercising the versioned-objects code path more frequently, due
> to needing translations of objects).
>
> The RabbitMQ we are running is 3.4.0 on CentOS Linux release 7.2.1511 with
> kernel 3.10.0-327.4.4.el7.x86_64 (do note that upgrading to 3.6.2 seems to
> make the issue go away).
>
> # rpm -qa | grep rabbit
>
> rabbitmq-server-3.4.0-1.noarch
>
> The logs that seem relevant:
>
> ```
> **********************************************************
> *** Publishers will be blocked until this alarm clears ***
> **********************************************************
>
> =INFO REPORT==== 1-Jul-2016::16:37:46 ===
> accepting AMQP connection <0.23638.342> (127.0.0.1:51932 -> 127.0.0.1:5671)
>
> =INFO REPORT==== 1-Jul-2016::16:37:47 ===
> vm_memory_high_watermark clear. Memory used:29910180640 allowed:47126781542
> ```
>
> This happens quite often; the crashes have been affecting our cloud over
> the weekend (which made some dev/ops folks not so happy, especially given
> the July 4th mini-vacation).
>
> Looking to see if anyone else has seen anything similar.
>
> For those interested, this is the upstream bug/mail thread where I'm also
> seeking confirmation from the upstream users/devs (it also has the Erlang
> crash dumps attached/linked):
>
> https://groups.google.com/forum/#!topic/rabbitmq-users/FeBK7iXUcLg
>
> Thanks,
>
> -Josh
>


