[Openstack-operators] [nova] Rabbit-mq 3.4 crashing (anyone else seen this?)

Matt Fischer matt at mattfischer.com
Tue Jul 5 17:29:58 UTC 2016


For the record, we're on 3.5.6-1.
On Jul 5, 2016 11:27 AM, "Mike Lowe" <jomlowe at iu.edu> wrote:

> I was having just this problem last week.  We updated from 3.5.4 to 3.6.2
> on Ubuntu and started seeing crashes due to excessive memory usage. Running
> 'rabbitmq-plugins disable rabbitmq_management' on each node of my rabbit
> cluster fixed it, and I haven't had any problems since.  From what I could
> gather from the rabbitmq mailing lists, the stats collection part of the
> management console is single threaded and can't keep up, hence the
> ever-growing memory usage from the ever-growing backlog of stats to be
> processed.
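>
> For reference, this is roughly what we ran on each node in turn (just the
> stock rabbitmq-plugins/rabbitmqctl tools; the grep at the end is only a
> quick sanity check on memory once the plugin is gone):
>
> ```
> # on every node of the cluster, one at a time
> rabbitmq-plugins disable rabbitmq_management
> # (on these versions a node restart may be needed for the change to apply)
>
> # verify the node is healthy and watch the memory figures settle
> rabbitmqctl status | grep -A 10 memory
> ```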
>
>
> > On Jul 5, 2016, at 1:02 PM, Joshua Harlow <harlowja at fastmail.com> wrote:
> >
> > Hi ops and dev-folks,
> >
> > We over at godaddy (running rabbitmq with openstack) have been hitting an
> > issue that causes `rabbit_mgmt_db` to consume nearly all of the process's
> > memory (after a given amount of time).
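> >
> > For anyone who wants to check whether it's the same process eating memory
> > on their cluster, something along these lines should work (a hedged
> > sketch: on the versions we run the mgmt DB is globally registered under
> > this name, but that may differ on yours):
> >
> > ```
> > # memory footprint and mailbox backlog of the mgmt stats DB process
> > rabbitmqctl eval \
> >   'erlang:process_info(global:whereis_name(rabbit_mgmt_db), [memory, message_queue_len]).'
> > ```
> >
> > A steadily growing message_queue_len would line up with the
> > single-threaded-collector theory above.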
> >
> > We've been thinking that this bug (or bugs?) may have existed for a
> > while, and that our dual-version path (where we upgrade the control plane
> > first and then slowly/eventually upgrade the compute nodes to the same
> > version) has somehow triggered this memory leak: it has happened most
> > prominently on our cloud that was running nova-compute at kilo with the
> > other services at liberty (thus exercising the versioned-objects code
> > path more heavily, due to needing translations of objects).
> >
> > The rabbit we are running is 3.4.0 on CentOS Linux release 7.2.1511 with
> > kernel 3.10.0-327.4.4.el7.x86_64 (do note that upgrading to 3.6.2 seems
> > to make the issue go away).
> >
> > # rpm -qa | grep rabbit
> >
> > rabbitmq-server-3.4.0-1.noarch
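> >
> > (Upstream will usually want the Erlang runtime version alongside the
> > rabbit version, since the broker's memory behaviour depends on it; on our
> > boxes that is just another rpm query:)
> >
> > ```
> > # capture the Erlang VM version for the bug report
> > rpm -qa | grep -i erlang
> > ```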
> >
> > The logs that seem relevant:
> >
> > ```
> > **********************************************************
> > *** Publishers will be blocked until this alarm clears ***
> > **********************************************************
> >
> > =INFO REPORT==== 1-Jul-2016::16:37:46 ===
> > accepting AMQP connection <0.23638.342> (127.0.0.1:51932 -> 127.0.0.1:5671)
> >
> > =INFO REPORT==== 1-Jul-2016::16:37:47 ===
> > vm_memory_high_watermark clear. Memory used:29910180640 allowed:47126781542
> > ```
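> >
> > (Those watermark numbers are easier to reason about in GiB; assuming the
> > default vm_memory_high_watermark of 0.4, the "allowed" figure implies
> > roughly 110 GiB of RAM on the box:)
> >
> > ```
> > # quick arithmetic on the figures from the log above
> > python -c 'print 29910180640/2.0**30, 47126781542/2.0**30, 47126781542/0.4/2.0**30'
> > # -> ~27.9 GiB used, ~43.9 GiB allowed, ~109.7 GiB total RAM implied
> > ```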
> >
> > This happens quite often; the crashes have been affecting our cloud over
> > the weekend (which made some dev/ops folks not so happy, especially given
> > the July 4th mini-vacation).
> >
> > Looking to see if anyone else has seen anything similar?
> >
> > For those interested, this is the upstream bug/mail thread where I'm also
> > seeking confirmation from the upstream users/devs (it has Erlang crash
> > dumps attached/linked):
> >
> > https://groups.google.com/forum/#!topic/rabbitmq-users/FeBK7iXUcLg
> >
> > Thanks,
> >
> > -Josh
> >