[Openstack-operators] Way to check compute <-> rabbitmq connectivity

Kris G. Lindgren klindgren at godaddy.com
Thu Jan 15 16:07:45 UTC 2015


+1 on this.

In general, rabbitmq connectivity/failover is pretty terrible.  Services look to be connected to rabbitmq when in reality they aren't, so monitoring on the server to see whether it has an established connection to rabbitmq isn't enough.  Our experience is pretty much the same on anything that uses rabbitmq, not just nova-compute.  The issue seems to be that a service can still send messages but doesn't actually pull messages from its queue.  Also, when we restart a rabbit node in the cluster, connections typically have trouble re-establishing, and we need to restart most services to fix the issue.
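
Since an established TCP connection tells you little, one thing that may help is asking rabbitmq itself which queues have no consumer or a growing backlog. Below is a minimal sketch, not something we run as-is. It assumes the rabbitmq management plugin is enabled and reachable (15672 is its usual port), and that the compute queues are named "compute.<hostname>". The function names, the "compute." prefix default and the backlog threshold of 10 are made up for illustration. It will not catch every stale-connection case, since rabbit may still list a consumer that is no longer reading.

```python
import base64
import json
import urllib.parse
import urllib.request


def fetch_queues(base_url, user, password, vhost="/"):
    """Return the queue list for one vhost from the management HTTP API."""
    # The vhost must be URL-encoded; the default vhost "/" becomes "%2F".
    url = "%s/api/queues/%s" % (
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),
    )
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def suspect_queues(queues, prefix="compute.", max_backlog=10):
    """Names of matching queues with no consumer or too many waiting messages.

    The prefix and threshold are assumptions; adjust them to your deployment.
    """
    suspects = []
    for q in queues:
        if not q["name"].startswith(prefix):
            continue
        no_consumer = q.get("consumers", 0) == 0
        backlog = q.get("messages", 0) > max_backlog
        if no_consumer or backlog:
            suspects.append(q["name"])
    return sorted(suspects)
```

Usage would be something like suspect_queues(fetch_queues("http://rabbit-host:15672", "monitor", "secret")), where the host and credentials are placeholders, alerting whenever the returned list is non-empty.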
____________________________________________

Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.



From: Gustavo Randich <gustavo.randich at gmail.com>
Date: Thursday, January 15, 2015 at 8:34 AM
To: openstack-operators at lists.openstack.org
Subject: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

Hi,

I'm experiencing some issues with nova-compute services not responding to rabbitmq messages, despite the service reporting an OK state via periodic tasks. Apparently the TCP connection is open but in a stale or unresponsive state. This happens sporadically when there is a network problem that we don't yet understand. Restarting nova-compute solves the problem.

Is there any way, preferably via the OpenStack API, to probe service responsiveness, i.e., to check that the service actually consumes messages, so we can set up an alert?

Thanks in advance!
