[openstack-dev] How to debug OpenStack RabbitMQ message not consumed issues?

unicell unicell at gmail.com
Wed Dec 19 12:05:06 UTC 2012


Hi,

I'm running into an AMQP messaging issue in which the 'run_instance' RPC is
never invoked on the nova-compute side. It happens very rarely, and I hope
someone can shed some light on how to follow up and debug it.

SYMPTOMS
--
* 10.81.44.230 is the controller node, which runs RabbitMQ, MySQL and
Nova-API
* 10.46.178.20 is the compute node, which runs nova-compute
* After nova boot --image <imageid> --flavor <flavorid> test-server, the
running nova-compute service never receives the message

* Messages cast (by the scheduler) to this nova-compute host are never
consumed (2 messages are still waiting in the queue)
* RabbitMQ lists '0' consumers for the queue (the consumers column should
show '1')

> root@10.81.44.230:~# rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers memory
> ...
> compute.10.46.178.20  2       0       0       34504
> ...
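
The zero-consumer check above can be scripted so it does not depend on
reading the listing by eye. A minimal sketch in Python, assuming the five
columns appear in the order requested on the command line and that queue
names contain no whitespace. The function name and the second sample row
(scheduler) are made up for illustration, not taken from the real output.

```python
def find_stalled_queues(listing):
    """Return (name, messages_ready) for queues with backlog but no consumers.

    Assumed column order (matching the rabbitmqctl command above):
    name, messages_ready, messages_unacknowledged, consumers, memory.
    """
    stalled = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip banner/footer lines and anything that is not a 5-column row.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
            continue
        name, ready, _unacked, consumers, _memory = fields
        if int(ready) > 0 and int(consumers) == 0:
            stalled.append((name, int(ready)))
    return stalled


# First data row is from the listing above; the scheduler row is hypothetical.
sample = """Listing queues ...
compute.10.46.178.20\t2\t0\t0\t34504
scheduler\t0\t0\t1\t9000
...done.
"""
print(find_stalled_queues(sample))  # [('compute.10.46.178.20', 2)]
```

Feeding it the captured output of the rabbitmqctl command from a periodic
job would flag the stuck queue as soon as messages pile up with no consumer.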


* Connections to the RabbitMQ server are still in ESTABLISHED state
[root@10.46.178.20 log]# lsof -i | grep nova
nova-comp  4498   stack   13u  IPv4 180448      0t0  TCP 10.46.178.20:42974->10.81.44.230:mysql (ESTABLISHED)
nova-comp  4498   stack   14u  IPv4  21119      0t0  TCP 10.46.178.20:51564->10.81.44.230:amqp (ESTABLISHED)
nova-comp  4498   stack   15u  IPv4  21721      0t0  TCP 10.46.178.20:51570->10.81.44.230:amqp (ESTABLISHED)

* A RabbitMQ port check from the compute node, "nc -vz 10.81.44.230 5672",
succeeds
* The scheduler (10.81.44.230) still receives compute service updates from
the compute node (10.46.178.20) via the message queue
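
For reference, the nc port check can be reproduced from Python, which is
handy when running it from the same environment as nova-compute. This is
only a sketch: the helper name port_open and the 3-second default timeout
are my own choices. A successful connect only shows that the broker port is
reachable; it says nothing about whether a consumer is registered on the
queue.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and applies the timeout to connect().
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

On the compute node, port_open("10.81.44.230", 5672) would be the
equivalent of the nc command above.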

* Restarting nova-compute resolves the issue.

QUESTIONS
--
It happens very rarely and is hard to reproduce. Once it happens:
1. Which part should I check or look into?
2. How can I check whether the _consumer_thread eventlet is still trying to
consume messages? After all, "rabbitmqctl list_queues consumers" prints 0
for this compute.host queue.
3. Is there any way to restore message consumption without restarting the
nova-compute service?

Thanks!

Best Regards,
--
Qiu Yu
http://www.unicell.info