<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 19 December 2012 20:05, unicell wrote:<br>
</div>
<blockquote
cite="mid:CAM_HZbC3VUNBU1w5x=V+i=ZSG_TorkSFQQrJCwrf6Y6PzOPr6w@mail.gmail.com"
type="cite">Hi,
<div><br>
</div>
<div>I'm running into an AMQP messaging issue that causes the
'run_instance' RPC to never be invoked on the nova-compute side. It
happens very rarely, and I hope someone can shed some light on how to
follow up and debug it. </div>
<div><br>
</div>
<div>SYMPTOMS</div>
<div>--</div>
<div>* 10.81.44.230 is the controller node, which runs RabbitMQ,
MySQL and Nova-API</div>
<div>* 10.46.178.20 is the compute node, which runs nova-compute</div>
<div>
<div>* nova boot --image &lt;imageid&gt; --flavor
&lt;flavorid&gt; test-server, and the running server never receives
the message</div>
</div>
<div><br>
</div>
<div>* Messages cast (from the scheduler) to this nova-compute host
never got consumed (2 messages left in the queue)</div>
<div>
<div>* and '0' consumers listed from the RabbitMQ perspective
(should be '1' in the consumers column)</div>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">root@10.81.44.230:~#
rabbitmqctl list_queues name messages_ready
messages_unacknowledged consumers memory<br>
...<br>
compute.10.46.178.20 2 0 0 34504<br>
...</blockquote>
<div>
<div><br>
</div>
<div>
<div>* Connection to RabbitMQ server still in ESTABLISHED
state</div>
</div>
<div>[root@10.46.178.20
log]# lsof -i | grep nova</div>
</div>
<div>
<div>nova-comp 4498 stack 13u IPv4 180448 0t0 TCP
10.46.178.20:42974->10.81.44.230:mysql (ESTABLISHED)</div>
<div>nova-comp 4498 stack 14u IPv4 21119 0t0 TCP
10.46.178.20:51564->10.81.44.230:amqp (ESTABLISHED)</div>
<div>nova-comp 4498 stack 15u IPv4 21721 0t0 TCP
10.46.178.20:51570->10.81.44.230:amqp (ESTABLISHED)</div>
</div>
</blockquote>
Could you also paste the output of "netstat -ant | grep -E
'Recv-Q|5672'"?<br>
The socket's Recv-Q may be full: if the user-space program (nova-compute
here) has stopped reading from the socket, the kernel's TCP receive
buffer fills up.<br>
While Recv-Q is full, the TCP connection stays ESTABLISHED but no
more data can be transferred.<br>
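To make the Recv-Q check repeatable, here is a minimal Python sketch
that scans "netstat -ant" output for connections on port 5672 that are
ESTABLISHED but still have unread bytes in Recv-Q. The function names
and the sample output (including the Recv-Q value) are hypothetical and
only for illustration. A non-zero Recv-Q can be normal for a moment, so
it is only suspicious if it stays high across several runs.<br>
<pre wrap="">
```python
# Hypothetical sample of "netstat -ant" output; the Recv-Q value 87380 is
# made up for illustration and was not taken from the affected host.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp    87380      0 10.46.178.20:51564      10.81.44.230:5672       ESTABLISHED
tcp        0      0 10.46.178.20:51570      10.81.44.230:5672       ESTABLISHED
"""


def parse_netstat(text, port=5672):
    """Return (recv_q, send_q, local, remote, state) tuples for TCP rows on `port`."""
    suffix = ":" + str(port)
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly six columns and start with tcp or tcp6;
        # header and banner lines are skipped.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        _proto, recv_q, send_q, local, remote, state = fields
        if not (local.endswith(suffix) or remote.endswith(suffix)):
            continue
        rows.append((int(recv_q), int(send_q), local, remote, state))
    return rows


def stalled(rows):
    """Keep ESTABLISHED connections whose Recv-Q is not empty."""
    return [r for r in rows if r[4] == "ESTABLISHED" and r[0] != 0]


if __name__ == "__main__":
    for row in stalled(parse_netstat(SAMPLE)):
        print(row)
```
</pre>
On the compute node, the real text could be fed in from
subprocess.check_output(["netstat", "-ant"]) decoded to a string
instead of SAMPLE.<br>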
<blockquote
cite="mid:CAM_HZbC3VUNBU1w5x=V+i=ZSG_TorkSFQQrJCwrf6Y6PzOPr6w@mail.gmail.com"
type="cite">
<div>
<div><br>
</div>
<div>* RabbitMQ port check from the compute node, "nc -vz
10.81.44.230 5672", succeeds</div>
<div>* Scheduler (10.81.44.230) can still receive compute service
updates from the compute node (10.46.178.20) via the message queue</div>
<div><br>
</div>
<div>* Restarting nova-compute resolves the issue.</div>
<div><br>
</div>
<div>QUESTIONS</div>
<div>--</div>
<div>It happens very rarely and is hard to reproduce. Once it
happens,</div>
<div>1. Which component should I check or look into?</div>
</div>
</blockquote>
I think it's nova-compute. We need to debug it while it is still running.<br>
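Since the problem is rare, one way to notice it while nova-compute is
still running is to watch for queues that hold messages but have no
consumers, as in the rabbitmqctl listing quoted earlier. Below is a
rough Python sketch that parses the output of "rabbitmqctl list_queues
name messages_ready messages_unacknowledged consumers memory". The
function names are hypothetical, and the "scheduler" row in the sample
is invented for contrast; only the compute.10.46.178.20 row comes from
the report.<br>
<pre wrap="">
```python
# Column order matches the rabbitmqctl invocation quoted in the report.
COLUMNS = ("name", "messages_ready", "messages_unacknowledged",
           "consumers", "memory")

# The compute row is from the report; the scheduler row is hypothetical.
SAMPLE = """\
Listing queues ...
compute.10.46.178.20\t2\t0\t0\t34504
scheduler\t0\t0\t1\t13800
...done.
"""


def parse_list_queues(text):
    """Turn list_queues output into a list of dicts, skipping banner lines."""
    queues = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        name, numbers = fields[0], fields[1:]
        # Every column after the queue name must be a plain integer.
        if not all(n.isdigit() for n in numbers):
            continue
        row = dict(zip(COLUMNS[1:], (int(n) for n in numbers)))
        row["name"] = name
        queues.append(row)
    return queues


def orphaned(queues):
    """Names of queues with ready messages but no consumer attached."""
    return [q["name"] for q in queues
            if q["messages_ready"] != 0 and q["consumers"] == 0]


if __name__ == "__main__":
    print(orphaned(parse_list_queues(SAMPLE)))
```
</pre>
Running something like this periodically on the controller would flag
the stuck queue early, so the affected nova-compute process can be
inspected before anyone restarts it.<br>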
<blockquote
cite="mid:CAM_HZbC3VUNBU1w5x=V+i=ZSG_TorkSFQQrJCwrf6Y6PzOPr6w@mail.gmail.com"
type="cite">
<div>
<div>2. How can I check whether the _consumer_thread eventlet is still
trying to consume messages? After all, "rabbitmqctl
list_queues consumers" prints 0 for this compute.host queue.</div>
</div>
</blockquote>
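Maybe with tcpdump: capture the traffic on the AMQP port (5672) and see
whether nova-compute is still exchanging data with RabbitMQ.<br>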
<blockquote
cite="mid:CAM_HZbC3VUNBU1w5x=V+i=ZSG_TorkSFQQrJCwrf6Y6PzOPr6w@mail.gmail.com"
type="cite">
<div>
<div>3. Is there any way to restore the message consumption
without restarting nova-compute service?</div>
</div>
</blockquote>
Not sure until we know how to reproduce the bug.<br>
<blockquote
cite="mid:CAM_HZbC3VUNBU1w5x=V+i=ZSG_TorkSFQQrJCwrf6Y6PzOPr6w@mail.gmail.com"
type="cite">
<div>
<div><br>
</div>
<div>Thanks!</div>
<div><br>
</div>
<div>Best Regards,</div>
<div>--</div>
</div>
<div>
<div>Qiu Yu<br>
<a moz-do-not-send="true" href="http://www.unicell.info">http://www.unicell.info</a></div>
<br>
</div>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Jian Wen
Software Engineer, Services and Support Team
Canonical, Ltd</pre>
</body>
</html>