<div dir="ltr">Hello, fuelers.<div><br></div><div>I'm using Fuel 4.1A + Havana in HA mode.<br><div><br></div><div>I repeatedly observe (on other deployments as well) an issue with a stuck "nova-compute" service. However, I think the problem is more fundamental and relates to HA RabbitMQ and the OpenStack AMQP driver implementation.</div>
<div><br></div><div>Symptoms:</div><div><ul><li>From time to time, a random nova-compute service is marked as "XXX" for a while.</li><li>The service itself appears to work properly. Its logs show it sending status updates to the conductor, but nothing is actually sent.</li>
<li>"netstat" shows that all connections to/from RabbitMQ are "ESTABLISHED".</li><li>rabbitmqctl shows that the "compute.node-x" queue is synced to all slaves.<br></li><li>Nothing was broken beforehand (the RabbitMQ cluster, etc.).</li>
</ul><div>Brute-force workaround:</div><div><ul><li>/etc/init.d/openstack-nova-compute restart</li></ul></div><div>I found a lot of interesting discussion (and proposed solutions) here:</div><div><br></div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px">
<div><div><a href="https://bugs.launchpad.net/oslo.messaging/+bug/856764">https://bugs.launchpad.net/oslo.messaging/+bug/856764</a></div></div></blockquote><div><div><br></div><div>My questions are:</div><div><ul><li>Are there any Fuel-specific ideas for solving or working around this issue?<br>
</li><li>Is there a quick fix for this in 4.1, such as adjusting the TCP keep-alive timeouts?</li></ul></div><div><br></div>-- <br><div dir="ltr">Roman Sokolkov,<div>Deployment Engineer,</div><div>Mirantis, Inc.<br>Skype rsokolkov,<br>
<a href="mailto:rsokolkov@mirantis.com" target="_blank">rsokolkov@mirantis.com</a><br></div></div>
</div></div></div>