[openstack-dev] [Fuel-dev] [Fuel][RabbitMQ] nova-compute stuck for a while (AMQP)

Bogdan Dobrelya bdobrelia at mirantis.com
Thu May 8 11:22:56 UTC 2014


On 05/06/2014 10:42 PM, Roman Sokolkov wrote:
> Hello, fuelers.
> 
> I'm using Fuel 4.1A + Havana in HA mode.
> 
> I constantly observe (on other deployments as well) an issue with a stuck
> "nova-compute" service. But I think the problem is more fundamental and
> relates to HA RabbitMQ and the OpenStack AMQP driver implementation.
> 
> Symptoms:
> 
>   * From time to time a random nova-compute is marked as "XXX" for a while.
>   * The service itself appears to work properly. The logs show it
>     sending status updates to the conductor, but nothing is actually sent.
>   * "netstat" shows that all connections to/from rabbit are "ESTABLISHED".
>   * rabbitmqctl shows that the "compute.node-x" queue is synced to all slaves.
>   * Nothing had been broken beforehand (the RabbitMQ cluster, etc.).
> 
> Axe style solution:
> 
>   * /etc/init.d/openstack-nova-compute restart
> 
> Here I've found a lot of interesting material (and solutions):
> 
>     https://bugs.launchpad.net/oslo.messaging/+bug/856764
> 
> 
> My questions are:
> 
>   * Are there any Fuel-specific ideas for solving or working around this
>     issue?
>   * Is there any fast solution for this in 4.1, such as adjusting TCP
>     keep-alive timeouts?
> 
> 

I filed a bug for Fuel,
https://bugs.launchpad.net/fuel/+bug/1317488, and assigned it to the Fuel
hardening team. Feel free to update it as appropriate.
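
On the TCP keep-alive question: the sketch below only illustrates what
"adjusting keep-alive timeouts" means at the socket level, so that a dead peer
is detected in seconds instead of after the kernel defaults. It is not taken
from the AMQP driver, and whether it resolves the stuck nova-compute case has
not been confirmed here. The helper name enable_keepalive and the timer values
(30s idle, 5s interval, 3 probes) are assumptions chosen for illustration; the
TCP_KEEP* options are Linux-specific, so they are applied only when present.

```python
import socket


def enable_keepalive(sock, idle=30, interval=5, probes=3):
    """Enable TCP keep-alive on sock with short, illustrative timers.

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    probes:   unanswered probes before the connection is declared dead
    """
    # Portable switch that turns keep-alive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timers; these constants exist on Linux but not everywhere,
    # so skip any that the platform does not define.
    timers = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    )
    for name, value in timers:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keep-alive on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

With these example values a silent peer would be dropped after roughly
idle + interval * probes = 45 seconds. The same timers can be set system-wide
through the net.ipv4.tcp_keepalive_* sysctls, which affects only sockets that
have SO_KEEPALIVE enabled.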

> -- 
> Roman Sokolkov,
> Deployment Engineer,
> Mirantis, Inc.
> Skype rsokolkov,
> rsokolkov at mirantis.com <mailto:rsokolkov at mirantis.com>
> 
> 


-- 
Best regards,
Bogdan Dobrelya,
Skype #bogdando_at_yahoo.com
Irc #bogdando


