[openstack-dev] [Fuel] HA cluster disk monitoring, failover and recovery

Bogdan Dobrelya bdobrelia at mirantis.com
Tue Nov 17 16:03:36 UTC 2015


On 17.11.2015 15:28, Kyrylo Galanov wrote:
> Hi Team,

Hello

> 
> I have been testing failover when free disk space drops below 512 MB
> (https://review.openstack.org/#/c/240951/).
> The affected node is stopped correctly and services migrate to a healthy node.
> 
> However, once free disk space rises above 512 MB again, the node does
> not recover its operating state. Moreover, starting the resources
> manually tends to fail. In short, the Pacemaker service or the node
> has to be restarted. Detailed information is available
> here: https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_configuration_basics_monitor_health.html
> 
> How do we address this issue?

According to the docs you provided:

"After a node's health status has turned to red, solve the issue that
led to the problem. Then clear the red status to make the node eligible
again for running resources. Log in to the cluster node and use one of
the following methods:

- Execute the following command:

    crm node status-attr NODE delete #health_disk

- Restart OpenAIS on that node.

- Reboot the node.

The node will be returned to service and can run resources again."

So this looks like expected behaviour!
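
As a rough sketch of the first method from the docs (not something Fuel
ships), a small helper could check that free space is back above the
threshold and then print the clearing command for an operator to run. The
function names, the MIN_FREE_MB variable and the choice of mount point are
assumptions made for illustration; only the crm command itself comes from
the quoted documentation. One detail to watch: when that command is typed
into sh or bash rather than the interactive crm shell, the attribute name
has to be quoted, because an unquoted word starting with '#' begins a
comment.

```shell
#!/bin/sh
# Sketch only: MIN_FREE_MB, free_mb, can_clear and clear_cmd are hypothetical
# names; 512 mirrors the threshold mentioned in this thread.
MIN_FREE_MB=512

# Free space, in whole MB, on the filesystem holding path $1.
# df -P -k prints POSIX-format output in 1024-byte blocks; field 4 is "Available".
free_mb() {
    df -P -k "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds only when the free space $1 (in MB) is above the threshold.
can_clear() {
    [ "$1" -gt "$MIN_FREE_MB" ]
}

# Print the command from the quoted docs for node $1 without running it.
# The attribute name is single-quoted because an unquoted word that starts
# with '#' begins a comment in sh and bash.
clear_cmd() {
    printf "crm node status-attr %s delete '#health_disk'\n" "$1"
}
```

For example, can_clear "$(free_mb /)" && clear_cmd "$(uname -n)" prints the
command for the local node once the root filesystem is above the threshold.
Whether the Pacemaker node name matches uname -n, and which filesystem the
health check actually watches, should be verified on the cluster in question.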

What else could be done:
- We should check whether this nuance is documented and, if it is not,
file a bug with the fuel-docs team.
- Filing a bug and inspecting the logs would be good to do as well.
I believe some optimizations are possible, bearing in mind the Pacemaker
cluster-recheck-interval and failure-timeout story [0].

[0]
http://blog.kennyrasschaert.be/blog/2013/12/18/pacemaker-high-failability/

> 
> 
> Best regards,
> Kyrylo


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


