[openstack-dev] [TripleO] Strategy for recovering crashed nodes in the Overcloud?

Ladislav Smola lsmola at redhat.com
Fri Jul 25 08:23:43 UTC 2014


Hi,

I believe you are looking for stack convergence in Heat. As far as I 
know, it is not fully implemented yet.
The blueprint is here: 
https://blueprints.launchpad.net/heat/+spec/convergence

Hope this helps.

Ladislav

On 07/23/2014 12:31 PM, Howley, Tom wrote:
>
> (Resending to properly start new thread.)
>
> Hi,
>
> I'm running an HA overcloud configuration and, as far as I'm aware, 
> there is currently no mechanism in place for restarting failed nodes 
> in the cluster. Originally, I had wondered whether we would use a 
> corosync/pacemaker cluster across the control plane with STONITH 
> resources configured for each node (a STONITH plugin for Ironic could 
> be written). This might be fine if a corosync/pacemaker stack is 
> already being used for HA of some components, but it seems like 
> overkill otherwise. The undercloud Heat could be in a good position to 
> restart the overcloud nodes. Is that the plan, or are there other 
> options being considered?
>
> Thanks,
>
> Tom
