[Openstack] Heat - engine high-availability and stack management

Marica Antonacci marica.antonacci at gmail.com
Wed Jul 23 12:38:31 UTC 2014


Dear all, 

I would like to ask for some insight into how Heat handles stack state changes, in particular what happens if the "heat-engine" daemon crashes while a stack is being created.

I am currently implementing high availability for the Heat services, using load balancing (haproxy) for the API processes (heat-api and heat-api-cfn) and the pacemaker/corosync resource manager for the heat-engine process. I have tested both active/passive and active/active configurations of the Heat engine (the API services and engines run on 2 different nodes and share the same MySQL and RabbitMQ backends) and found that if the engine instance that started creating a stack crashes before completion, the stack is never finalized (it remains in “In Progress” status).

In other words, the two heat-engine instances can operate only on the stack as a whole (a stack can be created by engine-1 and deleted by engine-2). The stack seems to be handled as an “atomic transaction”, so the multiple-engines feature cannot be used to make stack creation fully highly available: a single action inside the creation workflow cannot be handed off from one engine instance to another.
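
To make the behaviour concrete, here is a minimal toy model in Python. It is not Heat code: the Database and Engine classes, their methods, and the resource names are all hypothetical, and the model simply assumes (as my tests suggest, but I have not confirmed in the source) that the progress of a running workflow is known only to the engine process that started it, while the overall stack status lives in the shared database.

```python
# Toy model only: none of these classes or methods exist in Heat.

class Database:
    """Shared state visible to every engine (stands in for the MySQL backend)."""

    def __init__(self):
        self.stack_status = {}  # stack name -> status string


class Engine:
    """Hypothetical engine that keeps workflow progress in local memory."""

    def __init__(self, name, db):
        self.name = name
        self.db = db
        self.alive = True
        self.in_flight = {}  # stack name -> resources still to be created

    def create_stack(self, stack, resources):
        # Only the overall status is written to the shared database.
        self.db.stack_status[stack] = "CREATE_IN_PROGRESS"
        self.in_flight[stack] = list(resources)

    def step(self, stack):
        """Create one pending resource; return False if this engine cannot."""
        if not self.alive or stack not in self.in_flight:
            return False
        pending = self.in_flight[stack]
        pending.pop(0)
        if not pending:
            del self.in_flight[stack]
            self.db.stack_status[stack] = "CREATE_COMPLETE"
        return True

    def crash(self):
        # The in-memory progress disappears together with the process.
        self.alive = False
        self.in_flight.clear()

    def delete_stack(self, stack):
        # Whole-stack operations need only the shared database.
        if not self.alive:
            raise RuntimeError(self.name + " is down")
        self.db.stack_status.pop(stack, None)


def demo():
    db = Database()
    engine_1 = Engine("engine-1", db)
    engine_2 = Engine("engine-2", db)

    engine_1.create_stack("mystack", ["network", "server", "volume"])
    engine_1.step("mystack")  # one resource created
    engine_1.crash()          # engine-1 dies before completion

    resumed = engine_2.step("mystack")  # engine-2 knows nothing of the workflow
    status_after_crash = db.stack_status["mystack"]

    engine_2.delete_stack("mystack")    # whole-stack delete still works
    still_present = "mystack" in db.stack_status
    return resumed, status_after_crash, still_present


if __name__ == "__main__":
    print(demo())
```

In this sketch engine-2 cannot continue the creation started by engine-1, so the stack stays in progress indefinitely, yet engine-2 can still delete it as a whole. That matches what I observe in both the active/passive and the active/active setups.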

I have seen that there are some blueprints (e.g. https://review.openstack.org/#/c/96404/3/specs/convergence-engine.rst) addressing this issue: does anyone know the current state of their implementation? Will they be ready for Juno or (hopefully) earlier?

Thanks in advance 
Best,
Marica 
      
