[openstack-dev] [Heat] Precursor to Phase 1 Convergence

Murugan, Visnusaran visnusaran.murugan at hp.com
Fri Jan 9 05:22:55 UTC 2015


My reasoning for having a “--continue”-like command was to run it as a periodic task, as a substitute for the continuous observer for now.

A “--continue”-based command should work on the realized vs. actual graph and issue a stack update.

I completely agree that user action should not be needed to realize a partially completed stack.

Your thoughts?

From: vishnu [mailto:ckmvishnu at gmail.com]
Sent: Friday, January 9, 2015 10:08 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Heat] Precursor to Phase 1 Convergence


Auto recovery is the plan. Engine failure should be detected by way of a heartbeat; in a single-engine scenario, the partially realised stack should instead be recovered on engine startup.
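
As a rough illustration of the two detection paths (a heartbeat timeout when there are several engines, a startup scan when there is only one), here is a minimal Python sketch. The names dead_engines, stacks_to_recover and HEARTBEAT_TIMEOUT, the timeout value, and the dict layout of a stack record are all hypothetical; none of this is Heat's actual code or schema.

```python
HEARTBEAT_TIMEOUT = 30.0  # seconds; hypothetical value chosen for the example


def dead_engines(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return the ids of engines whose last heartbeat is older than `timeout`.

    `last_heartbeat` maps engine id -> timestamp of its most recent heartbeat.
    """
    return sorted(
        engine_id
        for engine_id, seen in last_heartbeat.items()
        if now - seen > timeout
    )


def stacks_to_recover(stacks, my_engine_id, dead, on_startup=False):
    """Return names of in-progress stacks whose owner can no longer finish them.

    A stack qualifies if its owning engine is in `dead`, or if this engine is
    just starting up and finds work it owned before it went down (the
    single-engine case).
    """
    orphaned = []
    for stack in stacks:
        if stack["status"] != "IN_PROGRESS":
            continue
        owner = stack["engine_id"]
        if owner in dead or (on_startup and owner == my_engine_id):
            orphaned.append(stack["name"])
    return orphaned
```

Either path ends with the same list of stacks to resume, so the recovery step itself would not need to know how the failure was noticed.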

"--continue" command was just an additional helper API.


Visnusaran Murugan

On Thu, Jan 8, 2015 at 11:29 PM, Steven Hardy <shardy at redhat.com> wrote:
On Thu, Jan 08, 2015 at 09:53:02PM +0530, vishnu wrote:
>    Hi Zane,
>    I was wondering if we could push changes relating to backup stack removal
>    and to not load resources as part of stack. There needs to be a capability
>    to restart jobs left over by dead engines.
>    Something like heat stack-operation --continue [git rebase --continue]

To me, it's pointless if the user has to restart the operation; they can do
that already, e.g. by triggering a stack update after a failed stack create.

The process needs to be automatic IMO: if one engine dies, another engine
should detect that it needs to steal the lock or whatever and continue
whatever was in-progress.
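
One way to picture "steal the lock and continue" is a compare-and-swap on the lock owner, so that only one surviving engine wins the takeover. The sketch below is an assumption for illustration only: StackLockTable, take_over and the in-memory dict stand in for a shared database table and are not Heat's API.

```python
import threading


class StackLockTable:
    """In-memory stand-in for a shared lock table (hypothetical)."""

    def __init__(self):
        self._owners = {}                # stack_id -> engine_id holding the lock
        self._mutex = threading.Lock()   # makes each operation atomic

    def acquire(self, stack_id, engine_id):
        """Take the lock if it is free; return whoever owns it afterwards."""
        with self._mutex:
            return self._owners.setdefault(stack_id, engine_id)

    def steal(self, stack_id, expected_owner, new_owner):
        """Compare-and-swap: succeed only if `expected_owner` still holds it."""
        with self._mutex:
            if self._owners.get(stack_id) != expected_owner:
                return False
            self._owners[stack_id] = new_owner
            return True


def take_over(locks, stack_id, dead_engine, my_engine, resume):
    """Steal the lock from a dead engine and resume the in-progress work.

    `resume` is a caller-supplied callback; it runs only if the steal won.
    """
    if locks.steal(stack_id, dead_engine, my_engine):
        resume(stack_id)
        return True
    return False
```

If two engines notice the same dead peer, both call take_over, but the second steal fails because the owner is no longer the dead engine, so the operation is resumed exactly once.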

>    Had a chat with shardy regarding this. IMO this would be a valuable
>    enhancement. Notification based lead sharing can be taken up upon
>    completion.

I was referring to a capability for the service to transparently recover
if, for example, a heat-engine is restarted during a service upgrade.

Currently, users will be impacted in this situation, and making them
manually restart failed operations doesn't seem like a super-great solution
to me (like I said, they can already do that to some extent).


