[openstack-dev] [Heat] [Horizon] Precursor to Phase 1 Convergence
nikunj.aggarwal at hp.com
Wed Jan 14 13:17:52 UTC 2015
Horizon also needs the retry option. It would solve a major problem described in this blueprint: https://blueprints.launchpad.net/horizon/+spec/relaunch-failed-stack
From: Patil, Anant (HP Converged Cloud R&D)
Sent: Wednesday, January 14, 2015 5:54 PM
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Precursor to Phase 1 Convergence
On 09-Jan-15 19:19, Zane Bitter wrote:
> On 09/01/15 01:07, Angus Salkeld wrote:
>> I am not in favor of the --continue as an API. I'd suggest responding
>> to resource timeouts and if there is no response from the task, then
>> re-start (continue) the task.
> Yeah, I am not in favour of a new API either. In fact, I believe we
> already have this functionality: if you do another update with the
> same template and parameters then it will break the lock and continue
> the update if the engine running the previous update has failed. And
> when we switch over to convergence it will still do the Right Thing
> without any extra implementation effort.
> There is one improvement we can make to the API though: in Juno, Ton
> added a PATCH method to stack update such that you can reuse the
> existing parameters without specifying them again. We should extend
> this to the template also, so you wouldn't have to supply any data to
> get Heat to start another update with the same template and parameters.
> I'm not sure if there is a blueprint for this already; co-ordinate
> with Ton if you are planning to work on it.
IMHO, there are two different things here:
1. Failures external to the Heat engine (out-of-band failures). We need a convenient way to issue a stack-update on a stack that failed because of an out-of-band problem. When a stack fails due to service unavailability or infrastructure issues, operators/admins can fix those issues and then restart the provisioning, or tell users to restart it.
Currently, this is done by issuing a stack-update on the failed stack. It would be convenient to have an option on the stack-update command that retries the stack operation without having to specify the template, parameters and environment again. Users shouldn't need to supply any data again to restart the update of a failed stack.
Folks working in Horizon would definitely need something like this.
The Horizon UI need not save a local copy of the template, parameters and environment supplied by the user; it can rely on Heat, which already has the data. It would be convenient for Horizon to issue a --retry for stack-create or stack-update when the stack fails due to external problems that the operators/users then fix.
2. Internal failures (the Heat engine itself failing). This is about continuing a stack operation even after a Heat engine fails due to an internal error. I think Vishnu is talking about this part. When an engine fails, other engines should be able to take over provisioning the stack without any user intervention. No new API or option to stack-create or stack-update is needed. Something like a periodic timer is needed to check whether the engine provisioning a stack is up. If it is not, the lock is stolen and the stack operation is restarted, maybe by issuing stack-update again with the same template and parameters. This is an interim solution ahead of the continuous observer: the stack timer would periodically check for stacks that are "stuck" because their engine failed, and issue another update (or something equivalent) so that another Heat engine proceeds. Or, as a first step, like Steve said, put the stack into the FAILED state and let the user initiate a stack-update (probably with the option described in point 1).
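To make the two points concrete, here is a rough sketch in Python. It is not Heat code: the Stack record, check_stuck_stacks() and retry() are names made up for illustration, and the set of live engines is assumed to come from some liveness check that is not shown. The periodic check either steals the lock for the current engine or, as the first step, just marks the stack FAILED; retry() then restarts a failed stack from the inputs that are already stored with it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Stack:
    # Hypothetical in-memory stand-in for a stored stack record.
    name: str
    template: dict
    parameters: dict
    environment: dict
    status: str = "IN_PROGRESS"
    engine_id: Optional[str] = None  # engine currently holding the stack lock


def check_stuck_stacks(stacks: Iterable[Stack], live_engines: Set[str],
                       this_engine: str, fail_only: bool = False) -> List[str]:
    """Periodic task: handle stacks whose owning engine is no longer alive."""
    handled = []
    for stack in stacks:
        if stack.status != "IN_PROGRESS" or stack.engine_id in live_engines:
            continue  # not in progress, or its engine is still running
        if fail_only:
            # First step: release the lock and leave the retry to the user.
            stack.engine_id = None
            stack.status = "FAILED"
        else:
            # Steal the lock; this engine continues the operation.
            stack.engine_id = this_engine
        handled.append(stack.name)
    return handled


def retry(stack: Stack, this_engine: str) -> dict:
    """Restart a FAILED stack from the inputs already stored with it."""
    if stack.status != "FAILED":
        raise ValueError("only a FAILED stack can be retried")
    stack.status = "IN_PROGRESS"
    stack.engine_id = this_engine
    # The caller supplies no template, parameters or environment.
    return {
        "template": stack.template,
        "parameters": stack.parameters,
        "environment": stack.environment,
    }
```

With fail_only=True the timer only surfaces the failure, and the retry from point 1 does the rest; with the default it behaves like the automatic takeover described in point 2.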
Please share your thoughts.