[openstack-dev] [Heat]Blueprint for retry function with idempotency in Heat

Steven Hardy shardy at redhat.com
Fri Oct 18 09:34:11 UTC 2013


On Fri, Oct 18, 2013 at 12:13:45PM +1300, Steve Baker wrote:
> On 10/18/2013 01:54 AM, Mitsuru Kanabuchi wrote:
> > Hello Mr. Clint,
> >
> > Thank you for your comment and prioritization.
> > I'm glad to discuss this with someone who feels the same issue.
> >
> >> I took the liberty of targeting your blueprint at icehouse. If you don't
> >> think it is likely to get done in icehouse, please raise that with us at
> >> the weekly meeting if you can and we can remove it from the list.
> > Basically, this blueprint is targeted at the Icehouse release.
> >
> > However, the schedule depends on the following blueprint:
> >   https://blueprints.launchpad.net/nova/+spec/idempotentcy-client-token
> >
> > We're going to start the implementation in Heat after ClientToken is implemented.
> > I think ClientToken is a necessary function for this blueprint, and an important function for other callers too!
> Can there not be a default retry implementation which deletes any
> ERRORed resource and attempts the operation again? Then specific
> resources can switch to ClientToken as they become available.

Yes, I think this is the way to go - have logic in every resource's
handle_update (which would probably be shared with check_create_complete)
that checks the status of the underlying physical resource and, if it's
not in the expected status, replaces it.
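A minimal sketch of that idea (not actual Heat code - the `client`, `get_status`, and `UpdateReplace` names here are illustrative assumptions, not the real Heat plugin API):

```python
class UpdateReplace(Exception):
    """Signal that the resource should be deleted and re-created."""


class FakeClient(object):
    """Stand-in for the client that queries the underlying service."""
    def __init__(self, status):
        self._status = status

    def get_status(self, physical_id):
        return self._status


class Resource(object):
    EXPECTED_STATUS = 'ACTIVE'

    def __init__(self, client, physical_id):
        self.client = client
        self.physical_id = physical_id

    def check_create_complete(self):
        # Shared status check: also reusable from handle_update.
        return self.client.get_status(self.physical_id) == self.EXPECTED_STATUS

    def handle_update(self, new_definition):
        # If the underlying physical resource is in a bad state
        # (e.g. ERROR), ask for replacement instead of updating in place.
        if not self.check_create_complete():
            raise UpdateReplace()
        # ... otherwise apply the normal in-place update ...
```

The key point is that the same status probe used to decide create completion doubles as the "is this resource healthy?" test during an update.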

This probably needs to be a new flag or API operation, as it clearly has
the possibility to be more destructive than a normal update (it may delete
resources which have not changed in the template but are in a bad state).

> > On Wed, 16 Oct 2013 23:32:22 -0700
> > Clint Byrum <clint at fewbar.com> wrote:
> >
> >> Excerpts from Mitsuru Kanabuchi's message of 2013-10-16 04:47:08 -0700:
> >>> Hi all,
> >>>
> >>> We proposed a blueprint that supports an API retry function with idempotency for Heat.
> >>> Please review the blueprint.
> >>>
> >>>   https://blueprints.launchpad.net/heat/+spec/support-retry-with-idempotency
> >>>
> >> This looks great. It addresses some of what I've struggled with while
> >> thinking of how to handle the retry problem.
> >>
> >> I went ahead and linked bug #1160052 to the blueprint, as it is one that
> >> I've been trying to get a solution for.
> >>
> >> I took the liberty of targeting your blueprint at icehouse. If you don't
> >> think it is likely to get done in icehouse, please raise that with us at
> >> the weekly meeting if you can and we can remove it from the list.
> >>
> >> Note that there is another related blueprint here:
> >>
> >> https://blueprints.launchpad.net/heat/+spec/retry-failed-update
> >>
> >>
> 
> Has any thought been given to where the policy should be specified for
> how many retries to attempt?
> 
> Maybe sensible defaults should be defined in the python resources, and a
> new resource attribute can allow an override in the template on a
> per-resource basis (I'm referring to an attribute at the same level as
> Type, Properties, Metadata)

IMO we don't want to go down the path of retry-loops in Heat, or scheduled
self-healing. We should just allow the user to trigger a stack update from
a failed state (CREATE_FAILED or UPDATE_FAILED), and then they can define
their own policy on when recovery happens by triggering a stack update.

This is basically what's described for discussion here:
http://summit.openstack.org/cfp/details/95

I personally think the scheduled self-healing is a bad idea, but the
convergence (as a special type of stack update) is a good one.

For automatic recovery, we should instead be looking at triggering things
via Ceilometer alarms, so we can move towards removing all periodic task
stuff from Heat (because it doesn't scale, and it presents major issues
when scaling out)
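The alarm-driven approach could be as simple as a webhook handler that re-triggers a stack update when an alarm fires - sketched below under assumptions: the payload shape is invented for illustration, and `update_stack` stands in for a call to the real Heat API (e.g. via python-heatclient):

```python
def handle_alarm(payload, update_stack):
    """React to an alarm notification by re-triggering a stack update.

    `payload` is an assumed alarm body like
    {'state': 'alarm', 'stack_id': '...'}; `update_stack` is whatever
    callable actually drives the Heat stack-update API.
    """
    if payload.get('state') != 'alarm':
        return False  # nothing to do unless the alarm actually fired
    update_stack(payload['stack_id'])
    return True
```

The point is that recovery is pulled by an external event rather than pushed by a periodic task inside heat-engine, which is what makes it scale out.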

Steve


