[openstack-dev] [Heat] [TripleO] Rolling updates spec re-written. RFC

Clint Byrum clint at fewbar.com
Mon Feb 3 19:51:05 UTC 2014


Excerpts from Robert Collins's message of 2014-02-03 10:47:06 -0800:
> Quick thoughts:
> 
>  - I'd like to be able to express a minimum service percentage: e.g. I
> know I need 80% of my capacity available at any one time, so an
> additional constraint to the unit counts is to stay below 20% down at
> a time (and this implies that if 20% have failed, either stop or spin
> up more nodes before continuing).
> 

Right, will add that.

One thing though: all failures lead to rollback. I put that in the
'Unresolved issues' section. Continuing a group operation with any
failures is an entirely different change to Heat. We have a few choices,
from a whole re-thinking of how we handle failures, to just a special
type of resource group that tolerates failure percentages.

> The wait condition stuff seems to be conflated with the 'graceful
> operations' stuff we discussed briefly at the summit, which in my head
> at least is an entirely different thing - it's per node rather than
> per group. If done separately that might make each feature
> substantially easier to reason about.

Agreed. I think something more generic than an actual Heat wait condition
would make more sense. Perhaps even returning all of the active scheduler
tasks which the update must wait on would make sense. Then in the
"graceful update" version we can just make the dynamically created wait
conditions depend on the update pattern, which would have the same effect.

With the "maximum out of service" addition, we'll also need to make sure
that upon the "must wait for these" things completing we evaluate state
again before letting the update proceed.
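
To make that re-evaluation concrete, here is a rough Python sketch. It is
not Heat code; the function name and all of its parameters are made up for
illustration, and it assumes the constraint is expressed as a percentage of
the group's total size.

```python
def next_batch_size(total, min_in_service_pct, currently_down, max_batch):
    """Return how many more members may be taken out of service right now.

    Hypothetical helper, not part of Heat. All arguments are integers.
    """
    # Members that must stay up to honor the minimum service percentage
    # (integer ceiling of total * pct / 100).
    must_stay_up = (total * min_in_service_pct + 99) // 100
    # Budget of members allowed to be down at the same time.
    allowed_down = total - must_stay_up
    # Failed or still-updating members already consume part of that budget.
    headroom = allowed_down - currently_down
    # Never exceed the configured batch size, never go negative.
    return max(0, min(max_batch, headroom))
```

The idea is to call this again with fresh counts each time the "must wait
for these" things complete. With 10 members and an 80% minimum, a result of
0 when 2 are already down is the "either stop or spin up more nodes" case
Robert describes.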


