[openstack-dev] [Heat] in-instance update hooks

Clint Byrum clint at fewbar.com
Tue Feb 11 10:38:17 UTC 2014


Excerpts from Thomas Spatzier's message of 2014-02-11 00:38:53 -0800:
> Hi Clint,
> 
> thanks for writing this down. This is a really interesting use case and
> feature, also in relation to what was recently discussed on rolling
> updates.
> 
> I have a couple of thoughts and questions:
> 
> 1) The overall idea seems clear to me but I have problems understanding the
> detailed flow and relation to template definitions and metadata. E.g. in
> addition to the examples you gave in the linked etherpad, where would the
> script or whatever sit that handles the update etc.
> 

At the risk of sounding curt and unfeeling: I really don't care how your
servers do their job or talk to Heat, as long as they use the API. :)

Heat is an orchestration tool, and putting script code or explicit tool
callouts in templates is obscene to me. I understand other people are
stuck with vanilla images, but I am not, so I have very little time to
spend thinking about how that is done.

In TripleO, os-collect-config periodically polls the metadata and runs
os-refresh-config when it changes. os-refresh-config just runs a bunch
of scripts. The scripts use os-apply-config to interpret the metadata
section, either through mustache templates or like this:

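ACTION=$(os-apply-config --key action.pending --key-default '' --type raw)
case "$ACTION" in
rebuild|delete)
  # Move the workload elsewhere, then notify the handle that we are done.
  migrate_to_something_else
  ping_the_handle "$(os-apply-config --key action.handle --type url)"
  ;;
*)
  # No pending action: nothing to do.
  ;;
esac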
  
We just bake all of this into our images.
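
To make the poll-and-run flow above concrete, here is a minimal sketch of
"run the scripts only when the metadata changes". It is not how
os-collect-config is implemented; poll_once, the file paths, and the use of
cksum are all assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: run a command only when a local copy of the metadata
# has changed since the last successful run. None of these names come from
# os-collect-config.

# poll_once METADATA_FILE STATE_FILE COMMAND [ARGS...]
poll_once() {
    md=$1
    state=$2
    shift 2

    # Checksum of the current metadata document.
    new_sum=$(cksum < "$md") || return 1
    # Checksum recorded by the last successful run (empty on first run).
    old_sum=$(cat "$state" 2>/dev/null || true)

    if [ "$new_sum" = "$old_sum" ]; then
        return 0    # unchanged: do nothing
    fi

    # Record the checksum only if the command succeeds, so that a failed
    # run is retried on the next poll.
    "$@" && printf '%s\n' "$new_sum" > "$state"
}
```

A real agent would call something like this in a loop with a sleep between
polls, passing the script runner as the command.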

> 2) I am not a big fan of CFN WaitConditions since they let too much
> programming shine through in a template. So I wonder whether this could be
> made more transparent to the template writer. The underlying mechanism
> could still be the same, but maybe we could make the template look cleaner.
> For example, what Steve Baker is doing for software orchestration also uses
> the underlying mechanisms but does not expose WaitConditions in templates.
>

Yeah we could allocate the handle transparently without much difficulty.
However, to be debuggable we'd have to make it available as an attribute
of the server if we don't have a resource directly queryable for it. Not
sure if Steve has done that but it is pretty important to be able to
compare the handle URL you see in the stack to the one you see on the
server.

> 3) Has the issue of how to express update policies on the rolling updates
> thread been resolved? I followed that thread, but it seems there has not
> been a final decision. The reason I am bringing this up is because I think
> this is related. You are suggesting to establish a new top-level section
> 'action_hooks' in a resource. Rendering this top-level in the resource is a
> good thing IMO. However, since this is related to updates in a way (you
> want to react to any kind of update event to the resource's state), I
> wonder if those hooks could be attributes of an update policy. UpdatePolicy
> in CFN is also a top-level section in a resource and they seem to provide a
> default one like the following (I am writing this in snake case as we would
> render it in HOT):
> 
> resources:
>   autoscaling_group1:
>     type: AWS::AutoScaling::AutoScalingGroup
>     properties:
>       # the properties ...
>     update_policy:
>       auto_scaling_rolling_update:
>         min_instances_in_service: 1
>         max_batch_size: 1
>         pause_time: PT12M5S
> 
> (I took this from the CFN user guide).
> I.e. an update policy already is a complex data structure, and we could
> define additional types that include the resource hooks definitions you
> need. ... I don't fully understand the connection between 'actions' and
> 'path' in your etherpad example yet, so cannot define a concrete example,
> but I hope you get what I wanted to express.
> 

The hooks also apply on stack-delete. Perhaps delete is just a special
update that replaces the template with '', but given that, update_policy
seems a bit off base as a home for them. The two features (rolling-updates
and update-hooks) seem related, but I think only because each would be
more useful with the other available.

> 4) What kind of additional metadata for the update events are you thinking
> about? For example, in case this is done in an update case with a batch
> size of > 1 (i.e. you update multiple members in a cluster at a time) -
> unless I put too much interpretation in here concerning the relation to
> rolling updates - you would probably want to tell the server a black list
> of servers to which it should not migrate workload, because they will be
> taken down as well.
> 

Agreed, for rolling updates we'd need some additional clues. For just
doing explicit servers, we can think a little more statically and just
do a bit shift of the workloads (first server sends to last, second to
first, etc.).
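
As a rough sketch of that static shift, assuming the servers are numbered
from 0 and every server knows the total count, the target is just the
previous index, wrapping around at the start. shift_target is a made-up
helper name, not something Heat or TripleO provides.

```shell
#!/bin/sh
# Hypothetical helper: print the index of the server that should receive
# the workload of server INDEX in a group of COUNT servers
# (0-based; first sends to last, second to first, and so on).

# shift_target INDEX COUNT
shift_target() {
    index=$1
    count=$2
    # Adding count before the modulo keeps the result non-negative.
    echo $(( (index - 1 + count) % count ))
}
```

With four servers, "shift_target 0 4" prints 3 and "shift_target 1 4"
prints 0. This only works while the target itself is staying up, which is
why batched rolling updates need the extra clues.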

> 
> As I said, just a couple of thoughts, and maybe for some I am just
> misunderstanding some details.
> Anyway, I would be interested in your view.
> 

Your thoughts are most appreciated!
