[openstack-dev] [Heat] Design summit preparation - Next steps for Heat Software Orchestration

Thomas Spatzier thomas.spatzier at de.ibm.com
Thu Apr 24 18:07:03 UTC 2014


Hi Clint,

Thanks for the comments. Those are some really good points. I added some
more thoughts from my side below.

Excerpts from Clint Byrum's message on 24/04/2014 17:53:40:

<snip>
> > So in a short, stripped-down version, SoftwareConfigs could look like
> >
> > my_sw_config:
> >   type: OS::Heat::SoftwareConfig
> >   properties:
> >     create_config: # the hook for software install
> >     suspend_config: # hook for suspend action
> >     resume_config: # hook for resume action
> >     delete_config: # hook for delete action
> >
>
> First off, modeling more actions is definitely on the near-term need
> list for TripleO. We need to model rebuild and replace as an action too,
> so that we can evacuate/migrate work loads off of compute nodes.
>
> I think you can prototype action handling already with StructuredConfig,
> which allows you to define your own structure underneath config:.
> So you'd just simply do this:
>
> my_sw_config:
>   type: OS::Heat::StructuredConfig
>   properties:
>     config:
>       create_config: |
>         #!/bin/bash
>         execute_stuff
>       delete_config: |
>         #!/bin/bash
>         execute_delete_stuff

Interesting. I had not thought of StructuredConfig in that way yet. So the
in-instance tool would have to be implemented so that it executes the
create_config when the corresponding StructuredDeployment resource is in
CREATE_IN_PROGRESS, and the delete_config when it is in DELETE_IN_PROGRESS.
Does the tag-teaming between StructuredDeployment and the in-instance tool
work this way today?
I would still slightly prefer a more explicit, top-level definition of
well-defined config hooks with well-defined semantics (i.e. at which
lifecycle step they get executed), though, as opposed to a "free-form" map.
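
A rough sketch of the dispatch described above, assuming the in-instance
tool can see the deployment status string and the structured config as a
plain map. The function name and data layout are hypothetical and do not
reflect how the existing tools work today:

```python
# Hypothetical sketch: names and data layout are assumptions, not the
# actual Heat or in-instance tool API.

def select_hook(config, deployment_status):
    """Return the script for the deployment's current action, or None."""
    # "CREATE_IN_PROGRESS" -> action "CREATE", phase "IN_PROGRESS"
    action, _, phase = deployment_status.partition("_")
    if phase != "IN_PROGRESS":
        return None  # nothing to run unless an action is in progress
    # Map the action onto the "<action>_config" key of the free-form map.
    return config.get(action.lower() + "_config")


config = {
    "create_config": "#!/bin/bash\nexecute_stuff\n",
    "delete_config": "#!/bin/bash\nexecute_delete_stuff\n",
}
```

With this map, select_hook(config, "DELETE_IN_PROGRESS") yields the delete
script, while an action without a matching key (e.g. SUSPEND) yields None.
The "<action>_config" convention lives only in the tool here, which is
exactly the implicit contract an explicit top-level hook definition would
make visible.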

>
> Then your in-instance tools would simply write these as executables and
> run them when the action dictates. Suspend and resume aren't currently
> exposed, but I think that would be a fairly trivial patch to enable.
>
> Note that IMO we should _not_ encourage users to embed executables in
> templates. In addition to muddying the waters between orchestration
> and configuration management, it is also a very poor model for long
> term source code control. You lose things like renames and per-file
> commit history. I think users will be much better served by other tools
> bundling software together with a template, and I've always figured that
> something like Murano or Solum would do that. So that you'd instead have
> a template like this:
>
>
> my_sw_config:
>   type: OS::Heat::StructuredConfig
>   properties:
>     config:
>       create_config: {get_file: create_hook.sh}
>       delete_config: {get_file: delete_hook.sh}

+1 on that! Actually, I posted a wordpress sample yesterday which does
exactly this:

https://review.openstack.org/#/c/89885/

>
> And if you're image-based, then you'd instead just bundle the image
> creation definition and not even need to include instructions on code
> injection in your template.
>
> > When such a SoftwareConfig gets associated to a server via
> > SoftwareDeployment, the SoftwareDeployment resource lifecycle
> > implementation could trigger the respective hooks defined in
> > SoftwareConfig (if a hook is not defined, a no-op is performed). This
> > way, all config related to one piece of software is nicely defined in
> > one place.
> >
>
> I don't believe "nicely defined in one place" is the right way to say
> this. I would call it an unmanageable chunk of yaml.

IMHO, an "unmanageable chunk of yaml" would be if the scripting is inlined.
If the scripts get referenced like mentioned above (I think we both agree
on that), it looks pretty neat to me.

>
> One point, there is no "no-op". Heat's software config has an interface,
> and in-instance tools do the work.  I want to make sure we always stay
> true to that. I don't want Heat to assume anything about what happens
> inside instances. Heat exposes configs, and optionally waits for a
> signal. Nothing more.

Ok, maybe "no-op" was not the right term, or maybe I over-simplified it. So
what I meant was:
If a SoftwareConfig defined a 'create_config', the SoftwareDeployment would
set its state accordingly to CREATE_IN_PROGRESS and then wait for a signal
from the in-instance tool. The in-instance tool would do the actual work and
signal the SoftwareDeployment, which goes to CREATE_COMPLETE, and so on. Same
for other operations (delete, suspend, ...).
If a SoftwareConfig did not define a certain config - let's take
'suspend_config' now - the SoftwareDeployment (which can check the content
of the associated SoftwareConfig) can just complete the task directly, i.e.
for SUSPEND directly go to SUSPEND_COMPLETE and avoid the whole round trip
of waiting for a signal from the in-instance tool (which found out it does
not have to do anything).
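
The state handling described above could look roughly like the following.
This is only a sketch of the proposed shortcut under the assumption that the
config is available as a map of "<action>_config" entries; next_state and
on_signal are hypothetical names, not Heat's SoftwareDeployment code:

```python
# Hypothetical sketch of the proposed shortcut; not Heat's implementation.

def next_state(config, action):
    """State the deployment enters when `action` starts.

    If the config defines a hook for the action, wait for the in-instance
    tool's signal; otherwise complete right away and skip the round trip.
    """
    hook = config.get(action.lower() + "_config")
    if hook:
        return action + "_IN_PROGRESS"
    return action + "_COMPLETE"


def on_signal(state, success=True):
    """Transition applied when the in-instance tool signals back."""
    action, _, phase = state.partition("_")
    if phase != "IN_PROGRESS":
        return state  # ignore signals when nothing is pending
    return action + ("_COMPLETE" if success else "_FAILED")
```

For a config that only defines create_config, CREATE goes to
CREATE_IN_PROGRESS and waits for the signal, while SUSPEND goes straight to
SUSPEND_COMPLETE.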

>
> >
> > #2 Enable ad-hoc actions on software components:
> > Apart from basic resource lifecycle hooks, it would be desirable to
> > allow for invocation of ad-hoc actions on software. Examples would be
> > the ad-hoc creation of DB backups, application of patches, or creation
> > of users for an application. Such hooks (implemented as scripts, Chef
> > recipes or Puppet facts) could be defined in the same way as basic
> > lifecycle hooks. They could be triggered by doing property updates on
> > the respective SoftwareDeployment resources (just a thought and to be
> > discussed during design sessions).
> > I think this item could help bridge over to some discussions raised
> > by the Murano team recently (my interpretation: being able to trigger
> > actions from workflows). It would add a small feature on top of the
> > current software orchestration in Heat and keep definitions in one
> > place. And it would allow triggering by something or somebody else
> > (e.g. a workflow), probably using existing APIs.
> >
>
> I'm a bit skeptical about anything to add to Heat that doesn't actually
> change the end-goal as expressed by the user. What I mean is, backups
> could certainly mean defining new volumes or swift containers and telling
> servers to put data in them, but it isn't clear that it doesn't also
> just mean copying all your data to a backup server already defined.

Yeah, point well-taken. I guess this needs discussion.

>
> The distinction is that if we are careful to always defer to workflow
> when workflow is required, then both cases have very clear points where
> you change the workflow (the part that copies the data) or where you
> change the orchestration (the part that manages the resources). And in
> both cases, you always trigger the workflow to start a backup, not the
> orchestration.
>
> So, I'm interested to see where this goes, but I would like to make sure
> that Heat stays out of the workflow business and instead continues to
> provide clear integration points for workflow.

Fully agree. And finding the right integration point is the key for this
discussion.

>
<snip>
> > #3.1 software deployment should run just once:
> > A bug has been raised because with today's implementation it can happen
> > that SoftwareDeployments get executed multiple times. There has been
> > some discussion around this issue but no final conclusion. An average
> > user will however assume that his automation gets run only or exactly
> > once. When using existing scripts, it would be an additional burden to
> > require rewrites to cope with multiple invocations. Therefore, we
> > should have a generic solution to the problem so that users do not have
> > to deal with this complex problem.
> >
>
> This is entirely solvable by in-instance tools. Heat doesn't need to do
> anything to support this. And if we spend time on it, we'll just get in
> the way half the time. I mean, do we really need an API to avoid flag
> files?

I have the same feeling. The Heat engine should just care about the state
of the resources (i.e. if a SoftwareDeployment indicated success once, it
would be in CREATE_COMPLETE), and the in-instance tool should be smart
enough to not execute the same deployment twice. Maybe this is just a tweak
to the invocation hook that Steve has implemented today?

>
> [ -e /var/lib/mystuff/already.ran ] && exit 0
> # code here
> touch /var/lib/mystuff/already.ran
>
> For more sophisticated idempotency, I suggest using one of the more
> sophisticated tools, such as puppet, chef, salt, cfengine.. etc.

Yes, agreed. I am not asking for Heat to do magic that others can do better.
I am really just concerned with providing intuitive and consistent behavior
to people who just want to use their existing scripts.
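
For existing scripts, the run-once behavior could be handled generically by
a small wrapper in the in-instance tool, along the lines of the flag-file
approach quoted above. This is a minimal sketch only: run_once, the marker
naming and the state directory are assumptions, not part of any existing
hook:

```python
import os
import subprocess


def run_once(deployment_id, command, state_dir):
    """Run `command` unless a marker for `deployment_id` already exists.

    Returns True if the command ran, False if it was skipped.
    """
    os.makedirs(state_dir, exist_ok=True)
    marker = os.path.join(state_dir, deployment_id + ".ran")
    if os.path.exists(marker):
        return False  # this deployment already succeeded once
    # check=True raises on a non-zero exit, so no marker is written and
    # a failed deployment can be retried.
    subprocess.run(command, check=True)
    with open(marker, "w"):
        pass  # an empty file is enough as a flag
    return True
```

Keying the marker on the deployment rather than on the script means an
unmodified user script is run exactly once per deployment, without the
script itself having to deal with repeated invocations.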

>
<snip>
> > #3.3 connectivity of instances to heat engine API:
> > The current metadata and signaling framework has certain dependencies
> > on connectivity from VMs to the Heat engine API. With some network
> > setups, and in some customer environments we hit limitations of access
> > from VMs to the management server. What can be done to enable
> > additional network setups?
> >
>
> I think our friends in Marconi, Sahara and Trove will be interested in
> this discussion. I have some ideas, but no time to drive them, and that
> is a bit out of scope for this email.

Great, I'm interested in this discussion. Maybe something to have at the
summit.

>
<snip>
> > #3.6 handling of stack updates for software config:
> > Stack updates are not cleanly supported with the initial software
> > orchestration implementation. #1 above could address this issue, but do
> > we have to do something in addition?
> >
>
> Sorry I'm not sure I follow. Updates work fine today. What isn't clean
> about them?

This relates to what was discussed above. SoftwareConfig must allow a user
to provide hooks for the suspend and resume events, and the in-instance tool
must handle them. Today, there is not really a clear way for a template
author to express this in the template, i.e. there is no well-defined
'suspend_config' hook (see also comments earlier).
Implementation-wise, I think there might not be too much to be done here.

>
> > #3.7 stack-abandon and stack-adopt for software-config:
> > Issues have been found for stack-abandon and stack-adopt with software
> > configs that need to be addressed. Can this be handled by additional
> > hooks as outlined under #1?
> >
>
> As defined, abandon and adopt are supposed to be "hands-off" operations.
> They're for mucking with Heat's internal model when it is inconsistent
> with reality. So I'm not sure we want to add _anything_ to the template
> language that acknowledges they even exist, or they'll be denied their
> super powers that allow them to operate in the shadows.

I talked to nanjj who found issues with abandon/adopt scenarios that
included software config. To be honest, I have to rely on him to provide
the details. I just did the collection of issues :-/

>



