<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 23/04/14 04:42, Thomas Spatzier
wrote:<br>
</div>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
Hi all,

following up on Zane's request from end of last week, I wanted to kick off
some discussion on the ML around a design summit session proposal titled "
Next steps for Heat Software Orchestration". I guess there will be things
that can be sorted out this way and others that can be refined so we can
have a productive session in Atlanta. I am basically copying the complete
contents of the session proposal below so we can iterate on various points.
If it turns out that we need to split off threads, we can do that at a
later point.

The session proposal itself is here:
<a class="moz-txt-link-freetext" href="http://summit.openstack.org/cfp/details/306">http://summit.openstack.org/cfp/details/306</a>

And here are the details:

With the Icehouse release, Heat includes implementation for software
orchestration (Kudos to Steve Baker and Jun Jie Nan) which enables clean
separation of any kind of software configuration from compute instances and
thus enables a great new set of features. The implementation for software
orchestration in Icehouse has probably been the major chunk of work to
achieve a first end-to-end flow for software configuration thru scripts,
Chef or Puppet, but there is more work to be done to enable Heat for more
software orchestration use cases beyond the current support.

Below are a couple of use cases, and more importantly, thoughts on design
options of how those use cases can be addressed.

#1 Enable software components for full lifecycle:

With the current design, "software components" defined thru SoftwareConfig
resources allow for only one config (e.g. one script) to be specified.
Typically, however, a software component has a lifecycle that is hard to
express in a single script. For example, software must be installed
(created), there should be support for suspend/resume handling, and it
should be possible to allow for deletion-logic. This is also in line with
the general Heat resource lifecycle.

By means of the optional 'actions' property of SoftwareConfig it is
possible today to specify at which lifecycle action of a SoftwareDeployment
resource the single config hook shall be executed at runtime. However, for
modeling complete handling of a software component, this would require a
number of separate SoftwareConfig and SoftwareDeployment resources to be
defined which makes a template more verbose than it would have to be.

As an optimization, SoftwareConfig could allow for providing several hooks
to address all default lifecycle operations that would then be triggered
thru the respective lifecycle actions of a SoftwareDeployment resource.
Resulting SoftwareConfig definitions could then look like the one outlined
below. I think this would fit nicely into the overall Heat resource model
for actions beyond stack-create (suspend, resume, delete). Furthermore,
this will also enable a closer alignment and straight-forward mapping to
the TOSCA Simple Profile YAML work done at OASIS and the heat-translator
StackForge project.

So in a short, stripped-down version, SoftwareConfigs could look like

my_sw_config:
  type: OS::Heat::SoftwareConfig
  properties:
    create_config: # the hook for software install
    suspend_config: # hook for suspend action
    resume_config: # hook for resume action
    delete_config: # hook for delete action

When such a SoftwareConfig gets associated to a server via
SoftwareDeployment, the SoftwareDeployment resource lifecycle
implementation could trigger the respective hooks defined in SoftwareConfig
(if a hook is not defined, a no-op is performed). This way, all config
related to one piece of software is nicely defined in one place.
</pre>
</blockquote>
OS::Heat::SoftwareConfig itself needs to remain ignorant of Heat
lifecycle phases, since it is just a store of config.<br>
<br>
Currently there are 2 ways to build configs which are lifecycle
aware:<br>
1. have a config/deployment pair, each with different deployment
actions<br>
2. have a single config/deployment, and have the config script do
conditional logic on the derived input value deploy_action<br>
<br>
Option 2 seems reasonable for most cases, but having an option which
maps better to TOSCA would be nice.<br>
<br>
Clint's StructuredConfig example would get us most of the way there,
but a dedicated config resource might be easier to use. The
deployment resource could remain agnostic to the contents of this
resource though. The right place to handle this on the deployment
side would be in the orc script 55-heat-config, which could infer
whether the config was a lifecycle config, then invoke the required
config based on the value of deploy_action.
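For illustration, a lifecycle-aware config following option 2 might look roughly like the sketch below. The script body and paths are illustrative placeholders; deploy_action is the derived input mentioned above, exposed to the script hook as an environment variable, and my_server is assumed to be defined elsewhere in the template.

```yaml
lifecycle_config:
  type: OS::Heat::SoftwareConfig
  properties:
    group: script
    config: |
      #!/bin/sh
      # deploy_action is a derived input set by the deployment resource
      case "$deploy_action" in
        CREATE)  /opt/app/install.sh ;;
        SUSPEND) /opt/app/quiesce.sh ;;
        RESUME)  /opt/app/start.sh ;;
        DELETE)  /opt/app/uninstall.sh ;;
        *)       exit 0 ;;  # no-op for any other action
      esac

lifecycle_deployment:
  type: OS::Heat::SoftwareDeployment
  properties:
    config: {get_resource: lifecycle_config}
    server: {get_resource: my_server}
    actions: [CREATE, SUSPEND, RESUME, DELETE]
```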
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#2 Enable ad-hoc actions on software components:

Apart from basic resource lifecycle hooks, it would be desirable to allow
for invocation of ad-hoc actions on software. Examples would be the ad-hoc
creation of DB backups, application of patches, or creation of users for an
application. Such hooks (implemented as scripts, Chef recipes or Puppet
manifests) could be defined in the same way as basic lifecycle hooks. They
could be triggered by doing property updates on the respective
SoftwareDeployment resources (just a thought and to be discussed during
design sessions).

I think this item could help bridging over to some discussions raised by
the Murano team recently (my interpretation: being able to trigger actions
from workflows). It would add a small feature on top of the current
software orchestration in Heat and keep definitions in one place. And it
would allow triggering by something or somebody else (e.g. a workflow)
probably using existing APIs.
</pre>
</blockquote>
Let's park this for now. Maybe one day heat templates will be used to
represent workflow tasks, but this isn't directly related to
software config.<br>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3 address known limitations of Heat software orchestration:

As of today, there already are a couple of known limitations or points where
we have seen the need for additional discussion and design work. Below is a
collection of such issues.
Maybe some are already being worked on; others need more discussion.

#3.1 software deployment should run just once:

A bug has been raised because with today's implementation it can happen
that SoftwareDeployments get executed multiple times. There has been some
discussion around this issue but no final conclusion. An average user will,
however, assume that their automation runs exactly once. When using
existing scripts, it would be an additional burden to require rewrites to
cope with multiple invocations. Therefore, we should have a generic
solution so that users do not have to deal with this complex problem.
</pre>
</blockquote>
I'm with Clint on this one. Heat-engine cannot know the true state
of a server just by monitoring what has been polled and signaled.
Since it can't know, it would be dangerous for it to guess. Instead
it should just offer all known configuration data to the server and
allow the server to decide whether to execute a config again. I
still think one more derived input value would be useful to help
the server make that decision. This could either be a timestamp
for when the derived config was created, or a hash of all of the
derived config data.<br>
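As a sketch of how the server-side decision could look with such a hash: note that deploy_config_hash is a hypothetical derived input name (it does not exist today), and the marker path and configure script are illustrative.

```yaml
run_once_config:
  type: OS::Heat::SoftwareConfig
  properties:
    group: script
    config: |
      #!/bin/sh
      # deploy_config_hash is hypothetical: a hash over all derived
      # config data, injected by heat as a derived input
      marker="/var/lib/heat-config/applied-${deploy_config_hash}"
      if [ -e "$marker" ]; then
        exit 0  # this exact config was already applied; skip re-execution
      fi
      /opt/app/configure.sh  # placeholder for the real configuration work
      touch "$marker"
```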
<br>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3.2 dependency on heat-cfn-api:

Some parts of current signaling still depend on the heat-cfn-api. While
work seems underway to completely move to Heat native signaling, some
cleanup is needed to make sure native signaling is used throughout the code.
</pre>
</blockquote>
This is possible for signaling now, by setting signal_transport:
HEAT_SIGNAL on the deployment resource.<br>
<br>
Polling will be possible once this os-collect-config change lands
and is in a release:<br>
<a class="moz-txt-link-freetext" href="https://review.openstack.org/#/c/84269/">https://review.openstack.org/#/c/84269/</a><br>
Native polling is enabled by setting the server resource property
software_config_transport: POLL_SERVER_HEAT<br>
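Put together, the two native transports mentioned above would be set like this (a minimal sketch; all other server and deployment properties are omitted, and the resource names are illustrative):

```yaml
my_server:
  type: OS::Nova::Server
  properties:
    # image, flavor etc. omitted for brevity
    # poll deployment metadata via the native heat API
    software_config_transport: POLL_SERVER_HEAT

my_deployment:
  type: OS::Heat::SoftwareDeployment
  properties:
    server: {get_resource: my_server}
    config: {get_resource: my_config}
    # signal deployment results via the native heat API
    # instead of heat-cfn-api
    signal_transport: HEAT_SIGNAL
```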
<br>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3.3 connectivity of instances to heat engine API:

The current metadata and signaling framework has certain dependencies on
connectivity from VMs to the Heat engine API. With some network setups, and
in some customer environments, we hit limitations of access from VMs to the
management server. What can be done to enable additional network setups?
</pre>
</blockquote>
Some users want to run their servers in isolated neutron networks,
which means no polling or signaling to heat. To kick off the process
of finding a solution to this I proposed the following nova
blueprint:<br>
<a class="moz-txt-link-freetext" href="https://review.openstack.org/#/c/88703/">https://review.openstack.org/#/c/88703/</a><br>
The nova design session for this didn't make the cut, so I'm keen to
organize an ad-hoc session with anybody who is interested in this.
This deserves its own session since there are other non-heat
stakeholders who might like this too.<br>
<br>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3.4 number of created keystone users for deployments:

It has been pointed out that a large number of keystone users get created
for deployments, and concerns have been raised that this could be a problem
for large deployments.
</pre>
</blockquote>
I've not seen any evidence yet that the overhead of a user is
significant compared to the overhead of a nova server, or a heat
stack - as long as the heat keystone domain is backed by a keystone
db.<br>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3.5 support of server groups:

What would a clean model look like where software configs get deployed on
server groups instead of single servers? What are the recommended modeling
and semantics?
</pre>
</blockquote>
For scaling groups, the deployment resource should be defined with the
server resource in the scaled member's template, and the config resource
can be defined anywhere and shared (but might be best defined at the
same level as the scaling group resource).<br>
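As a sketch, the per-member template of a scaling group could take the shared config's ID as a parameter (the file name, parameter name, and resource names here are all illustrative):

```yaml
# scaled_member.yaml - instantiated once per scaling group member
parameters:
  shared_config:
    type: string
    description: ID of a SoftwareConfig shared by all group members

resources:
  member_server:
    type: OS::Nova::Server
    properties:
      # image, flavor etc. omitted for brevity
      user_data_format: SOFTWARE_CONFIG

  member_deployment:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: {get_param: shared_config}
      server: {get_resource: member_server}
```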
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3.6 handling of stack updates for software config:

Stack updates are not cleanly supported by the initial software
orchestration implementation. #1 above could address this issue, but do we
have to do something in addition?
</pre>
</blockquote>
Updates should work fine currently, but authors may prefer to
represent update workloads with a lifecycle config as described in #1.<br>
<br>
However we still have an issue when a server does a reboot or
rebuild on a stack update, since a nova reboot or rebuild does not
map to a heat lifecycle phase. This means we can't attach a
deployment resource to the shutdown action, so we can't trigger
quiescing config during reboots or rebuilds (note that quiescing
during a server DELETE should work). Clint had a look at this a
while back; we'll need to pick it up at some point.<br>
<br>
<blockquote
cite="mid:OF1BBC757F.B8A5636D-ONC1257CC2.0058B4CD-C1257CC2.005BC22B@de.ibm.com"
type="cite">
<pre wrap="">
#3.7 stack-abandon and stack-adopt for software-config:

Issues have been found for stack-abandon and stack-adopt with software
configs that need to be addressed. Can this be handled by additional hooks
as outlined under #1?
</pre>
</blockquote>
There is a problem with the way abandon and adopt are currently
implemented. Servers will continue to poll for metadata from the
abandoning heat, using abandoned credentials. There needs to be an
added phase in the abandon/adopt process where the metadata returned
by the abandoning heat includes the endpoints and credentials of
the adopting heat, so that the server can start polling for valid
metadata again.<br>
<br>
This is more of an abandon/adopt issue than a software-config one ;)
Maybe we can figure out the solution on the beer-track.<br>
<br>
<br>
For this design session I have my own list of items to discuss:<br>
#4.1 Maturing the puppet hook so it can invoke more existing puppet
scripts<br>
#4.2 Make progress on the chef hook, and defining the mapping from
chef concepts to heat config/inputs/outputs<br>
#4.3 Finding volunteers to write hooks for Salt and Ansible<br>
#5.1 Now that heatclient can include binary files, discuss enhancing
get_file to zip the directory contents if it is pointed at a
directory<br>
#5.2 Now that heatclient can include binary files, discuss making
stack create/update API calls multipart/form-data so that proper
mime data can be captured for attached files<br>
#6.1 Discuss options for where else metadata could be polled from
(e.g. swift)<br>
#6.2 Discuss whether #6.1 can lead to software-config that can work
on an OpenStack which doesn't allow admin users or keystone domains
(e.g. Rackspace)<br>
<br>
</body>
</html>