[openstack-dev] Unified Guest Agent proposal

Clint Byrum clint at fewbar.com
Mon Dec 9 18:19:31 UTC 2013

Excerpts from Steven Dake's message of 2013-12-09 09:41:06 -0800:
> On 12/09/2013 09:41 AM, David Boucha wrote:
> > On Sat, Dec 7, 2013 at 11:09 PM, Monty Taylor <mordred at inaugust.com> wrote:
> >
> >
> >
> >     On 12/08/2013 07:36 AM, Robert Collins wrote:
> >     > On 8 December 2013 17:23, Monty Taylor <mordred at inaugust.com> wrote:
> >     >>
> >     >
> >     >> I suggested salt because we could very easily make trove and savanna
> >     >> into salt masters (if we wanted to) just by having them import the salt
> >     >> library and run an api call. When they spin up nodes using heat, we
> >     >> could easily have that do the cert exchange - and the admins of the
> >     >> site need not know _anything_ about salt, puppet or chef - only about
> >     >> trove or savanna.
> >     >
> >     > Are salt masters multi-master / HA safe?
> >     >
> >     > E.g. if I've deployed 5 savanna API servers to handle load, and they
> >     > all do this 'just import', does that work?
> >     >
> >     > If not, and we have to have one special one, what happens when it
> >     > fails / is redeployed?
> >
> >     Yes. You can have multiple salt masters.
> >
> >     > Can salt minions affect each other? Could one pretend to be a master,
> >     > or snoop requests/responses to another minion?
> >
> >     Yes and no. By default no - and this is protected by key encryption and
> >     whatnot. They can affect each other if you choose to explicitly grant
> >     them the ability to. That is - you can give a minion an acl to allow it
> >     to inject specific command requests back up into the master. We use this
> >     in the infra systems to let a jenkins slave send a signal to our salt
> >     system to trigger a puppet run. That's all that slave can do though -
> >     send the signal that the puppet run needs to happen.
> >
> >     However - I don't think we'd really want to use that in this case, so I
> >     think the answer you're looking for is no.
> >
> >     > Is salt limited: is it possible to assert that we *cannot* run
> >     > arbitrary code over salt?
> >
> >     In as much as it is possible to assert that about any piece of software
> >     (bugs, of course, blah blah). But the messages that salt sends to a
> >     minion are "run this thing that you have a local definition for" rather
> >     than "here, have some python and run it".
> >
> >     Monty
> >
> >
> >
> > Salt was originally designed to be a unified agent for a system like 
> > openstack. In fact, many people use it for this purpose right now.
> >
> > I discussed this with our team management and this is something 
> > SaltStack wants to support.
> >
> > Are there any specific things that the salt minion lacks right now to 
> > support this use case?
> >
> David,
> If I am parsing the salt nomenclature correctly, Salt provides a 
> Master (e.g. a server) and minions (e.g. agents that connect to the salt 
> server).  The salt server tells the minions what to do.
> This is not desirable for a unified agent (at least in the case of Heat).
> The bar is very very very high for introducing new *mandatory* *server* 
> dependencies into OpenStack.  Requiring a salt master (or a puppet 
> master, etc) in my view is a non-starter for a unified guest agent 
> proposal.  Now if a heat user wants to use puppet, and can provide a 
> puppet master in their cloud environment, that is fine, as long as it is 
> optional.

What if we taught Heat to speak salt-master-ese? AFAIK it is basically
an RPC system. I think right now it is 0mq, so it would be relatively
straightforward to just have Heat start talking to the agents in 0mq.

> A guest agent should have the following properties:
> * minimal library dependency chain
> * no third-party server dependencies
> * packaged in relevant cloudy distributions

That last one only matters if the distributions won't add things like
agents to their images post-release. I am pretty sure "work well in
OpenStack" is important for server distributions and thus this is at
least something we don't have to freak out about too much.

> In terms of features:
> * run shell commands
> * install files (with selinux properties as well)
> * create users and groups (with selinux properties as well)
> * install packages via yum, apt-get, rpm, pypi
> * start and enable system services for systemd or sysvinit
> * Install and unpack source tarballs
> * run scripts
> * Allow grouping, selection, and ordering of all of the above operations

All of those things are general purpose low level system configuration
features. None of them will be needed for Trove or Savanna. They need
to do higher level things like run a Hadoop job or create a MySQL user.

> Agents are a huge pain to maintain and package.  It took a huge amount 
> of willpower to get cloud-init standardized across the various 
> distributions.  We have managed to get heat-cfntools (the heat agent) 
> into every distribution at this point and this was a significant amount 
> of work.  We don't want to keep repeating this process for each 
> OpenStack project!

cloud-init is special because it _must_ be in the image to be relevant.
It is the early-boot enabler of all other agents. The effort that happened
there shouldn't ever have to happen again as long as we are good about
always noting when something is an early-boot feature versus an ongoing
management feature.

Even with it packaged, we still can only use the lowest common denominator
among popular distributions until the new feature arrives there, so it is
really not a great idea to base anything forward-thinking on cloud-init
features. What I expect to generally happen is when cloud-init grows
a feature, whole swathes of specialized code will be deprecated and
replaced with small snippets of declarations for cloud-init. But the
old code will have to keep working until the old cloud-init dies.

What we're talking about is an agent that enables other workloads. Heat's
general case is different from Savanna and Trove's.  Both Savanna and
Trove build custom images and just spin those up precisely because it
is not efficient to use a general purpose distribution to do something
narrowly focused like host a database or map/reduce software.

However, there is enough overlap in what Heat wants to do that it makes
sense that we should try to use the same agent for Heat users if

The complication in this discussion I think comes from the fact that we
really have three distinct goals:

a. Enable communication in and out of private networks.
b. Enable high level control of in-instance software.
c. Light-weight general purpose post-boot customization.

I think for (a) we have neutron-metadata-agent which could be enhanced
to be more general instead of just forwarding EC2 metadata.

For (b) salt-minion with Trove/Savanna specific plugins would fit the
bill as long as we can teach Heat to speak salt-master. This does not
change the dependency graph since Trove and Savanna will already be
deferring infrastructure management to Heat.
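
To make (b) a little more concrete, here is a rough sketch of the
"run this thing that you have a local definition for" model applied to
Trove/Savanna style operations. It is not salt's actual plugin API or
wire format. The operation names, the JSON request shape and the
handle() function are all made up for illustration, and the transport
(0mq or otherwise) is left out entirely.

```python
import json

# Hypothetical registry mapping an operation name to a locally defined
# handler. Nothing here is salt's real API; it only illustrates the idea.
HANDLERS = {}


def operation(name):
    """Register a function as a named, locally defined operation."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@operation("mysql.create_user")
def create_mysql_user(username, host="%"):
    # A real plugin would talk to the database; this only reports the intent.
    return {"created": f"{username}@{host}"}


@operation("hadoop.run_job")
def run_hadoop_job(jar, main_class):
    # A real plugin would submit the job; this only reports the intent.
    return {"submitted": f"{main_class} from {jar}"}


def handle(raw_request):
    """Dispatch one JSON request to a registered handler.

    Requests naming an operation with no local definition are refused,
    so the caller cannot ship arbitrary code to the agent.
    """
    request = json.loads(raw_request)
    handler = HANDLERS.get(request.get("op"))
    if handler is None:
        return {"ok": False, "error": "unknown operation"}
    try:
        result = handler(**request.get("args", {}))
    except TypeError as exc:
        # Typically raised for missing or unexpected arguments.
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "result": result}


if __name__ == "__main__":
    print(handle(json.dumps(
        {"op": "mysql.create_user", "args": {"username": "app"}})))
    print(handle(json.dumps(
        {"op": "shell.exec", "args": {"cmd": "echo hi"}})))
```

The point of the registry is that the agent only ever exposes high level
verbs; a general purpose "run shell command" verb would simply not be
registered in a Trove or Savanna image.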

For (c) I have always liked cfn-init for light-weight post-boot
customization and I think it should just be kept around and taught
to speak the protocol we validate in (b). However, if (b) is in fact
salt's protocol, it would make sense to just make it easy to manage
salt and let salt's default rules take over. Since we've also decided
to make puppet and chef rules easy to specify in HOT, this fits nicely
with current plans.
