[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Clint Byrum clint at fewbar.com
Fri Jan 24 18:36:08 UTC 2014

Excerpts from Justin Santa Barbara's message of 2014-01-24 07:43:23 -0800:
> Good points - thank you.  For arbitrary operations, I agree that it would
> be better to expose a token in the metadata service, rather than allowing
> the metadata service to expose unbounded amounts of API functionality.  We
> should therefore also have a per-instance token in the metadata, though I
> don't see Keystone getting the prerequisite IAM-level functionality for
> two+ releases (?).

The Heat team has been working for a while on per-instance limited access
in Keystone. A Keystone trust might work just fine for what you want.

> However, I think I can justify peer discovery as the 'one exception'.
> Here's why: discovery of peers is widely used for self-configuring
> clustered services, including those built in pre-cloud days.
> Multicast/broadcast used to be the solution, but cloud broke that.  The
> cloud is supposed to be about distributed systems, yet we broke the primary
> way distributed systems do peer discovery. Today's workarounds are pretty
> terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with
> the instance (tolerable now with IAM, but painful to configure).  We're not
> talking about allowing instances to program the architecture (e.g. attach
> volumes etc), but rather just to do the equivalent of a multicast for
> discovery.  In other words, we're restoring some functionality we took away
> (discovery via multicast) rather than adding programmable-infrastructure
> cloud functionality.

Are you hesitant to just use Heat? This is exactly what it is supposed
to do: make a bunch of API calls and expose the results to instances
for use in configuration.

If you're just hesitant to use a declarative templating language, I
totally understand. The auto-scaling-minded people feel the same
way. You could join them in the quest to create an imperative
cluster-making API for Heat.

> We expect the instances to start a gossip protocol to determine who is
> actually up/down, who else is in the cluster, etc.  As such, we don't need
> accurate information - we only have to help a node find one living peer.
> (Multicast/broadcast was not entirely reliable either!)  Further, instance
> #2 will contact instance #1, so it doesn’t matter if instance #1 doesn’t
> have instance #2 in the list, as long as instance #2 sees instance #1.  I'm
> relying on the idea that instance launching takes time > 0, so other
> instances will be in the starting state when the metadata request comes in,
> even if we launch instances simultaneously.  (Another reason why I don't
> filter instances by state!)
>
> I haven't actually found where metadata caching is implemented, although
> the constructor of InstanceMetadata documents restrictions that really only
> make sense if it is.  Anyone know where it is cached?
>
> In terms of information exposed: An alternative would be to try to connect
> to every IP in the subnet we are assigned; this blueprint can be seen as an
> optimization on that (to avoid DDOS-ing the public clouds).  So I’ve tried
> to expose only the information that enables directed scanning: availability
> zone, reservation id, security groups, network ids & labels & cidrs & IPs
> [example below].  A naive implementation will just try every peer; a
> smarter implementation might check the security groups to try to filter it,
> or the zone information to try to connect to nearby peers first.  Note that
> I don’t expose e.g. the instance state: if you want to know whether a node
> is up, you have to try connecting to it.  I don't believe any of this
> information is at all sensitive, particularly not to instances in the same
> project.
>
> On external agents doing the configuration: yes, they could put this into
> user defined metadata, but then we're tied to a configuration system.  We
> have to get 20 configuration systems to agree on a common format (Heat,
> Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown
> systems!)  It also makes it hard to launch instances concurrently (because
> you want node #2 to have the metadata for node #1, so you have to wait for
> node #1 to get an IP).
>
> More generally though, I have in mind a different model, which I call
> 'configuration from within' (as in 'truth comes from within'). I don’t want
> a big imperialistic configuration system that comes and enforces its view
> of the world onto primitive machines.  I want a smart machine that comes
> into existence, discovers other machines and cooperates with them.  This is
> the Netflix pre-baked AMI concept, rather than the configuration management
> approach.

:) We are on the same page. I really think Heat is where higher-level
information sharing of this type belongs. I do think it might make sense
for Heat to push things into user-data post-boot, rather than only expose
them via its own metadata service. However, even without that, you can
achieve what you're talking about right now with Heat's separate metadata.
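
For anyone following along, the "directed scanning" idea in the quoted
proposal could be sketched roughly as follows. This is only an illustration
under assumptions: the peer record keys (`availability_zone`,
`security_groups`, `ips`) and the function names are hypothetical, since the
actual metadata format from the blueprint is not shown in this message. The
sketch orders candidates (shared security group first, then same zone)
rather than filtering them, and stops at the first peer that accepts a TCP
connection, since the goal is only to find one living peer.

```python
import socket


def order_peers(peers, my_zone, my_groups):
    """Return candidate IPs, most promising peers first.

    `peers` is a list of dicts with hypothetical keys:
    'availability_zone', 'security_groups', 'ips'.
    """
    my_groups = set(my_groups)

    def rank(peer):
        shares_group = bool(my_groups & set(peer.get("security_groups", [])))
        same_zone = peer.get("availability_zone") == my_zone
        # False sorts before True, so negate to put the best matches first.
        return (not shares_group, not same_zone)

    # sorted() is stable, so equally ranked peers keep their original order.
    ordered = sorted(peers, key=rank)
    return [ip for peer in ordered for ip in peer.get("ips", [])]


def is_alive(ip, port, timeout=1.0):
    """Instance state is not exposed, so the only check is to try connecting."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_living_peer(candidates, port, probe=is_alive):
    """Return the first candidate IP that answers, or None if none do."""
    for ip in candidates:
        if probe(ip, port):
            return ip
    return None
```

Once one living peer is found, the gossip protocol mentioned in the quoted
text would take over to learn the rest of the cluster; the `probe` argument
exists only so the connection check can be swapped out when testing.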

> The blueprint does not exclude 'imperialistic' configuration systems, but
> it does enable e.g. just launching N instances in one API call, or just
> using an auto-scaling group.  I suspect the configuration management
> systems would prefer this to having to implement this themselves.

N instances in one API call is something Heat does well, and it does
auto scaling too, so I feel like your idea is mostly just asking for a
simpler way to use Heat, which I think everyone would agree would be
good for all Heat users. :)
