[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Justin Santa Barbara justin at fathomdb.com
Fri Jan 24 20:05:38 UTC 2014

> Well if you're on a Neutron private network then you'd only be DDoS-ing
> yourself.
> In fact I think Neutron allows broadcast and multicast on private
> networks, and as nova-net is going to be deprecated at some point I
> wonder if this is reducing to a corner case?

Neutron may well re-enable multicast/broadcast, but I think that (1)
multicast/broadcast is the wrong thing to use anyway, and more of an
artifact of the way clusters were previously deployed and (2) we should
have an option that doesn't require people to install Neutron with
multicast enabled.  I think that many public clouds, particularly those
that want to encourage an XaaS ecosystem, will avoid forcing people to use
Neutron's isolated networks.

> it seems that IP address would really be enough
> and the agents or whatever in the instance could take it from there ?

Quite possibly.  I'm very open to doing just that if people would prefer.

> What worried me most, I think, is that if we make this part of the standard
> metadata then everyone would get it, and that raises a couple of concerns:
> - Users with lots of instances (say 1000's) but who weren't trying to run
>   any form of discovery would start getting a lot more metadata returned,
>   which might cause performance issues

The list of peers is only returned if the request comes in for peers.json,
so there's no growth in the returned data unless it is requested.  Because
of the very clear instructions in the comment to always pre-fetch data, it
is always pre-fetched, even though it would make more sense to me to fetch
it lazily when it was requested!  Easy to fix, but I'm obeying the comment
because it was phrased in the form of a grammatically valid sentence :-)
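
To make the lazy alternative concrete, here is a minimal sketch.  The class,
the peer_lookup callable and the path handling are all hypothetical names
made up for illustration; this is not the actual Nova metadata code, only
the general shape of "compute peers on first request, then cache":

```python
from functools import cached_property


class InstanceMetadataSketch:
    """Hypothetical stand-in for a per-instance metadata object."""

    def __init__(self, instance_id, peer_lookup):
        self.instance_id = instance_id
        # peer_lookup is an assumed callable that performs the (possibly
        # expensive) query for this instance's peers.
        self._peer_lookup = peer_lookup

    @cached_property
    def peers(self):
        # Runs only on first access; the result is cached afterwards.
        return self._peer_lookup(self.instance_id)

    def lookup(self, path):
        # Only a request for peers.json touches the peer list.
        if path == "peers.json":
            return self.peers
        return {"instance-id": self.instance_id}


calls = []


def fake_peer_lookup(instance_id):
    # Records each invocation so we can see when the query really runs.
    calls.append(instance_id)
    return ["10.0.0.2", "10.0.0.3"]


md = InstanceMetadataSketch("i-1", fake_peer_lookup)
md.lookup("meta_data.json")  # does not query for peers
md.lookup("peers.json")      # first request triggers the query
md.lookup("peers.json")      # served from the cached value
```

With this shape, users who never ask for peers.json never pay for the peer
query at all, which is the behaviour I would expect from a lazy fetch.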

> - Some users might be running instances on behalf of customers (consider,
>   say, a PaaS type service where the user gets access into an instance but
>   not to the Nova API).  In that case I wouldn't want one instance to be
>   able to discover these kinds of details about other instances.

Yes, I do think this is a valid concern.  But, there is likely to be _much_
more sensitive information in the metadata service, so anyone doing this is
hopefully blocking the metadata service anyway.  On EC2 with IAM, or if we
use trusts, there will be an auth token in there.  And not just for security,
but also because if the PaaS program is auto-detecting EC2/OpenStack by
looking for the metadata service, that will cause the program to be very
confused if it sees the metadata for its host!

> So it kind of feels to me that this should be some other specific set of
> metadata that instances can ask for, and that instances have to explicitly
> opt into.

I think we have this in terms of the peers.json endpoint for byte-count
concerns.  For security, we only go per-project; I don't think we're
exposing any new information; and anyone doing multi-tenant should either
be using projects or be blocking 169.254.169.254 anyway.
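
As a rough illustration of the per-project scoping, assuming instance
records are plain dicts with hypothetical "id", "project_id" and "ip" keys
(not the real Nova data model), and assuming the requesting instance is left
out of its own peer list, which is my choice for the sketch rather than
something settled:

```python
def peers_for(requesting, instances):
    """Return IPs of the other instances in the requester's project."""
    return [
        inst["ip"]
        for inst in instances
        # Never cross a project boundary.
        if inst["project_id"] == requesting["project_id"]
        # Assumption: an instance is not listed as its own peer.
        and inst["id"] != requesting["id"]
    ]


instances = [
    {"id": "a", "project_id": "p1", "ip": "10.0.0.1"},
    {"id": "b", "project_id": "p1", "ip": "10.0.0.2"},
    {"id": "c", "project_id": "p2", "ip": "10.0.0.3"},
]

same_project = peers_for(instances[0], instances)
```

Instance "c" lives in a different project, so it never shows up in the
result for "a".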

> We already have a mechanism now where an instance can push metadata as a
> way of Windows instances sharing their passwords - so maybe this could
> build on that somehow - for example each instance pushes the data it's
> willing to share with other instances owned by the same tenant?

I do like that and think it would be very cool, but it is much more complex
to implement I think.  It also starts to become a different problem: I do
think we need a state-store, like Swift or etcd or ZooKeeper, that is easily
accessible to the instances.  Indeed, one of the things I'd like to build
using this blueprint is a distributed key-value store which would offer
that functionality.  But I think that having peer discovery is a much more
tightly defined blueprint, whereas some form of shared read-write
data-store is probably top-level project complexity.

> > On external agents doing the configuration: yes, they could put this
> > into user defined metadata, but then we're tied to a configuration
> > system.  We have to get 20 configuration systems to agree on a common
> > format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the
> > home-grown systems!)  It also makes it hard to launch instances
> > concurrently (because you want node #2 to have the metadata for node #1,
> > so you have to wait for node #1 to get an IP).
> >
> Well you've kind of got to agree on a common format anyway haven't you
> if the information is going to come from metadata?  But I get your other
> points.

We do have to define a format, but because we only implement it once if we
do it at the Nova level I hope that there will be much more pragmatism than
if we had to get the configuration cabal to agree.  We can implement the
format, and if consumers want the functionality that's the format they must
parse :-)

> I'd just like to see it separate from the existing metadata blob, and on
> an opt-in basis

Separate: is peers.json enough?  I'm not sure I'm understanding you here.

Opt-in:   IMHO, the danger of our OpenStack
everything-is-optional-and-configurable approach is that we end up in a
scenario where nothing is consistent and so nothing works "out of the box".
I'd much rather hash out an agreement about what is safe to share, even if
that is just IPs, and then get to the point where it is globally enabled.
Would you be OK with it if it was just a list of IPs?
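
If it did come down to just a list of IPs, the consumer side stays very
small.  The sketch below assumes the peers.json body is a JSON array of IP
address strings; that format is only an assumption for illustration, since
nothing has been agreed yet, and parse_peer_ips is a made-up helper name:

```python
import ipaddress
import json


def parse_peer_ips(body):
    """Parse an assumed peers.json body: a JSON array of IP strings."""
    data = json.loads(body)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of IP strings")
    peers = []
    for item in data:
        if not isinstance(item, str):
            raise ValueError("expected each peer to be a string")
        # ip_address() raises ValueError for anything that is not a
        # valid IPv4 or IPv6 address.
        peers.append(ipaddress.ip_address(item))
    return peers


sample = '["10.0.0.2", "10.0.0.3", "fd00::5"]'
peers = parse_peer_ips(sample)
```

Anything richer than addresses (ports, roles, and so on) would then be up to
the agents inside the instances to negotiate among themselves.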
