[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Day, Phil philip.day at hp.com
Mon Jan 27 11:46:58 UTC 2014

>> What worried me most, I think, is that if we make this part of the standard
>> metadata then everyone would get it, and that raises a couple of concerns:
>> - Users with lots of instances (say 1000's) but who weren't trying to run any
>> form of discovery would start getting a lot more metadata returned, which
>> might cause performance issues

> The list of peers is only returned if the request comes in for peers.json, so
> there's no growth in the returned data unless it is requested.  Because of the
> very clear instructions in the comment to always pre-fetch data, it is always
> pre-fetched, even though it would make more sense to me to fetch it lazily
> when it was requested!  Easy to fix, but I'm obeying the comment because it
> was phrased in the form of a grammatically valid sentence :-)
OK, thanks for the clarification - I'd missed that this was a new JSON object; I thought you were just adding the data onto the existing object.
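
For illustration, the lazy approach mentioned above (only building the peer list when peers.json is actually requested) could look roughly like the sketch below. This is not the Nova metadata code: the class name, the `load_peers` callback and the `peer_loads` counter are all hypothetical, and the only thing taken from the thread is the peers.json path.

```python
# Hypothetical sketch only; none of these names exist in Nova.
class InstanceMetadataSketch:
    def __init__(self, base, load_peers):
        self._base = base              # pre-fetched per-instance data
        self._load_peers = load_peers  # callable that does the expensive query
        self._peers = None             # peer list is not fetched up front
        self.peer_loads = 0            # counter, only here to show the laziness

    def lookup(self, path):
        if path == "peers.json":
            # Run the peer query on first request, then reuse the result.
            if self._peers is None:
                self._peers = self._load_peers()
                self.peer_loads += 1
            return self._peers
        return self._base[path]


md = InstanceMetadataSketch(
    base={"meta_data.json": {"uuid": "a"}},
    load_peers=lambda: [{"ip": "10.0.0.2"}],
)
md.lookup("meta_data.json")  # does not trigger the peer query
md.lookup("peers.json")      # triggers it once
```

With this shape, a tenant with thousands of instances that never asks for peers.json never pays for the peer query.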

>> - Some users might be running instances on behalf of customers (consider,
>> say, a PaaS-type service where the user gets access into an instance but not
>> to the Nova API).  In that case I wouldn't want one instance to be able to
>> discover these kinds of details about other instances.

> Yes, I do think this is a valid concern.  But, there is likely to be _much_ more
> sensitive information in the metadata service, so anyone doing this is
> hopefully blocking the metadata service anyway.  On EC2 with IAM, or if we
> use trusts, there will be an auth token in there.  And not just for security, but
> also because if the PaaS program is auto-detecting EC2/OpenStack by looking
> for the metadata service, that will cause the program to be very confused if it
> sees the metadata for its host!

Currently the metadata service only returns information for the instance that is requesting it (the Neutron proxy validates the source address and project), so the concern around sensitive information is already mitigated. But if we're now going to return information about other instances, that changes the picture somewhat.

>> We already have a mechanism now where an instance can push metadata as
>> a way of Windows instances sharing their passwords - so maybe this could
>> build on that somehow - for example each instance pushes the data it's
>> willing to share with other instances owned by the same tenant?
> I do like that and think it would be very cool, but it is much more complex to
> implement I think.

I don't think it's that complicated - it just needs one extra attribute stored per instance (for example in instance_system_metadata) which allows the instance to be included in the list.
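
As a rough sketch of what I mean (not real Nova code: the `Instance` class, the `peer_discovery` key name and the `build_peers` function are all made up for illustration, and leaving the requester out of its own list is my assumption), the peer list would only include instances in the same tenant that carry the opt-in attribute:

```python
# Hypothetical sketch; these names are assumptions, not Nova APIs.
from dataclasses import dataclass, field

PEER_OPT_IN_KEY = "peer_discovery"  # assumed key in instance_system_metadata


@dataclass
class Instance:
    uuid: str
    project_id: str
    fixed_ip: str
    system_metadata: dict = field(default_factory=dict)


def build_peers(requester, instances):
    """Return the peers.json content for `requester`.

    Only instances in the requester's tenant that have explicitly
    opted in are listed; the requester itself is left out.
    """
    return [
        {"uuid": inst.uuid, "ip": inst.fixed_ip}
        for inst in instances
        if inst.project_id == requester.project_id
        and inst.uuid != requester.uuid
        and inst.system_metadata.get(PEER_OPT_IN_KEY) == "true"
    ]
```

An instance that never sets the attribute is never listed, so a PaaS tenant that does nothing keeps today's behaviour.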

>  It also starts to become a different problem: I do think we
> need a state-store, like Swift or etcd or Zookeeper, that is easily accessible to
> the instances.  Indeed, one of the things I'd like to build using this blueprint is
> a distributed key-value store which would offer that functionality.  But I think
> that having peer discovery is a much more tightly defined blueprint, whereas
> some form of shared read-write data-store is probably top-level project
> complexity.
Isn't the metadata already, in effect, that state-store?

>>  I'd just like to
>> see it separate from the existing metadata blob, and on an opt-in basis
> Separate: is peers.json enough?  I'm not sure I'm understanding you here.
Yep - that ticks the separate box. 

> Opt-in:   IMHO, the danger of our OpenStack everything-is-optional-and-
> configurable approach is that we end up in a scenario where nothing is
> consistent and so nothing works "out of the box".  I'd much rather hash-out
> an agreement about what is safe to share, even if that is just IPs, and then
> get to the point where it is globally enabled.  Would you be OK with it if it was
> just a list of IPs?

I still think that would cause problems for PaaS services that abstract users away from direct control of the instance (i.e. the PaaS service is the Nova tenant, and creates instances in that tenant that are then made available to individual users). At the moment the only data such a user can see, even from metadata, is the details of their own instance. Extending that to allow discovery of other instances in the same tenant still feels to me like something that needs to be controllable. The number of instances that want or need to be able to discover each other is a subset of all instances, so making those explicitly declare themselves to the metadata service (when they already have to have the logic to get peers.json) doesn't sound like a major additional complication to me.

