[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Vishvananda Ishaya vishvananda at gmail.com
Wed Jan 29 15:56:20 UTC 2014


On Jan 29, 2014, at 5:26 AM, Justin Santa Barbara <justin at fathomdb.com> wrote:

> Certainly my original inclination (and code!) was to agree with you, Vish, but:
> 
> 1) It looks like we're going to have writable metadata anyway, for
> communication from the instance to the API.
> 2) I believe the restrictions make it impractical to abuse as a
> message bus: size limits, quotas, and write-once semantics make it
> very poorly suited for anything queue-like.
> 3) Anything that isn't opt-in will likely have security implications,
> which means it won't get deployed.  This must be deployed to be
> useful.

Fair enough. I agree that there are significant enough security implications
to skip the simple version.

Vish

> 
> In short: I agree that it's not the absolute ideal solution (for me,
> that would be no opt-in), but it feels like the best solution given
> that we must have opt-in, or else e.g. HP won't deploy it.  It uses a
> (soon-to-be) existing mechanism, and is readily extensible without
> breaking APIs.
> 
> On your idea of scoping by security group, I believe a certain someone
> is looking at supporting hierarchical projects, so we will likely need
> to support more advanced logic here later anyway.  For example:  the
> ability to specify whether an entry should be shared with instances in
> child projects.  This will likely take the form of a sort of selector
> language, so I anticipate we could offer a filter on security groups
> as well if this is useful.  We might well also allow selection by
> instance tags.  The approach allows this, though I would like to keep
> it as simple as possible at first (share with other instances in the
> project, or don't share).
> 
> Justin
> 
> 
> On Tue, Jan 28, 2014 at 10:39 PM, Vishvananda Ishaya
> <vishvananda at gmail.com> wrote:
>> 
>> On Jan 28, 2014, at 12:17 PM, Justin Santa Barbara <justin at fathomdb.com> wrote:
>> 
>>> Thanks John - combining with the existing effort seems like the right
>>> thing to do (I've reached out to Claxton to coordinate).  Great to see
>>> that the larger issues around quotas / write-once have already been
>>> agreed on.
>>> 
>>> So I propose that sharing will work in the same way, but some values
>>> are visible across all instances in the project.  I do not think it
>>> would be appropriate for all entries to be shared this way.  A few
>>> options:
>>> 
>>> 1) A separate endpoint for shared values
>>> 2) Keys are shared iff they follow a naming convention, e.g. start with a prefix like 'peers_XXX'
>>> 3) Keys are set the same way, but a 'shared' parameter can be passed,
>>> either as a query parameter or in the JSON.
>>> 
>>> I like option #3 the best, but feedback is welcome.
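>>> 
>>> To make option #3 concrete, a request might look something like this
>>> (the key name and the 'shared' flag are illustrative only, not an
>>> agreed API):
>>> 
>>>     POST /v2/{tenant_id}/servers/{server_id}/metadata
>>>     {
>>>         "metadata": {"peer_address": "10.0.0.5"},
>>>         "shared": true
>>>     }
>>> 
>>> A GET from any other instance in the project would then see the
>>> shared entries alongside its own.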
>>> 
>>> I think I will have to store each shared key as its own
>>> system_metadata entry.  This avoids issues with concurrent writes,
>>> and also makes it easier to support more advanced sharing policies
>>> later (e.g. when we have hierarchical projects).
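>>> 
>>> As a rough sketch of what I mean (the helper and key format are
>>> hypothetical, not an agreed design), each shared key gets its own
>>> row, so concurrent writers never touch the same record:
>>> 
>>>     from nova import db
>>> 
>>>     def set_shared_value(context, instance_uuid, value):
>>>         # one system_metadata row per instance; two instances
>>>         # posting at the same time update distinct rows
>>>         key = 'shared_%s' % instance_uuid
>>>         db.instance_system_metadata_update(
>>>             context, instance_uuid, {key: value}, delete=False)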
>>> 
>>> Thank you to everyone for helping me get to what IMHO is a much better
>>> solution than the one I started with!
>>> 
>>> Justin
>> 
>> I am -1 on the post data. I think we should avoid using the metadata service
>> as a cheap queue for communicating across vms, and this moves strongly in
>> that direction.
>> 
>> I am +1 on providing a list of IP addresses in the current security group(s)
>> via metadata. I like limiting by security group instead of project because
>> it could prevent the 1000-instance case, where people have large shared
>> tenants, and it also gives a single tenant a way to run multiple autodiscovered
>> services. Also, the security group info is something that neutron has access
>> to, so the neutron proxy should be able to generate the necessary info if
>> neutron is in use.
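>> 
>> Concretely I would expect something like the following in the
>> metadata tree (path and layout purely illustrative):
>> 
>>     /openstack/latest/peers/<security-group>/
>>         10.0.0.4
>>         10.0.0.7
>>         10.0.0.12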
>> 
>> Just as an interesting side note, we put this vm list in way back in the NASA
>> days as an easy way to get mpi clusters running. In this case we grouped the
>> instances by the key_name used to launch the instance instead of security group.
>> I don't think it occurred to us to use security groups at the time.  Note we
>> also provided the number of cores, but this was for convenience because the
>> mpi implementation didn't support discovering the number of cores. Code below.
>> 
>> Vish
>> 
>> $ git show 2cf40bb3
>> commit 2cf40bb3b21d33f4025f80d175a4c2ec7a2f8414
>> Author: Vishvananda Ishaya <vishvananda at yahoo.com>
>> Date:   Thu Jun 24 04:11:54 2010 +0100
>> 
>>    Adding mpi data
>> 
>> diff --git a/nova/endpoint/cloud.py b/nova/endpoint/cloud.py
>> index 8046d42..74da0ee 100644
>> --- a/nova/endpoint/cloud.py
>> +++ b/nova/endpoint/cloud.py
>> @@ -95,8 +95,21 @@ class CloudController(object):
>>     def get_instance_by_ip(self, ip):
>>         return self.instdir.by_ip(ip)
>> 
>> +    def _get_mpi_data(self, project_id):
>> +        result = {}
>> +        for node_name, node in self.instances.iteritems():
>> +            for instance in node.values():
>> +                if instance['project_id'] == project_id:
>> +                    line = '%s slots=%d' % (instance['private_dns_name'], instance.get('vcpus', 0))
>> +                    if instance['key_name'] in result:
>> +                        result[instance['key_name']].append(line)
>> +                    else:
>> +                        result[instance['key_name']] = [line]
>> +        return result
>> +
>>     def get_metadata(self, ip):
>>         i = self.get_instance_by_ip(ip)
>> +        mpi = self._get_mpi_data(i['project_id'])
>>         if i is None:
>>             return None
>>         if i['key_name']:
>> @@ -135,7 +148,8 @@ class CloudController(object):
>>                 'public-keys' : keys,
>>                 'ramdisk-id': i.get('ramdisk_id', ''),
>>                 'reservation-id': i['reservation_id'],
>> -                'security-groups': i.get('groups', '')
>> +                'security-groups': i.get('groups', ''),
>> +                'mpi': mpi
>>             }
>>         }
>>         if False: # TODO: store ancestor ids
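>> 
>> For a project with two keypairs, the resulting 'mpi' entry would look
>> something like this (hostnames invented) - one 'host slots=N' line
>> per instance, which is the hostfile format OpenMPI expects:
>> 
>>     {'mpikey':   ['10-0-0-4.novalocal slots=4',
>>                   '10-0-0-7.novalocal slots=4'],
>>      'otherkey': ['10-0-0-9.novalocal slots=2']}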
>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Jan 28, 2014 at 4:38 AM, John Garbutt <john at johngarbutt.com> wrote:
>>>> On 27 January 2014 14:52, Justin Santa Barbara <justin at fathomdb.com> wrote:
>>>>> Day, Phil wrote:
>>>>> 
>>>>>> 
>>>>>>>> We already have a mechanism now where an instance can push metadata as
>>>>>>>> a way of Windows instances sharing their passwords - so maybe this could
>>>>>>>> build on that somehow - for example, each instance pushes the data it's
>>>>>>>> willing to share with other instances owned by the same tenant?
>>>>>>> 
>>>>>>> I do like that and think it would be very cool, but I think it is
>>>>>>> much more complex to implement.
>>>>>> 
>>>>>> I don't think it's that complicated - it just needs one extra attribute stored
>>>>>> per instance (for example in instance_system_metadata) which allows the
>>>>>> instance to be included in the list.
>>>>> 
>>>>> 
>>>>> Ah - OK, I think I better understand what you're proposing, and I do like
>>>>> it.  The hardest bit of having the metadata store be full read/write would
>>>>> be defining what is and is not allowed (rate limits, size limits, etc.).  I
>>>>> worry that you end up with a new key-value store, and with per-instance
>>>>> credentials.  That would be a separate discussion: this blueprint is trying
>>>>> to provide a focused replacement for multicast discovery for the cloud.
>>>>> 
>>>>> But thank you for reminding me about the Windows password...  It may
>>>>> provide a reasonable model:
>>>>> 
>>>>> We would have a new endpoint, say 'discovery'.  An instance can POST a
>>>>> single string value to the endpoint.  A GET on the endpoint will return any
>>>>> values posted by all instances in the same project.
>>>>> 
>>>>> One key only; name not publicly exposed ('discovery_datum'?); 255 bytes of
>>>>> value only.
>>>>> 
>>>>> I expect most instances will just post their IPs, but I expect other uses
>>>>> will be found.
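>>>>> 
>>>>> From inside a guest, usage would be roughly this (endpoint name
>>>>> and path are placeholders, since none of this is agreed yet):
>>>>> 
>>>>>     import urllib2
>>>>>     URL = 'http://169.254.169.254/openstack/latest/discovery'
>>>>>     # advertise our address (a single value, <= 255 bytes)
>>>>>     urllib2.urlopen(urllib2.Request(URL, data='10.0.0.5'))
>>>>>     # read back whatever our peers in the project have posted
>>>>>     peers = urllib2.urlopen(URL).read().splitlines()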
>>>>> 
>>>>> If I provided a patch that worked in this way, would you/others be on-board?
>>>> 
>>>> I like that idea. Seems like a good compromise. I have added my review
>>>> comments to the blueprint.
>>>> 
>>>> We have this related blueprint going on, which sets metadata on a
>>>> particular server rather than on a group:
>>>> https://blueprints.launchpad.net/nova/+spec/metadata-service-callbacks
>>>> 
>>>> It limits things using the existing quota on metadata updates.
>>>> 
>>>> It would be good to agree on a similar format between the two.
>>>> 
>>>> John
