[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Vishvananda Ishaya vishvananda at gmail.com
Wed Jan 29 03:39:48 UTC 2014


On Jan 28, 2014, at 12:17 PM, Justin Santa Barbara <justin at fathomdb.com> wrote:

> Thanks John - combining with the existing effort seems like the right
> thing to do (I've reached out to Claxton to coordinate).  Great to see
> that the larger issues around quotas / write-once have already been
> agreed.
> 
> So I propose that sharing will work in the same way, but some values
> are visible across all instances in the project.  I do not think it
> would be appropriate for all entries to be shared this way.  A few
> options:
> 
> 1) A separate endpoint for shared values
> 2) Keys are shared iff they match a convention, e.g. they start with a prefix like 'peers_XXX'
> 3) Keys are set the same way, but a 'shared' parameter can be passed,
> either as a query parameter or in the JSON.
> 
> I like option #3 the best, but feedback is welcome.
> 
> I think I will have to store the value using a system_metadata entry
> per shared key.  I think this avoids issues with concurrent writes,
> and it also makes it easier to support more advanced sharing policies
> (e.g. when we have hierarchical projects).
> 
> Thank you to everyone for helping me get to what IMHO is a much better
> solution than the one I started with!
> 
> Justin

I am -1 on the POST data. I think we should avoid using the metadata service
as a cheap queue for communicating across VMs, and this moves strongly in
that direction.

I am +1 on providing a list of IP addresses in the current security group(s)
via metadata. I like limiting by security group instead of by project because
it avoids the 1000-instance case where people have large shared tenants, and
it also gives a single tenant a way to run multiple auto-discovered services.
The security group info is also something that neutron has access to, so the
neutron proxy should be able to generate the necessary info when neutron is
in use.
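
To sketch roughly what I have in mind (the field and helper names here are
illustrative, not actual nova code), the metadata handler would just group
peer fixed IPs by security group and return the lists for the groups the
requesting instance belongs to:

def peers_by_security_group(instances, project_id):
    """Group peer fixed IPs by security group within one project."""
    result = {}
    for instance in instances:
        if instance['project_id'] != project_id:
            continue
        for group in instance.get('security_groups', []):
            result.setdefault(group, []).append(instance['fixed_ip'])
    return result

# An instance in the 'mpi-workers' group would then only see something
# like {'mpi-workers': ['10.0.0.3', '10.0.0.4']}.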

Just as an interesting side note, we put this VM list in way back in the NASA
days as an easy way to get MPI clusters running. In that case we grouped the
instances by the key_name used to launch the instance instead of by security
group; I don’t think it occurred to us to use security groups at the time. Note
that we also provided the number of cores, but that was for convenience because
the MPI implementation didn’t support discovering the number of cores. Code below.

Vish

$ git show 2cf40bb3
commit 2cf40bb3b21d33f4025f80d175a4c2ec7a2f8414
Author: Vishvananda Ishaya <vishvananda at yahoo.com>
Date:   Thu Jun 24 04:11:54 2010 +0100

    Adding mpi data

diff --git a/nova/endpoint/cloud.py b/nova/endpoint/cloud.py
index 8046d42..74da0ee 100644
--- a/nova/endpoint/cloud.py
+++ b/nova/endpoint/cloud.py
@@ -95,8 +95,21 @@ class CloudController(object):
     def get_instance_by_ip(self, ip):
         return self.instdir.by_ip(ip)

+    def _get_mpi_data(self, project_id):
+        result = {}
+        for node_name, node in self.instances.iteritems():
+            for instance in node.values():
+                if instance['project_id'] == project_id:
+                    line = '%s slots=%d' % (instance['private_dns_name'], instance.get('vcpus', 0))
+                    if instance['key_name'] in result:
+                        result[instance['key_name']].append(line)
+                    else:
+                        result[instance['key_name']] = [line]
+        return result
+
     def get_metadata(self, ip):
         i = self.get_instance_by_ip(ip)
+        mpi = self._get_mpi_data(i['project_id'])
         if i is None:
             return None
         if i['key_name']:
@@ -135,7 +148,8 @@ class CloudController(object):
                 'public-keys' : keys,
                 'ramdisk-id': i.get('ramdisk_id', ''),
                 'reservation-id': i['reservation_id'],
-                'security-groups': i.get('groups', '')
+                'security-groups': i.get('groups', ''),
+                'mpi': mpi
             }
         }
         if False: # TODO: store ancestor ids
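
For reference, the resulting 'mpi' value looks like
{'<key_name>': ['<private-dns-name> slots=<vcpus>', ...]}, which maps more or
less directly onto an OpenMPI hostfile. A guest could consume it along these
lines (purely a sketch, with the metadata fetch itself omitted):

def write_hostfile(mpi_data, key_name, path='/tmp/openmpi-hostfile'):
    """Write the 'host slots=N' lines for one keypair to an MPI hostfile.

    mpi_data is the dict built by _get_mpi_data() above; key_name is the
    keypair the cluster instances were launched with.
    """
    with open(path, 'w') as f:
        for line in mpi_data.get(key_name, []):
            f.write(line + '\n')

# e.g. write_hostfile({'mykey': ['ip-10-0-0-3 slots=4']}, 'mykey')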

> 
> On Tue, Jan 28, 2014 at 4:38 AM, John Garbutt <john at johngarbutt.com> wrote:
>> On 27 January 2014 14:52, Justin Santa Barbara <justin at fathomdb.com> wrote:
>>> Day, Phil wrote:
>>> 
>>>> 
>>>>>> We already have a mechanism now where an instance can push metadata as
>>>>>> a way of Windows instances sharing their passwords - so maybe this could
>>>>>> build on that somehow - for example each instance pushes the data it's
>>>>>> willing to share with other instances owned by the same tenant?
>>>>> 
>>>>> I do like that and think it would be very cool, but I think it is much
>>>>> more complex to implement.
>>>> 
>>>> I don't think it's that complicated - it just needs one extra attribute
>>>> stored per instance (for example in instance_system_metadata) which allows
>>>> the instance to be included in the list.
>>> 
>>> 
>>> Ah - OK, I think I better understand what you're proposing, and I do like
>>> it.  The hardest bit of having the metadata store be full read/write would
>>> be defining what is and is not allowed (rate-limits, size-limits, etc).  I
>>> worry that you end up with a new key-value store, and with per-instance
>>> credentials.  That would be a separate discussion: this blueprint is trying
>>> to provide a focused replacement for multicast discovery for the cloud.
>>> 
>>> But: thank you for reminding me about the Windows password though...  It may
>>> provide a reasonable model:
>>> 
>>> We would have a new endpoint, say 'discovery'.  An instance can POST a
>>> single string value to the endpoint.  A GET on the endpoint will return any
>>> values posted by all instances in the same project.
>>> 
>>> One key only; name not publicly exposed ('discovery_datum'?); 255 bytes of
>>> value only.
>>> 
>>> I expect most instances will just post their IPs, but I expect other uses
>>> will be found.
>>> 
>>> If I provided a patch that worked in this way, would you/others be on-board?
>> 
>> I like that idea. Seems like a good compromise. I have added my review
>> comments to the blueprint.
>> 
>> We have this related blueprint going on, for setting metadata on a
>> particular server rather than a group:
>> https://blueprints.launchpad.net/nova/+spec/metadata-service-callbacks
>> 
>> It limits things using the existing quota on metadata updates.
>> 
>> It would be good to agree on a similar format between the two.
>> 
>> John
>> 
