[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Vishvananda Ishaya
vishvananda at gmail.com
Wed Jan 29 15:56:20 UTC 2014
On Jan 29, 2014, at 5:26 AM, Justin Santa Barbara <justin at fathomdb.com> wrote:
> Certainly my original inclination (and code!) was to agree with you, Vish, but:
>
> 1) It looks like we're going to have writable metadata anyway, for
> communication from the instance to the API.
> 2) I believe the restrictions make it impractical to abuse it as a
> message bus: size limits, quotas, and write-once semantics make it
> very poorly suited for anything queue-like.
> 3) Anything that isn't opt-in will likely have security implications,
> which means that it won't get deployed. This must be deployed to be
> useful.
Fair enough. I agree that there are significant enough security implications
to skip the simple version.
Vish
>
> In short: I agree that it's not the absolute ideal solution (for me,
> that would be no opt-in), but it feels like the best solution given
> that we must have opt-in, or else e.g. HP won't deploy it. It uses a
> (soon-to-be) existing mechanism and is readily extensible without
> breaking APIs.
>
> On your idea of scoping by security group, I believe a certain someone
> is looking at supporting hierarchical projects, so we will likely need
> to support more advanced logic here later anyway. For example: the
> ability to specify whether an entry should be shared with instances in
> child projects. This will likely take the form of a sort of selector
> language, so I anticipate we could offer a filter on security groups
> as well if this is useful. We might well also allow selection by
> instance tags. The approach allows this, though I would like to keep
> it as simple as possible at first (share with other instances in the
> project, or don't share).
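>
> Purely as a strawman, the selector might eventually look something like
> this (syntax invented here just to show the shape; nothing is decided):
>
>     # default: visible to other instances in the same project
>     share = {'scope': 'project'}
>     # possible later extensions:
>     share = {'scope': 'project', 'include_children': True}
>     share = {'scope': 'security_group', 'groups': ['workers']}
>     share = {'scope': 'tag', 'tags': ['db-cluster']}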
>
> Justin
>
>
> On Tue, Jan 28, 2014 at 10:39 PM, Vishvananda Ishaya
> <vishvananda at gmail.com> wrote:
>>
>> On Jan 28, 2014, at 12:17 PM, Justin Santa Barbara <justin at fathomdb.com> wrote:
>>
>>> Thanks John - combining with the existing effort seems like the right
>>> thing to do (I've reached out to Claxton to coordinate). Great to see
>>> that the larger issues around quotas / write-once have already been
>>> agreed.
>>>
>>> So I propose that sharing will work in the same way, but some values
>>> are visible across all instances in the project. I do not think it
>>> would be appropriate for all entries to be shared this way. A few
>>> options:
>>>
>>> 1) A separate endpoint for shared values
>>> 2) Keys are shared if and only if they start with a prefix, like 'peers_XXX'
>>> 3) Keys are set the same way, but a 'shared' parameter can be passed,
>>> either as a query parameter or in the JSON.
>>>
>>> I like option #3 the best, but feedback is welcome.
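>>>
>>> As a rough sketch of what option #3 might look like from inside a
>>> guest once writable metadata lands (the path and parameter names
>>> below are invented purely for illustration):
>>>
>>>     import json
>>>     import urllib2
>>>
>>>     # post a key as usual, but mark it shared so that other
>>>     # instances in the same project can read it back
>>>     body = json.dumps({'key': 'peer_address',
>>>                        'value': '10.0.0.4',
>>>                        'shared': True})
>>>     req = urllib2.Request(
>>>         'http://169.254.169.254/openstack/latest/metadata',
>>>         data=body,
>>>         headers={'Content-Type': 'application/json'})
>>>     urllib2.urlopen(req)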
>>>
>>> I think I will have to store the value using a system_metadata entry
>>> per shared key. I think this avoids issues with concurrent writes,
>>> and also makes it easier to have more advanced sharing policies (e.g.
>>> when we have hierarchical projects).
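>>>
>>> Roughly one row per shared key, i.e. something like this (a sketch
>>> only, assuming nova's instance_system_metadata db helper; the key
>>> prefix is invented):
>>>
>>>     # one instance_system_metadata row per shared key means two
>>>     # instances writing different keys never touch the same row,
>>>     # so concurrent writes don't clobber each other
>>>     db.instance_system_metadata_update(
>>>         context, instance['uuid'],
>>>         {'shared_' + key: value}, delete=False)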
>>>
>>> Thank you to everyone for helping me get to what IMHO is a much better
>>> solution than the one I started with!
>>>
>>> Justin
>>
>> I am -1 on the post data. I think we should avoid using the metadata service
>> as a cheap queue for communicating across vms and this moves strongly in
>> that direction.
>>
>> I am +1 on providing a list of IP addresses in the current security group(s)
>> via metadata. I like limiting by security group instead of by project because
>> it avoids the 1000-instance case where people have large shared tenants, and
>> it also gives a single tenant a way to run multiple autodiscovered services.
>> Also, the security group info is something that neutron has access to, so the
>> neutron proxy should be able to generate the necessary info if neutron is in
>> use.
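>>
>> To make that concrete, a guest could then do something like this (the
>> 'peers' path is invented for illustration; one address per line):
>>
>>     import urllib2
>>
>>     # private IPs of instances sharing a security group with us
>>     peers = urllib2.urlopen(
>>         'http://169.254.169.254/openstack/latest/peers'
>>     ).read().splitlines()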
>>
>> Just as an interesting side note, we put this VM list in way back in the NASA
>> days as an easy way to get MPI clusters running. In that case we grouped the
>> instances by the key_name used to launch the instance instead of by security
>> group; I don't think it occurred to us to use security groups at the time. Note
>> that we also provided the number of cores, but this was for convenience because
>> the MPI implementation didn't support discovering the number of cores. Code
>> below.
>>
>> Vish
>>
>> $ git show 2cf40bb3
>> commit 2cf40bb3b21d33f4025f80d175a4c2ec7a2f8414
>> Author: Vishvananda Ishaya <vishvananda at yahoo.com>
>> Date: Thu Jun 24 04:11:54 2010 +0100
>>
>>     Adding mpi data
>>
>> diff --git a/nova/endpoint/cloud.py b/nova/endpoint/cloud.py
>> index 8046d42..74da0ee 100644
>> --- a/nova/endpoint/cloud.py
>> +++ b/nova/endpoint/cloud.py
>> @@ -95,8 +95,21 @@ class CloudController(object):
>>      def get_instance_by_ip(self, ip):
>>          return self.instdir.by_ip(ip)
>>
>> +    def _get_mpi_data(self, project_id):
>> +        result = {}
>> +        for node_name, node in self.instances.iteritems():
>> +            for instance in node.values():
>> +                if instance['project_id'] == project_id:
>> +                    line = '%s slots=%d' % (instance['private_dns_name'], instance.get('vcpus', 0))
>> +                    if instance['key_name'] in result:
>> +                        result[instance['key_name']].append(line)
>> +                    else:
>> +                        result[instance['key_name']] = [line]
>> +        return result
>> +
>>      def get_metadata(self, ip):
>>          i = self.get_instance_by_ip(ip)
>> +        mpi = self._get_mpi_data(i['project_id'])
>>          if i is None:
>>              return None
>>          if i['key_name']:
>> @@ -135,7 +148,8 @@ class CloudController(object):
>>                  'public-keys' : keys,
>>                  'ramdisk-id': i.get('ramdisk_id', ''),
>>                  'reservation-id': i['reservation_id'],
>> -                'security-groups': i.get('groups', '')
>> +                'security-groups': i.get('groups', ''),
>> +                'mpi': mpi
>>              }
>>          }
>>          if False: # TODO: store ancestor ids
>>
>>>
>>>
>>>
>>>
>>> On Tue, Jan 28, 2014 at 4:38 AM, John Garbutt <john at johngarbutt.com> wrote:
>>>> On 27 January 2014 14:52, Justin Santa Barbara <justin at fathomdb.com> wrote:
>>>>> Day, Phil wrote:
>>>>>
>>>>>>
>>>>>>>> We already have a mechanism now where an instance can push metadata as
>>>>>>>> a way of Windows instances sharing their passwords - so maybe this
>>>>>>>> could build on that somehow - for example, each instance pushes the
>>>>>>>> data it's willing to share with other instances owned by the same
>>>>>>>> tenant?
>>>>>>>
>>>>>>> I do like that and think it would be very cool, but I think it is much
>>>>>>> more complex to implement.
>>>>>>
>>>>>> I don't think it's that complicated - it just needs one extra attribute
>>>>>> stored per instance (for example into instance_system_metadata) which
>>>>>> allows the instance to be included in the list.
>>>>>
>>>>>
>>>>> Ah - OK, I think I better understand what you're proposing, and I do like
>>>>> it. The hardest bit of having the metadata store be full read/write would
>>>>> be defining what is and is not allowed (rate-limits, size-limits, etc). I
>>>>> worry that you end up with a new key-value store, and with per-instance
>>>>> credentials. That would be a separate discussion: this blueprint is trying
>>>>> to provide a focused replacement for multicast discovery for the cloud.
>>>>>
>>>>> But thank you for reminding me about the Windows password... It may
>>>>> provide a reasonable model:
>>>>>
>>>>> We would have a new endpoint, say 'discovery'. An instance can POST a
>>>>> single string value to the endpoint. A GET on the endpoint will return any
>>>>> values posted by all instances in the same project.
>>>>>
>>>>> One key only; name not publicly exposed ('discovery_datum'?); 255 bytes of
>>>>> value only.
>>>>>
>>>>> I expect most instances will just post their IPs, but I expect other uses
>>>>> will be found.
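>>>>>
>>>>> From a guest, usage might look roughly like this (the endpoint path
>>>>> is hypothetical; it just sketches the POST-one-value / GET-all shape):
>>>>>
>>>>>     import urllib2
>>>>>
>>>>>     base = 'http://169.254.169.254/openstack/latest/discovery'
>>>>>     # publish our single value (most likely our IP);
>>>>>     # write-once, at most 255 bytes
>>>>>     urllib2.urlopen(base, data='10.0.0.4')
>>>>>     # read back one line per value posted by instances in the
>>>>>     # same project
>>>>>     peers = urllib2.urlopen(base).read().splitlines()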
>>>>>
>>>>> If I provided a patch that worked in this way, would you/others be on-board?
>>>>
>>>> I like that idea. Seems like a good compromise. I have added my review
>>>> comments to the blueprint.
>>>>
>>>> We have a related blueprint going on, for setting metadata on a
>>>> particular server rather than on a group:
>>>> https://blueprints.launchpad.net/nova/+spec/metadata-service-callbacks
>>>>
>>>> It limits things using the existing quota on metadata updates.
>>>>
>>>> It would be good to agree on a similar format between the two.
>>>>
>>>> John