[openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Day, Phil philip.day at hp.com
Fri Jan 24 17:55:35 UTC 2014


> I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is.  Anyone know where it is cached?

Here's the code that does the caching:
https://github.com/openstack/nova/blob/master/nova/api/metadata/handler.py#L84-L98

Data is only cached for 15 seconds by default - the main reason for caching is that cloud-init makes a sequence of calls to get various items of metadata, and it saves a lot of DB access if we don't have to go back for them multiple times.

If you're using the OpenStack metadata calls instead then the caching doesn't buy much, as it returns a single JSON blob with all the values.
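
To illustrate the short-lived caching described above, here is a minimal sketch of a time-based cache. It is not the code in handler.py; the class name, the `loader` callback and the injectable `clock` are hypothetical, and only the 15 second default comes from this thread.

```python
import time


class TTLCache:
    """Tiny time-based cache: entries expire `ttl` seconds after being stored."""

    def __init__(self, ttl=15.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (expiry_time, value)

    def get_or_load(self, key, loader):
        """Return the cached value for `key`, calling `loader(key)` on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: no backend (DB) access
        value = loader(key)          # expired or missing: go back to the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

With something like this, a burst of per-item requests from the same guest within the TTL triggers one load rather than one per request.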


From: Justin Santa Barbara [mailto:justin at fathomdb.com]
Sent: 24 January 2014 15:43
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service

Good points - thank you.  For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality.  We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).

However, I think I can justify peer discovery as the 'one exception'.  Here's why: discovery of peers is widely used for self-configuring clustered services, including those built in pre-cloud days.  Multicast/broadcast used to be the solution, but cloud broke that.  The cloud is supposed to be about distributed systems, yet we broke the primary way distributed systems do peer discovery. Today's workarounds are pretty terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with the instance (tolerable now with IAM, but painful to configure).  We're not talking about allowing instances to program the architecture (e.g. attach volumes etc), but rather just to do the equivalent of a multicast for discovery.  In other words, we're restoring some functionality we took away (discovery via multicast) rather than adding programmable-infrastructure cloud functionality.

We expect the instances to start a gossip protocol to determine who is actually up/down, who else is in the cluster, etc.  As such, we don't need accurate information - we only have to help a node find one living peer.  (Multicast/broadcast was not entirely reliable either!)  Further, instance #2 will contact instance #1, so it doesn't matter if instance #1 doesn't have instance #2 in the list, as long as instance #2 sees instance #1.  I'm relying on the idea that instance launching takes time > 0, so other instances will be in the starting state when the metadata request comes in, even if we launch instances simultaneously.  (Another reason why I don't filter instances by state!)
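
A rough sketch of the "find one living peer" step, assuming the clustered service listens on a known TCP port. The function name, the port and the timeout are illustrative assumptions, not part of the blueprint.

```python
import socket


def find_living_peer(addresses, port, timeout=1.0):
    """Return the first address that accepts a TCP connection, or None."""
    for address in addresses:
        try:
            # A successful connect is the only liveness signal we need;
            # the gossip protocol takes over from there.
            with socket.create_connection((address, port), timeout=timeout):
                return address
        except OSError:
            continue  # down, not started yet, or filtered: try the next one
    return None
```

A stale or incomplete list is harmless here: dead entries cost one failed connect each, and a peer that is missing from our list can still find us from its side.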

I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is.  Anyone know where it is cached?

In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds).  So I've tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, network ids & labels & cidrs & IPs [example below].  A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first.  Note that I don't expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it.  I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
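
As a sketch of what a "smarter" client might do with this information: the function below only reorders candidates, so peers sharing a security group and zone are tried first. The field names follow the example JSON in this mail; the function itself and its ranking rule are hypothetical.

```python
def order_peers(me, peers):
    """Drop our own entry, then sort so the likeliest peers come first."""
    my_groups = {g["name"] for g in me.get("security_groups", [])}

    def rank(peer):
        groups = {g["name"] for g in peer.get("security_groups", [])}
        shares_group = bool(my_groups & groups)
        same_zone = peer.get("availability_zone") == me.get("availability_zone")
        # False sorts before True, so negate to put matches first.
        return (not shares_group, not same_zone)

    others = [p for p in peers if p.get("uuid") != me.get("uuid")]
    return sorted(others, key=rank)
```

Nothing is filtered out apart from the instance's own entry, so the naive "try every peer" behaviour remains the fallback.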

On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system.  We would have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!).  It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).

More generally though, I have in mind a different model, which I call 'configuration from within' (as in 'truth comes from within'). I don't want a big imperialistic configuration system that comes and enforces its view of the world onto primitive machines.  I want a smart machine that comes into existence, discovers other machines and cooperates with them.  This is the Netflix pre-baked AMI concept, rather than the configuration management approach.

The blueprint does not exclude 'imperialistic' configuration systems, but it does enable e.g. just launching N instances in one API call, or just using an auto-scaling group.  I suspect the configuration management systems would prefer this to having to implement this themselves.

(Example JSON below)

Justin

---

Example JSON:

[
    {
        "availability_zone": "nova",
        "network_info": [
            {
                "id": "e60bbbaf-1d2e-474e-bbd2-864db7205b60",
                "network": {
                    "id": "f2940cd1-f382-4163-a18f-c8f937c99157",
                    "label": "private",
                    "subnets": [
                        {
                            "cidr": "10.11.12.0/24",
                            "ips": [
                                {
                                    "address": "10.11.12.4",
                                    "type": "fixed",
                                    "version": 4
                                }
                            ],
                            "version": 4
                        },
                        {
                            "cidr": null,
                            "ips": [],
                            "version": null
                        }
                    ]
                }
            }
        ],
        "reservation_id": "r-44li8lxt",
        "security_groups": [
            {
                "name": "default"
            }
        ],
        "uuid": "2adcdda2-561b-494b-a8f6-378b07ac47a4"
    },

... (the above is repeated for every instance)...
]
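
A minimal sketch of how a guest might pull candidate addresses out of a document shaped like the example above. It assumes the complete, well-formed JSON list (the elided example is not parseable as shown), and the function name is hypothetical.

```python
import json


def peer_addresses(document):
    """Collect every IP address listed under network_info, for all instances."""
    addresses = []
    for instance in json.loads(document):
        for vif in instance.get("network_info", []):
            for subnet in vif.get("network", {}).get("subnets", []):
                # Subnets with a null cidr carry an empty "ips" list.
                for ip in subnet.get("ips", []):
                    addresses.append(ip["address"])
    return addresses
```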




On Fri, Jan 24, 2014 at 8:43 AM, Day, Phil <philip.day at hp.com> wrote:
> Hi Justin,
>
>
>
> I can see the value of this, but I'm a bit wary of the metadata service
> extending into a general API - for example I can see this extending into a
> debate about what information needs to be made available about the instances
> (would you always want all instances exposed, all details, etc) - if not
> we'd end up starting to implement policy restrictions in the metadata
> service and starting to replicate parts of the API itself.
>
>
>
> Just seeing instances launched before me doesn't really help if they've been
> deleted (but are still in the cached values), does it?
>
>
>
> Since there is some external agent creating these instances, why can't that
> just provide the details directly as user-defined metadata?
>
>
>
> Phil
>
>
>
> From: Justin Santa Barbara [mailto:justin at fathomdb.com]
> Sent: 23 January 2014 16:29
> To: OpenStack Development Mailing List
> Subject: [openstack-dev] [Nova] bp proposal: discovery of peer instances
> through metadata service
>
>
>
> Would appreciate feedback / opinions on this blueprint:
> https://blueprints.launchpad.net/nova/+spec/first-discover-your-peers
>
>
>
> The idea is: clustered services typically run some sort of gossip protocol,
> but need to find (just) one peer to connect to.  In the physical
> environment, this was done using multicast.  On the cloud, that isn't a
> great solution.  Instead, I propose exposing a list of instances in the same
> project, through the metadata service.
>
>
>
> In particular, I'd like to know if anyone has other use cases for instance
> discovery.  For peer-discovery, we can cache the instance list for the
> lifetime of the instance, because it suffices merely to see instances that
> were launched "before me".  (peer1 might not join to peer2, but peer2 will
> join to peer1).  Other use cases are likely much less forgiving!
>
>
> Justin
>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

