[Openstack] Instance IDs and Multiple Zones
eday at oddments.org
Wed Mar 23 17:49:22 UTC 2011
On Wed, Mar 23, 2011 at 08:15:54AM -0400, Ed Leafe wrote:
> On Mar 23, 2011, at 1:55 AM, Eric Day wrote:
> > If we provide some structure to the IDs, such as DNS names, we not only
> > solve this namespacing problem but we also get a much more efficient
> > routing mechanism.
> When I read things like this, the DBA in me winces a little. Meaningful PKs, compound PKs - they always end up being a Very Bad Thing. If you want to add efficient DNS routing, that could be added as additional data about an instance that is periodically updated up the zone structure along with the other capability information, but until now we've passed on that as a premature optimization. That was one of the major arguments in favor of the global DB design.
We're talking about a number of partitioning schemes, reserved bits,
URNs, URIs, etc. Because of the namespace issue I believe we will
need some structure to our resource names.
> > Let's say you have api.rackspace.com (global aggregation zone),
> > rack1.dfw.rackspace.com (real zone running instances), and
> > bursty.customer.com (private zone). Bursty is a rackspace customer
> > and they want to leverage their private resources alongside the
> > public cloud, so they add bursty.customer.com as a private zone
> > for their Rackspace account. The api.rackspace.com server now gets
> > a terminate request for <id x> and it needs to know where to route
> > the request. If we have a global namespace for instances (such as
> > UUIDs), rack1.dfw.rackspace.com and bursty.customer.com could both
> > have servers for <id x> (most likely from bursty spoofing the ID). Now
> > api.rackspace.com doesn't know who to forward the request to.
> Even if this scenario were to happen, and nova tried to delete an instance with a spoofed ID that did *not* belong to Bursty, it would fail due to improper auth. Otherwise, even without zones/uuids/whatever, I could send termination requests to the API with random IDs and delete any machines with those IDs, whether I had rights to them or not.
This implies the resource is only uniquely identified in combination
with auth credentials, which means the resource name cannot stand
alone. If we do have collisions due to spoofing, we're going to see
ambiguity issues crop up in other systems that don't have the auth
context. I strongly believe we need unique resource names that stand
on their own and don't depend on any other component such as auth.
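To make "names that stand on their own" concrete, here is a minimal
sketch. The '<zone>/<local_id>' format and both function names are
assumptions for illustration only; nothing like this exists in nova
today. The point is that a parent zone can reject a spoofed name
without consulting auth, because a child zone may only report names
inside its own namespace.

```python
from typing import NamedTuple


class ResourceName(NamedTuple):
    zone: str      # DNS-style name of the zone that owns the instance
    local_id: str  # only has to be unique inside that zone


def parse_resource_name(name: str) -> ResourceName:
    """Split a hypothetical '<zone>/<local_id>' name into its two parts."""
    zone, sep, local_id = name.partition("/")
    if not sep or not zone or not local_id:
        raise ValueError(f"not a structured resource name: {name!r}")
    return ResourceName(zone.lower(), local_id)


def zone_may_claim(reporting_zone: str, name: str) -> bool:
    """A zone may only report instances that live in its own namespace."""
    return parse_resource_name(name).zone == reporting_zone.lower()
```

With this, bursty.customer.com reporting an instance named
"rack1.dfw.rackspace.com/42" is refused outright, so the collision
never reaches a system that lacks the auth context.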
> In the current zone design, a request to terminate <id x> would not be handled by the outermost zone, since it wouldn't have instances, so it would be forwarded to each child zone. This would repeat down the zone hierarchy until either there were no more child zones, or a zone found that it had an instance with that ID. In the Bursty example, two zones would find an instance with that ID; one would fail due to auth, and the one owned by Bursty would be terminated as requested. The only way more than one instance would terminate would be if Bursty spoofed their own IDs, which would be their problem, not ours.
I think "In the current zone design" is my main concern. This
discussion is about how things need to work in the near future,
not just now. We've punted on routing for now and are simply sending
the request to every zone, but this won't work in the long run. With
a large public cloud of hundreds of zones plus thousands of bursting
zones, this will get prohibitively expensive. It's not that it won't
function; the response time may just be unreasonable.
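As a rough sketch of the cost difference, compare the number of
requests each approach sends. The zone names come from the example
earlier in the thread; the function names, the in-memory inventory,
and the '<zone>/<local_id>' name format are assumptions made up for
this illustration, not nova code.

```python
from __future__ import annotations


def broadcast_terminate(zones: dict[str, set[str]],
                        instance_id: str) -> tuple[list[str], int]:
    """Ask every zone about a flat ID.

    Returns the zones that claim the ID and the number of requests sent.
    """
    hits = [zone for zone, ids in zones.items() if instance_id in ids]
    return hits, len(zones)


def routed_terminate(zones: dict[str, set[str]],
                     name: str) -> tuple[list[str], int]:
    """Route a structured '<zone>/<local_id>' name to its owning zone only."""
    zone, _, local_id = name.partition("/")
    if zone not in zones:
        return [], 0  # unknown zone: nothing to contact
    hits = [zone] if local_id in zones[zone] else []
    return hits, 1


# Hypothetical inventory: two zones both claim the flat ID "42".
zones = {
    "rack1.dfw.rackspace.com": {"42"},
    "bursty.customer.com": {"42"},
    "rack2.dfw.rackspace.com": {"7"},
}
```

Here broadcast_terminate(zones, "42") contacts all three zones and
gets two claimants back, while
routed_terminate(zones, "rack1.dfw.rackspace.com/42") contacts one
zone and gets one. The broadcast cost grows with the number of zones;
the routed cost does not.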