[Openstack] Instance IDs and Multiple Zones

Brian Schott bfschott at gmail.com
Tue Mar 22 20:45:08 UTC 2011


I remember reading this a while ago.  Not saying we have to do this.  This is probably why zones are independent and ids are not unique across zones in EC2.  

This could be handled in the ec2 api service for compatibility.  We could just XOR the top half and the bottom half of a UUID and get a unique hash that just the EC2 API needs to keep track of.  The only important thing is that the USER doesn't get id collisions.
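
A minimal sketch of the folding idea, not a worked-out design. The function names, the "i-" formatting, and the truncation to 32 bits are all assumptions made for illustration (the article below describes EC2 ids as 8 hex digits). Folding is not collision-free, which is why the EC2 API layer would still need to keep a lookup table and check for clashes.

```python
import uuid

MASK_64 = (1 << 64) - 1


def fold_uuid(u: uuid.UUID) -> int:
    """XOR the top 64 bits of a UUID with its bottom 64 bits."""
    return (u.int >> 64) ^ (u.int & MASK_64)


def ec2_style_id(u: uuid.UUID, prefix: str = "i") -> str:
    """Hypothetical formatting: keep the low 32 bits as 8 hex digits."""
    return f"{prefix}-{fold_uuid(u) & 0xFFFFFFFF:08x}"


# Two different UUIDs chosen so that their halves XOR to the same value.
a = uuid.UUID("00000000-0000-0001-0000-0000000000ff")
b = uuid.UUID("00000000-0000-00ff-0000-000000000001")

id_a = ec2_style_id(a)
id_b = ec2_style_id(b)
# id_a == id_b here, so the EC2 layer must detect and resolve collisions.
```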

---

http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/

Anatomy of a Resource ID

So how were the numbers above calculated? To find out, let’s decompose an EC2 resource ID. After comparing hundreds of IDs, this opaque identifier turned out to be a little more transparent than you’d expect.

[Figure: ec2_resource_id.png (image/png, scrubbed from the archive)
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20110322/e4447b0e/attachment.png>]

Type

The most trivial of the fields, the type is one of the following values, depending on the resource type:

	• i – instance
	• r – reservation
	• vol – EBS volume
	• snap – EBS snapshot
	• ami – Amazon machine image
	• aki – Amazon kernel image
	• ari – Amazon ramdisk image

Inner ID

The Inner ID is a 16-bit counter of resources allocated. Each time a resource is requested, the Inner ID increments by one. For instance and reservation IDs, it increments by two (i.e., these Inner IDs are always even). Instead of counting from 0-FFFF as you’d expect, the Inner ID uses the following cycle:

	• 4000-7FFF
	• 0000-3FFF
	• C000-FFFF
	• 8000-BFFF

(This cycle can be easily normalized by XORing with 4000.) When the Inner ID has exhausted its space, a new series begins (see below) and the cycle restarts.
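
As a quick check of the normalization, here is a small sketch (the function name is made up for illustration). XORing with 0x4000 flips one bit, which maps the four ranges of the cycle onto a plain 0000-FFFF count.

```python
def normalize_inner_id(inner_id: int) -> int:
    """Map the observed 4000/0000/C000/8000 cycle onto a 0000-FFFF count."""
    return (inner_id ^ 0x4000) & 0xFFFF


# First value of each range, in the order the cycle visits them.
cycle_starts = [0x4000, 0x0000, 0xC000, 0x8000]
normalized_starts = [normalize_inner_id(x) for x in cycle_starts]

# Last value of each range, in the same order.
cycle_ends = [0x7FFF, 0x3FFF, 0xFFFF, 0xBFFF]
normalized_ends = [normalize_inner_id(x) for x in cycle_ends]
```

After normalization the range starts come out in ascending order, and each range ends one below the start of the next.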

Series Marker

For a given resource type, there is one active 8-bit Series ID. This Series ID, however, is not embedded directly into the resource ID. Instead, it is XORed to the leftmost 8 bits of the Inner ID. The result, which I call the Series Marker, is embedded in the ID to the left of the Inner ID.

For example, on the resource ID above the Series ID would be e5 = a7 XOR 42.

Series IDs usually decrement by one each time the Inner ID completes a cycle. I say “usually” because while this is the most common behavior, from time to time Series IDs seem to jump around in a pattern which is yet to be explained.

UPDATE (Oct 7th 2009): RightScale contributed the missing piece: to normalize a series ID, XOR with E5 – this irons out the “jumps” I noticed perfectly.

Superseries Marker

For a given resource type, there is one active 8-bit Superseries ID. Like the Series ID, the Superseries ID is not embedded directly into the resource ID. Instead, it is XORed to the rightmost 8 bits of the Inner ID. The result – the Superseries Marker – is the leftmost byte of the resource ID.

For example, on the resource ID above the Superseries ID would be 69 = 31 XOR 58.

The Superseries ID changes so rarely that originally I had assumed it was some kind of checksum. This would have been odd as it limits the total available IDs to 2^24 = 16.8 million. Up to very recently, the Superseries ID for all resource types – instances, images, volumes, snapshots, etc. – was 69 in the us-east-1 region (for eu-west-1 the Superseries ID is 74). These days, new instances use the Superseries ID 68. This subtle change, unnoticed by the industry, may hint at an astonishing achievement: 8.4 million instances launched since EC2's debut! (Instance IDs are even so 8.4M = 16.8M / 2.)

UPDATE (Oct 7th 2009): RightScale suggested normalizing the Superseries ID by XORing with 69. With this technique, the Superseries ID for us-east-1 was 0, and the recent change incremented it to 1.
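
Putting the fields together, a rough decoder might look like the sketch below. The names are hypothetical, and the sample ID i-31a74258 is only inferred from the two worked examples in the article (a7 XOR 42 and 31 XOR 58), since the figure that showed the actual ID is not available here. The layout assumed is: Superseries Marker (1 byte), Series Marker (1 byte), Inner ID (2 bytes).

```python
from typing import NamedTuple


class DecodedId(NamedTuple):
    resource_type: str
    inner_id: int
    series_id: int
    superseries_id: int


def decode_resource_id(resource_id: str) -> DecodedId:
    """Split an ID like 'i-31a74258' into the fields described above."""
    resource_type, _, hex_part = resource_id.partition("-")
    value = int(hex_part, 16)
    superseries_marker = (value >> 24) & 0xFF
    series_marker = (value >> 16) & 0xFF
    inner_id = value & 0xFFFF
    # Series ID: marker XOR the leftmost 8 bits of the Inner ID.
    series_id = series_marker ^ (inner_id >> 8)
    # Superseries ID: marker XOR the rightmost 8 bits of the Inner ID.
    superseries_id = superseries_marker ^ (inner_id & 0xFF)
    return DecodedId(resource_type, inner_id, series_id, superseries_id)


def normalize(decoded: DecodedId) -> DecodedId:
    """Apply the XOR constants from the article (4000, E5, 69)."""
    return DecodedId(
        decoded.resource_type,
        decoded.inner_id ^ 0x4000,
        decoded.series_id ^ 0xE5,
        decoded.superseries_id ^ 0x69,
    )


decoded = decode_resource_id("i-31a74258")
normalized = normalize(decoded)
```

For the inferred sample ID this gives Series ID e5 and Superseries ID 69, matching the article's examples.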

Brian Schott
bfschott at gmail.com



On Mar 22, 2011, at 3:44 PM, Vishvananda Ishaya wrote:

> The main issue that drove integers is backwards compatibility with the ec2_api and existing ec2 toolsets.  People seemed very opposed to the idea of having two separate ids in the database, one for ec2 and one for the underlying system.  If we want to move to another id scheme that doesn't fit in a 32-bit integer we have to provide a way for ec2 style ids to be assigned to instances, perhaps through a central authority that hands out unique ids.
> 
> Vish
> 
> On Mar 22, 2011, at 12:30 PM, Justin Santa Barbara wrote:
> 
>> The API spec doesn't seem to preclude us from doing a fully-synchronous method if we want to (it just reserves the option to do an async implementation).  Obviously we should make scheduling fast, but I think we're fine doing synchronous scheduling.  It's still probably going to be much faster than CloudServers on a bad day anyway :-)
>> 
>> Anyone have a link to where we chose to go with integer IDs?  I'd like to understand why, because presumably we had a good reason.
>> 
>> However, if we don't have documentation of the decision, then I vote that it never happened, and instance ids are strings.  We've always been at war with Eastasia, and all ids have always been strings.
>> 
>> Justin
>> 
>> 
>> 
>> 
>> On Tue, Mar 22, 2011 at 12:20 PM, Paul Voccio <paul.voccio at rackspace.com> wrote:
>> I agree with the sentiment that integers aren't the way to go long term.
>> The current spec of the api does introduce some interesting problems to
>> this discussion. All can be solved. The spec calls for the api to return
>> an id and a password upon instance creation. This means the api isn't
>> asynchronous if it has to wait for the zone to create the id. Page 46 of
>> the API Spec states the following:
>> 
>> "Note that when creating a server only the server ID and the admin
>> password are guaranteed to be returned in the request object. Additional
>> attributes may be retrieved by performing subsequent GETs on the server."
>> 
>> 
>> 
>> This creates a problem with the bursting if Z1 calls to Z2, which is a
>> public cloud, which has to wait for Z3-X to find out where it is going to be
>> placed. How would this work?
>> 
>> pvo
>> 
>> On 3/22/11 1:39 PM, "Chris Behrens" <chris.behrens at rackspace.com> wrote:
>> 
>> >
>> >I think Dragon got it right.  We need a zone identifier prefix on the
>> >IDs.  I think we need to get away from numbers.  I don't see any reason
>> >why they need to be numbers.  But, even if they did, you can pick very
>> >large numbers and reserve some bits for zone ID.
>> >
>> >- Chris
>> >
>> >
>> >On Mar 22, 2011, at 10:48 AM, Justin Santa Barbara wrote:
>> >
>> >> I think _if_ we want to stick with straight numbers, the following are
>> >>the 'traditional' choices:
>> >>
>> >> 1) "Skipping" - so zone1 would allocate numbers 1,3,5, zone2 numbers
>> >>2,4,6.  Requires that you know in advance how many zones there are.
>> >> 2) Prefixing - so zone0 would get 0xxxxxxx, zone1 1xxxxxxx.
>> >> 3) Central allocation - each zone would request an ID from a central
>> >>pool.  This might not be a bad thing, if you do want to have a quick
>> >>lookup table of ID -> zone.  Doesn't work if the zones aren't under the
>> >>same administrative control.
>> >> 4) Block allocation - a refinement of #3, where you get a bunch of IDs.
>> >> Effectively amortizes the cost of the RPC.  Probably not worth the
>> >>effort here.
>> >>
>> >> (If you want central allocation without a shared database, that's also
>> >>possible, but requires some trickier protocols.)
>> >>
>> >> However, I agree with Monsyne: numeric IDs have got to go.  Suppose I'm
>> >>a customer of Rackspace CloudServers once it is running on OpenStack,
>> >>and I also have a private cloud that the new Rackspace Cloud Business
>> >>unit has built for me.  I like both, and then I want to do cloud
>> >>bursting in between them, by putting an aggregating zone in front of
>> >>them.  I think at that stage, we're screwed unless we figure this out
>> >>now.  And this scenario only has one provider (Rackspace) involved!
>> >>
>> >> We can square the circle however - if we want numbers, let's use UUIDs
>> >>- they're 128 bit numbers, and won't in practice collide.  I'd still
>> >>prefer strings though...
>> >>
>> >> Justin
>> >>
>> >>
>> >>
>> >> On Tue, Mar 22, 2011 at 9:40 AM, Ed Leafe <ed at leafe.com> wrote:
>> >>        I want to get some input from all of you on what you think is
>> >>the best way to approach this problem: the RS API requires that every
>> >>instance have a unique ID, and we are currently creating these IDs by
>> >>use of an auto-increment field in the instances table. The introduction
>> >>of zones complicates this, as each zone has its own database.
>> >>
>> >>        The two obvious solutions are a) a single, shared database and
>> >>b) using a UUID instead of an integer for the ID. Both of these
>> >>approaches have been discussed and rejected, so let's not bring them
>> >>back up now.
>> >>
>> >>        Given integer IDs and separate databases, the only obvious
>> >>choice is partitioning the numeric space so that each zone starts its
>> >>auto-incrementing at a different point, with enough room between
>> >>starting ranges to ensure that they would never overlap. This would
>> >>require some assumptions be made about the maximum number of instances
>> >>that would ever be created in a single zone in order to determine how
>> >>much numeric space that zone would need. I'm looking to get some
>> >>feedback on what would seem to be reasonable guesses to these partition
>> >>sizes.
>> >>
>> >>        The other concern is more aesthetic than technical: we can make
>> >>the numeric spaces big enough to avoid overlap, but then we'll have very
>> >>large ID values; e.g., 10 or more digits for an instance. Computers
>> >>won't care, but people might, so I thought I'd at least bring up this
>> >>potential objection.
>> >>
>> >>
>> >>
>> >> -- Ed Leafe
>> >>
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Mailing list: https://launchpad.net/~openstack
>> >> Post to     : openstack at lists.launchpad.net
>> >> Unsubscribe : https://launchpad.net/~openstack
>> >> More help   : https://help.launchpad.net/ListHelp


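For reference, two of the numbering schemes from the quoted thread ("skipping" and prefixing with reserved zone bits) can be sketched as follows. This is only an illustration: the 8/24 bit split of a 32-bit id and all of the names are assumptions, not anything that was agreed on the list.

```python
import itertools

ZONE_BITS = 8    # assumption: room for 256 zones
LOCAL_BITS = 24  # assumption: 16.8 million ids per zone, 32 bits in total


def skipping_ids(zone_number: int, zone_count: int):
    """'Skipping': zone 1 of 2 yields 1, 3, 5, ...; zone 2 yields 2, 4, 6, ..."""
    return itertools.count(zone_number, zone_count)


def prefixed_id(zone: int, local_id: int) -> int:
    """Prefixing: the top bits carry the zone, the rest a per-zone counter."""
    if not 0 <= zone < (1 << ZONE_BITS):
        raise ValueError("zone does not fit in the reserved bits")
    if not 0 <= local_id < (1 << LOCAL_BITS):
        raise ValueError("zone has exhausted its id space")
    return (zone << LOCAL_BITS) | local_id


def split_prefixed_id(instance_id: int) -> tuple[int, int]:
    """Recover (zone, local_id), e.g. to route a request to the owning zone."""
    return instance_id >> LOCAL_BITS, instance_id & ((1 << LOCAL_BITS) - 1)


zone1_ids = list(itertools.islice(skipping_ids(1, 2), 3))
zone2_ids = list(itertools.islice(skipping_ids(2, 2), 3))
example_id = prefixed_id(1, 5)
```

Skipping needs the zone count fixed in advance, while prefixing needs a guess at the maximum number of instances per zone; both limits show up directly as the constants above.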
