[openstack-dev] [glance][nova]improvement-of-accessing-to-glance

Robert Collins robertc at robertcollins.net
Tue Feb 4 09:21:23 UTC 2014


On 4 February 2014 07:59, Mark Washenberger
<mark.washenberger at markwash.net> wrote:

> 1) Deprecate the registry deployment (done when v1 is deprecated)
> 2) v2 glance api talks directly to the underlying database (done)
> 3) Create a library in the images program that allows OpenStack projects to
> share code for reading image data remotely and picking optimal paths for
> bulk data transfer (In progress under the "glance.store" title)
> 4) v2 exposes locations that clients can directly access (partially done,
> continues to need a lot of improvement)
> 5) v2 still allows downloading images from the glance server as a
> compatibility and lowest-common-denominator feature
>
> In 4, some work is complete, and some more is planned, but we still need
> some more planning and design to figure out how to support directly
> downloading images in a secure and general way.

So this ties into an Ironic future feature... we'd like to do
something for mass distribution of identical images - without file
injection [yay] we have the same bits to distribute to many [thousands
perhaps] servers spread over a short period of time.

We need to measure and prototype around which technology to use
(bittorrent/multicast/bit-fountains ...) - for argument's sake, let's
paint this shed multicast.

Where would you see a multicast image distribution service fitting in?
Part of images? New service? Adjunct to existing service?

Basic characteristics we'd be looking at:
 - running on a trusted network [either no direct end users, or SDN -
we DHCP and PXE boot off of it...]
   - assume we don't need encryption in v1, but we would need sha1
verification of what we got out, somehow.
 - we don't know all the desired recipients - they will trickle in as
Heat deploys machines and moves on to deploying more throughout a
rolling update
 - we want to drive the recipient drives at full speed - say anywhere
up to 5Gbps [linear IO to a high end fusion IO device or similar] but
probably more typically 1-2Gbps [linear I/O to 1-drive spinning
platter setup]

What's in my head is an API where we can say:
 - please stream image X in format Y over multicast and tell me the
multicast port and cookie for it
 - please stop streaming image X in format Y

And the API server would reference count the requests, so if there are
two starts and one stop, it would keep streaming, in a loop.
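
As a rough illustration of the reference counting described above, here is a
minimal in-memory sketch. Everything in it is an assumption made for
illustration: the class name StreamRegistry, the start/stop method names, the
way ports are allocated, and the use of a random 8-byte cookie. It does not
send any multicast traffic; it only tracks who still wants a stream.

```python
import secrets
import threading


class StreamRegistry:
    """Reference-counts start/stop requests per (image, format) pair.

    Hypothetical sketch: names and port allocation are illustrative only.
    """

    def __init__(self, first_port=50000):
        self._lock = threading.Lock()
        self._streams = {}  # (image_id, fmt) -> {"port", "cookie", "refs"}
        self._next_port = first_port  # assumed starting port, not from the proposal

    def start(self, image_id, fmt):
        """Register one user of the stream; return (port, cookie)."""
        key = (image_id, fmt)
        with self._lock:
            stream = self._streams.get(key)
            if stream is None:
                # First request: allocate a port and an 8-byte cookie.
                stream = {
                    "port": self._next_port,
                    "cookie": secrets.token_bytes(8),
                    "refs": 0,
                }
                self._next_port += 1
                self._streams[key] = stream
            stream["refs"] += 1
            return stream["port"], stream["cookie"]

    def stop(self, image_id, fmt):
        """Drop one user; return True if the stream should keep looping."""
        key = (image_id, fmt)
        with self._lock:
            stream = self._streams.get(key)
            if stream is None:
                return False  # nothing was streaming for this key
            stream["refs"] -= 1
            if stream["refs"] == 0:
                del self._streams[key]
                return False
            return True
```

With this sketch, two start calls for the same image and format return the
same port and cookie, and the first stop call returns True because one
requester is still outstanding.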

The stream format would be something like:
[8 byte cookie][8 byte offset][remainder of frame is content]
with the special offset of ffffffffffffffff being used to pass
checksums and metadata around:
[8 byte cookie][0xff*8][json with checksums as k:v] (e.g.
{"sha1": "xxxxx", "len": 123445})
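
To make the frame layout concrete, a sketch of packing and parsing it might
look like the following. The byte order (network/big-endian), the UTF-8
encoding of the JSON, and the function names are all assumptions; the layout
above does not specify them.

```python
import json
import struct

# 8-byte cookie followed by an 8-byte unsigned offset.
# Big-endian ("!") is an assumption; the proposal does not fix a byte order.
HEADER = struct.Struct("!8sQ")
METADATA_OFFSET = 0xFFFFFFFFFFFFFFFF  # special offset marking a metadata frame


def pack_data(cookie, offset, content):
    """Build a data frame: cookie, offset, then raw image bytes."""
    return HEADER.pack(cookie, offset) + content


def pack_metadata(cookie, checksums):
    """Build a metadata frame carrying checksums as JSON."""
    body = json.dumps(checksums).encode("utf-8")
    return HEADER.pack(cookie, METADATA_OFFSET) + body


def unpack(frame):
    """Return (cookie, offset, payload).

    For metadata frames the offset is None and the payload is the decoded
    JSON dict; for data frames the payload is the raw content bytes.
    """
    cookie, offset = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:]
    if offset == METADATA_OFFSET:
        return cookie, None, json.loads(payload.decode("utf-8"))
    return cookie, offset, payload
```

A receiver would compare the cookie on every frame against the one the API
handed out and drop anything that does not match.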

Recipients would join the multicast group, then read packets and write
them to local disk at the offset given, sanity checking against
partition table of course :), and keep a bitmap of uncopied data. Once
the bitmap shows that everything <= len has been copied, it checks the
checksum (which by definition it has received or it couldn't check len
:) against the disk contents and then signals Ironic that it is
complete, and Ironic can signal the API that that use of the stream is
not needed anymore.
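
The recipient side could be sketched roughly as below. This assumes frames
have already been parsed into an offset plus content, that every data frame
carries one block at a block-aligned offset (only the last block may be
short), and that the target is any seekable binary file object. The class and
method names are hypothetical, and the partition table sanity check and the
signalling to Ironic are left out.

```python
import hashlib
import io


class Receiver:
    """Writes blocks at their offsets and tracks which ones have arrived."""

    def __init__(self, target, block_size=1024):
        self.target = target          # seekable binary file-like object
        self.block_size = block_size  # assumed fixed payload size per frame
        self.received = set()         # indices of blocks already written
        self.meta = None              # {"sha1": ..., "len": ...} once seen

    def on_metadata(self, meta):
        self.meta = meta

    def on_data(self, offset, content):
        index = offset // self.block_size
        if index in self.received:
            return  # the stream loops, so repeats are expected
        self.target.seek(offset)
        self.target.write(content)
        self.received.add(index)

    def complete(self):
        """True once metadata is known and every block below len arrived."""
        if self.meta is None:
            return False
        total = -(-self.meta["len"] // self.block_size)  # ceiling division
        return all(i in self.received for i in range(total))

    def verify(self):
        """Hash the first len bytes of the target and compare to sha1."""
        if not self.complete():
            return False
        digest = hashlib.sha1()
        remaining = self.meta["len"]
        self.target.seek(0)
        while remaining > 0:
            chunk = self.target.read(min(remaining, 1 << 20))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
        return remaining == 0 and digest.hexdigest() == self.meta["sha1"]


# Small demonstration with an in-memory "disk" and out-of-order delivery.
image = bytes(range(256)) * 10
rx = Receiver(io.BytesIO(), block_size=1024)
rx.on_data(2048, image[2048:])
rx.on_metadata({"sha1": hashlib.sha1(image).hexdigest(), "len": len(image)})
rx.on_data(0, image[:1024])
rx.on_data(1024, image[1024:2048])
```

A set of block indices stands in for the bitmap here; a real implementation
for multi-gigabyte images would more likely use a compact bit array.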

We could of course write this in Ironic - copy the image locally to
the needed format (raw for all the current drivers), manage the
multicast stuff etc, but it seems to me that this is something that
would be useful for VM environments too - upload a new image, deploy
10K VMs running hadoop off of it - multicast/p2p is a great way to
reduce hotspots.

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


