[openstack-dev] Proposal for instance-level snapshots in Nova

Vishvananda Ishaya vishvananda at gmail.com
Wed Jan 29 04:25:14 UTC 2014


On Jan 24, 2014, at 9:05 AM, Jon Bernard <jbernard at tuxion.com> wrote:

> * Vishvananda Ishaya <vishvananda at gmail.com> wrote:
>> 
>> On Jan 16, 2014, at 1:28 PM, Jon Bernard <jbernard at tuxion.com> wrote:
>> 
>>> * Vishvananda Ishaya <vishvananda at gmail.com> wrote:
>>>> 
>>>> On Jan 14, 2014, at 2:10 PM, Jon Bernard <jbernard at tuxion.com> wrote:
>>>> 
>>>>> 
>>>>> <snip>
>>>>>> As you’ve defined the feature so far, it seems like most of it could
>>>>>> be implemented client side:
>>>>>> 
>>>>>> * pause the instance
>>>>>> * snapshot the instance
>>>>>> * snapshot any attached volumes
>>>>> 
>>>>> For the first milestone to offer crash-consistent snapshots you are
>>>>> correct.  We'll need some additional support from libvirt, but the
>>>>> patchset should be straightforward.  The biggest question I have
>>>>> surrounding initial work is whether to use an existing API call or
>>>>> create a new one.
>>>>> 
>>>> 
>>>> I think you might have missed the “client side” part of this point. I agree
>>>> that snapshotting multiple volumes and packaging them up is valuable, but I
>>>> was trying to make the point that you could do all of this client side
>>>> if you just add support for snapshotting ephemeral drives. An all-in-one
>>>> snapshot command could be valuable, but you are talking about orchestrating
>>>> a lot of commands between nova, glance, and cinder, and it could get kind
>>>> of messy to try to run the whole thing from nova.
>>> 
>>> If you expose each primitive required, then yes, the client could
>>> implement the logic to call each primitive in the correct order, handle
>>> error conditions, and exit while leaving everything in the correct
>>> state.  But that would mean you would have to implement it twice - once
>>> in python-novaclient and once in Horizon.  I would speculate that doing
>>> this on the client would be even messier.
>>> 
>>> If you are concerned about the complexity of the required interactions,
>>> we could narrow the focus in this way:
>>> 
>>> Let's say that taking a full snapshot/backup (all volumes) operates
>>> only on persistent storage volumes.  Users who booted from an
>>> ephemeral glance image shouldn't expect this feature because, by
>>> definition, the boot volume is not expected to live a long life.
>>> 
>>> This should limit the communication to Nova and Cinder, while leaving
>>> Glance out (initially).  If the user booted an instance from a cinder
>>> volume, then we have all the volumes necessary to create an OVA and
>>> import to Glance as a final step.  If the boot volume is an image then
>>> I'm not sure; we could go in a few directions:
>>> 
>>> 1. No OVA is imported due to lack of boot volume
>>> 2. A copy of the original image is included as a boot volume to create
>>>    an OVA.
>>> 3. Something else I am failing to see.
>> 
>>> 
>>> If [2] seems plausible, then it probably makes sense to just ask glance
>>> for an image snapshot from nova while the guest is in a paused state.
>>> 
>>> Thoughts?
>> 
>> This already exists. If you run a snapshot command on a volume backed instance
>> it snapshots all attached volumes. Additionally it does throw a bootable image
>> into glance referring to all of the snapshots.  You could modify create image
>> to do this for regular instances as well, specifying block device mapping but
>> keeping the vda as an image. It could even do the same thing with the ephemeral
>> disk without a ton of work. Keeping this all as one command makes a lot of sense
>> except that it is unexpected.
>> 
>> There is a benefit to only snapshotting the root drive sometimes because it
>> keeps the image small. Here’s what I see as the ideal end state:
>> 
>> Two commands (names are a strawman):
>>  create-full-image — image all drives
>>  create-root-image — image just the root drive
>> 
>> These should work the same regardless of whether the root drive is volume
>> backed, instead of the craziness we have today where volume-backed instances
>> snapshot all drives and instance-backed ones snapshot just the root.  I’m not
>> sure how we manage expectations based on the current implementation, but
>> perhaps the best idea is just adding this in v3 with new names?
>> 
>> FYI the whole OVA thing seems moot since we already have a way of representing
>> multiple drives in glance via block_device_mapping properties.
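
To make the block_device_mapping point concrete, here is a minimal sketch of
image properties that reference one snapshot per drive. The helper name, the
key names, and the snapshot IDs are assumptions for illustration only; they are
not taken from this thread and may not match the exact schema Nova writes.

```python
import json


def build_image_properties(root_device, snapshots):
    """Build image properties that reference one snapshot per drive.

    `snapshots` maps a device name (e.g. 'vda') to a snapshot ID.
    Hypothetical helper: the key names below are illustrative.
    """
    mapping = [
        {"device_name": dev, "snapshot_id": snap_id}
        for dev, snap_id in sorted(snapshots.items())
    ]
    return {
        "root_device_name": root_device,
        # Stored as a JSON string so it fits in a flat property value.
        "block_device_mapping": json.dumps(mapping),
    }


props = build_image_properties("vda", {"vdb": "snap-2", "vda": "snap-1"})
print(props["block_device_mapping"])
```

The image itself carries no disk data in this sketch, only references, which
is why a separate container format such as OVA adds little here.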
> 
> I've had some time to look closer at nova and rethink things a bit and
> I see what you're saying.  You are correct, taking snapshots of attached
> volumes is currently supported - although not in the way that I would
> like to see.  And this is where I think we can improve.
> 
> Let me first summarize my understanding of what we currently have.
> There are three ways of creating a snapshot-like thing in Nova:
> 
>  1. create_image - takes a snapshot of the root volume and may take
>     snapshots of the attached volumes depending on the volume type of
>     the root volume.  I/O is not quiesced.
> 
>  2. create_backup - takes a snapshot of the root volume with options
>     to specify how often to repeat and how many previous snapshots to
>     keep around. I/O is not quiesced.
> 
>  3. os-assisted-snapshot - takes a snapshot of a single cinder volume.
>     The volume is first quiesced before the snapshot is initiated.
> 
> My general thesis is that I/O should be quiesced in all cases if the
> underlying driver supports it.  Libvirt supports this feature and
> I would like to extend the existing functionality to take advantage of
> it.
> 
> It's not reasonable to change the names or behaviour of the existing
> public api calls.  Instead I would like to create a new snapshot() call
> in the v3 API.
> 
> We only need a quiesce() call added to the driver and the rest of the
> implementation will live in the api layer.  Once implemented, the
> existing snapshot calls (image, backup, os-assisted) could use the
> underlying snapshot routines to achieve their expected results, leaving
> us with only one set of snapshot-related functions to maintain.
> 
> The new snapshot call would take at least one option: the drives that
> should be snapshotted:
> 
>    snapshot(devices=['vda', 'vdb'])
> 
> Where a value of None implies all volumes.
> 
> This allows the user to snapshot only the root volume if a small
> bootable image is desired.
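
A minimal sketch of how the devices option could be resolved, assuming the
attached drives are known as an ordered list of device names. The function
name and the error raised for an unattached device are assumptions, not part
of the proposal.

```python
def select_devices(attached, devices=None):
    """Return the device names to snapshot, in attachment order.

    `attached` is the ordered list of device names on the instance.
    `devices=None` means "all volumes", as in the proposal; rejecting
    unknown names with ValueError is an assumption of this sketch.
    """
    if devices is None:
        return list(attached)
    missing = [d for d in devices if d not in attached]
    if missing:
        raise ValueError("not attached: %s" % ", ".join(missing))
    return [d for d in attached if d in devices]


print(select_devices(["vda", "vdb", "vdc"]))           # all drives
print(select_devices(["vda", "vdb", "vdc"], ["vda"]))  # root drive only
```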
> 
> There will be no exclusion based on volume type: both glance-backed and
> cinder-backed drives will be snapshotted.  Otherwise we reach the
> unexpected behaviour that you mentioned earlier, which I agree would
> have been confusing.
> 
> The flow will look like:
> 
>  * call the compute node to quiesce
>  * call the compute node to snapshot each individual glance drive
>  * call the volume driver to snapshot each cinder volume
>  * package the whole thing
> 
> The final result is an image in glance that references each attached
> volume via its block device mapping.  For a cinder-backed instance, the
> glance image would contain no data and only references to cinder
> snapshots.  As far as I can tell, glance already supports these
> requirements.
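
The flow above could be sketched roughly as follows. Everything here is a
hypothetical stand-in: FakeCompute and FakeVolumeAPI only record calls, and
the unquiesce() step and the try/finally around it are assumptions of this
sketch (the thread only names quiesce()).

```python
from contextlib import contextmanager


class FakeCompute:
    """Hypothetical stand-in for the compute node; only records calls."""

    def __init__(self, log):
        self.log = log

    def quiesce(self, instance):
        self.log.append("quiesce")

    def unquiesce(self, instance):
        self.log.append("unquiesce")

    def snapshot_drive(self, instance, device):
        self.log.append("drive:" + device)
        return "image-" + device


class FakeVolumeAPI:
    """Hypothetical stand-in for the volume service; only records calls."""

    def __init__(self, log):
        self.log = log

    def snapshot_volume(self, volume_id):
        self.log.append("volume:" + volume_id)
        return "snap-" + volume_id


@contextmanager
def quiesced(compute, instance):
    """Quiesce I/O for the duration of the block, even if a step fails."""
    compute.quiesce(instance)
    try:
        yield
    finally:
        compute.unquiesce(instance)


def snapshot_instance(compute, volume_api, instance, drives):
    """Quiesce, snapshot image-backed drives, then volumes, then package."""
    image_drives = [d for d in drives if d["type"] == "image"]
    volume_drives = [d for d in drives if d["type"] == "volume"]
    mapping = []
    with quiesced(compute, instance):
        for d in image_drives:
            image_id = compute.snapshot_drive(instance, d["device"])
            mapping.append({"device_name": d["device"], "image_id": image_id})
        for d in volume_drives:
            snap_id = volume_api.snapshot_volume(d["volume_id"])
            mapping.append({"device_name": d["device"], "snapshot_id": snap_id})
    # "Package the whole thing": one record referencing every snapshot.
    return {"name": instance + "-snapshot", "block_device_mapping": mapping}


log = []
drives = [
    {"device": "vda", "type": "image"},
    {"device": "vdb", "type": "volume", "volume_id": "vol-1"},
]
result = snapshot_instance(FakeCompute(log), FakeVolumeAPI(log), "inst-1", drives)
print(log)
print(result)
```

Keeping the quiesce window in one context manager means create_image and
create_backup style callers would share the same ordering and cleanup.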
> 
> If create_image and create_backup are updated to use this
> implementation, then the behaviour will appear unchanged to the user
> with the exception that I/O was quiesced during the snapshot(s) and they
> therefore have a more reliable and useful result.
> 
> Given this, I think it makes more sense to leave the implementation
> within the api layer of Nova so that existing functions can share in the
> implementation - as opposed to moving it into the client.
> 
> What are your thoughts?  Is this approaching something sensible?

This is starting to look very sensible. I appreciate you putting a lot
of thought into this.

Vish

> 
> -- 
> Jon
> 
