[openstack-dev] Proposal for instance-level snapshots in Nova

Jon Bernard jbernard at tuxion.com
Fri Jan 24 17:05:31 UTC 2014


* Vishvananda Ishaya <vishvananda at gmail.com> wrote:
> 
> On Jan 16, 2014, at 1:28 PM, Jon Bernard <jbernard at tuxion.com> wrote:
> 
> > * Vishvananda Ishaya <vishvananda at gmail.com> wrote:
> >> 
> >> On Jan 14, 2014, at 2:10 PM, Jon Bernard <jbernard at tuxion.com> wrote:
> >> 
> >>> 
> >>> <snip>
> >>>> As you’ve defined the feature so far, it seems like most of it could
> >>>> be implemented client side:
> >>>> 
> >>>> * pause the instance
> >>>> * snapshot the instance
> >>>> * snapshot any attached volumes
> >>> 
> >>> For the first milestone to offer crash-consistent snapshots you are
> >>> correct.  We'll need some additional support from libvirt, but the
> >>> patchset should be straightforward.  The biggest question I have
> >>> surrounding initial work is whether to use an existing API call or
> >>> create a new one.
> >>> 
> >> 
> >> I think you might have missed the “client side” part of this point. I agree
> >> that snapshotting multiple volumes and packaging it up is valuable, but I was
> >> trying to make the point that you could do all of this stuff client side
> >> if you just add support for snapshotting ephemeral drives. An all-in-one
> >> snapshot command could be valuable, but you are talking about orchestrating
> >> a lot of commands between nova, glance, and cinder and it could get kind
> >> of messy to try to run the whole thing from nova.
> > 
> > If you expose each primitive required, then yes, the client could
> > implement the logic to call each primitive in the correct order, handle
> > error conditions, and exit while leaving everything in the correct
> > state.  But that would mean you would have to implement it twice - once
> > in python-novaclient and once in Horizon.  I would speculate that doing
> > this on the client would be even messier.
> > 
> > If you are concerned about the complexity of the required interactions,
> > we could narrow the focus in this way:
> > 
> >  Let's say that taking a full snapshot/backup (all volumes) operates
> >  only on persistent storage volumes.  Users who booted from an
> >  ephemeral glance image shouldn't expect this feature because, by
> >  definition, the boot volume is not expected to live a long life.
> > 
> > This should limit the communication to Nova and Cinder, while leaving
> > Glance out (initially).  If the user booted an instance from a cinder
> > volume, then we have all the volumes necessary to create an OVA and
> > import to Glance as a final step.  If the boot volume is an image then
> > I'm not sure, we could go in a few directions:
> > 
> >  1. No OVA is imported due to lack of boot volume
> >  2. A copy of the original image is included as a boot volume to create
> >     an OVA.
> >  3. Something else I am failing to see.
> 
> > 
> > If [2] seems plausible, then it probably makes sense to just ask glance
> > for an image snapshot from nova while the guest is in a paused state.
> > 
> > Thoughts?
> 
> This already exists. If you run a snapshot command on a volume backed instance
> it snapshots all attached volumes. Additionally it does throw a bootable image
> into glance referring to all of the snapshots.  You could modify create image
> to do this for regular instances as well, specifying block device mapping but
> keeping the vda as an image. It could even do the same thing with the ephemeral
> disk without a ton of work. Keeping this all as one command makes a lot of sense
> except that it is unexpected.
> 
> There is a benefit to only snapshotting the root drive sometimes because it
> keeps the image small. Here’s what I see as the ideal end state:
> 
> Two commands (names are a strawman):
>   create-full-image — image all drives
>   create-root-image — image just the root drive
> 
> These should work the same regardless of whether the root drive is volume backed
> instead of the craziness we have today of volume-backed snapshotting all drives
> and instance backed just the root.  I’m not sure how we manage expectations based
> on the current implementation but perhaps the best idea is just adding this in
> v3 with new names?
> 
> FYI the whole OVA thing seems moot since we already have a way of representing
> multiple drives in glance via block_device_mapping properties.

I've had some time to look closer at nova and rethink things a bit and
I see what you're saying.  You are correct, taking snapshots of attached
volumes is currently supported - although not in the way that I would
like to see.  And this is where I think we can improve.

Let me first summarize my understanding of what we currently have.
There are three ways of creating a snapshot-like thing in Nova:

  1. create_image - takes a snapshot of the root volume and may take
     snapshots of the attached volumes depending on the volume type of
     the root volume.  I/O is not quiesced.

  2. create_backup - takes a snapshot of the root volume with options
     to specify how often to repeat and how many previous snapshots to
     keep around. I/O is not quiesced.

  3. os-assisted-snapshot - takes a snapshot of a single cinder volume.
     The volume is first quiesced before the snapshot is initiated.

My general thesis is that I/O should be quiesced in all cases if the
underlying driver supports it.  Libvirt supports this feature and
I would like to extend the existing functionality to take advantage of
it.

It's not reasonable to change the names or behaviour of the existing
public api calls.  Instead I would like to create a new snapshot() call
in the v3 API.

We only need a quiesce() call added to the driver and the rest of the
implementation will live in the api layer.  Once implemented, the
existing snapshot calls (image, backup, os-assisted) could use the
underlying snapshot routines to achieve their expected results, leaving
us with only one set of snapshot-related functions to maintain.

The new snapshot call would take at least one option: the drives that
should be snapshotted:

    snapshot(devices=['vda', 'vdb'])

Where a value of None implies all volumes.

This allows the user to snapshot only the root volume if a small
bootable image is desired.
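
To make the None-means-everything behaviour concrete, here is a minimal
sketch of how the device list might be resolved.  The helper name
select_devices and its arguments are hypothetical and not part of any
existing Nova code; rejecting unknown device names is also an
assumption on my part.

```python
def select_devices(attached, devices=None):
    """Return the device names to snapshot (hypothetical helper).

    `attached` lists the device names present on the instance.
    `devices` is the caller's request; None means every attached device.
    """
    if devices is None:
        return list(attached)
    # Assumption: asking for a device that is not attached is an error.
    unknown = [d for d in devices if d not in attached]
    if unknown:
        raise ValueError("not attached: %s" % ", ".join(unknown))
    return list(devices)
```

With this, select_devices(['vda', 'vdb']) returns both drives, while
select_devices(['vda', 'vdb'], ['vda']) returns only the root drive.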

There will be no exclusion based on volume type; both glance and cinder
volumes will be snapshotted.  Otherwise we reach the unexpected
behaviour that you mentioned earlier, which I agree is confusing.

The flow will look like:

  * call the compute node to quiesce
  * call the compute node to snapshot each individual glance drive
  * call the volume driver to snapshot each cinder volume
  * package the whole thing
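
The first three steps above could be sketched roughly as follows.  This
is only an illustration: FakeBackend stands in for the compute and
volume drivers, and every method name on it is hypothetical.  The
unquiesce step is my assumption that I/O has to be resumed once the
snapshots are taken, even if one of them fails.

```python
from contextlib import contextmanager


class FakeBackend:
    """Stand-in that records calls instead of talking to real drivers."""

    def __init__(self):
        self.calls = []

    def quiesce(self):
        self.calls.append("quiesce")

    def unquiesce(self):
        self.calls.append("unquiesce")

    def snapshot_local_disk(self, device):
        self.calls.append("local:" + device)
        return "image-snap-" + device

    def snapshot_volume(self, device):
        self.calls.append("volume:" + device)
        return "volume-snap-" + device


@contextmanager
def quiesced(backend):
    """Quiesce I/O for the duration of the block, then always resume."""
    backend.quiesce()
    try:
        yield
    finally:
        backend.unquiesce()


def snapshot_instance(backend, disks):
    """Snapshot every disk while I/O is quiesced.

    `disks` maps a device name to "local" (glance-backed) or "volume"
    (cinder-backed).  Returns a mapping of device name to snapshot id.
    """
    results = {}
    with quiesced(backend):
        # Local disks first, then cinder volumes, matching the flow above.
        for device, kind in disks.items():
            if kind == "local":
                results[device] = backend.snapshot_local_disk(device)
        for device, kind in disks.items():
            if kind == "volume":
                results[device] = backend.snapshot_volume(device)
    return results
```

The context manager is there so that a failed snapshot cannot leave the
guest frozen; packaging the results into a glance image would follow as
a separate step.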

The final result is an image in glance that references each attached
volume via its block device mapping.  For a cinder-backed instance, the
glance image would contain no data and only references to cinder
snapshots.  As far as I can tell, glance already supports these
requirements.
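
As an illustration of the packaging step, the image properties for a
cinder-backed instance might be built like this.  The function name is
hypothetical, and the per-entry keys (device_name, snapshot_id) are
illustrative; I have not checked them against what glance and nova
actually expect.

```python
import json


def build_image_properties(snapshots):
    """Build image properties that reference cinder snapshots.

    `snapshots` maps a device name to a cinder snapshot id.  The image
    itself would carry no data, only these references.
    """
    mapping = [
        {"device_name": name, "snapshot_id": snap_id}
        for name, snap_id in sorted(snapshots.items())
    ]
    # Assumption: the mapping is stored as a JSON string property.
    return {"block_device_mapping": json.dumps(mapping)}
```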

If create_image and create_backup are updated to use this
implementation, then the behaviour will appear unchanged to the user
with the exception that I/O was quiesced during the snapshot(s) and they
therefore have a more reliable and useful result.

Given this, I think it makes more sense to leave the implementation
within the api layer of Nova so that existing functions can share in the
implementation - as opposed to moving it into the client.

What are your thoughts?  Is this approaching something sensible?

-- 
Jon


