[openstack-dev] Proposal for instance-level snapshots in Nova
Jon Bernard
jbernard at tuxion.com
Tue Jan 14 22:10:40 UTC 2014
* Vishvananda Ishaya <vishvananda at gmail.com> wrote:
>
> On Jan 6, 2014, at 3:50 PM, Jon Bernard <jbernard at tuxion.com> wrote:
>
> > Hello all,
> >
> > I would like to propose instance-level snapshots as a feature for
> > inclusion in Nova. An initial draft of the more official proposal is
> > here [1], blueprint is here [2].
> >
> > In a nutshell, this feature will take the existing create-image
> > functionality a few steps further by providing the ability to take
> > a snapshot of a running instance that includes all of its attached
> > volumes. A coordinated snapshot of multiple volumes for backup
> > purposes. The snapshot operation should occur while the instance is in
> > a paused and quiesced state so that each volume snapshot is both
> > consistent within itself and with respect to its sibling snapshots.
> >
> > I still have some open questions on a few topics:
> >
> > * API changes, two different approaches come to mind:
> >
> > 1. Nova already has a command `createImage` for creating an image of an
> > existing instance. This command could be extended to take an
> > additional parameter `all-volumes` that signals the underlying code
> > to capture all attached volumes in addition to the root volume. The
> > semantic here is important, `createImage` is used to create
> > a template image stored in Glance for later reuse. If the primary
> > intent of this new feature is for backup only, then it may not be
> > wise to overlap the two operations in this way. On the other hand,
> > this approach would introduce the least amount of change to the
> > existing API, requiring only modification of an existing command
> > instead of the addition of an entirely new one.
> >
> > 2. If the feature's primary use is for backup purposes, then a new API
> > call may be a better approach, and leave `createImage` untouched.
> > This new call could be called `createBackup` and take as a parameter
> > the name of the instance. Although it introduces a new member to the
> > API reference, it would allow this feature to evolve without
> > introducing regressions in any existing calls. These two calls could
> > share code at some point in the future.
>
> You’ve mentioned “If the feature’s use case is backup” a couple of times
> without specifying the answer. I think this is important to the above
> question. Also relevant is how the snapshot is stored and potentially
> restored.

The question I'm attempting to raise relates to the intended use case.
I can see the feature being valuable in two ways:

1. A single entry point for creating a consistent snapshot of all
volumes attached to an instance. This would make it easy to create
a full backup of an instance without having to snapshot each volume
individually. An added benefit is that each volume would be
consistent both within itself and with respect to the other volumes
attached to the instance at the time of the snapshot. The main use
case would be backup. With this in mind, the createImage API call
might be inappropriate.

2. The same value as stated in 1, except that instead of leaving the
snapshot data stored in Cinder, the volumes are joined into
a single OVA file and then imported into Glance. The use case
here would be to create a bootable image of a machine with multiple
volumes.

I can see both use cases being valuable and I'm not positive that we
have to choose between the two. Thoughts and opinions here would be much
appreciated.

> As you’ve defined the feature so far, it seems like most of it could
> be implemented client side:
>
> * pause the instance
> * snapshot the instance
> * snapshot any attached volumes

For the first milestone, which would offer crash-consistent snapshots,
you are correct. We'll need some additional support from libvirt, but
the patchset should be straightforward. The biggest question I have
about the initial work is whether to use an existing API call or
create a new one.

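The client-side sequence quoted above (pause, snapshot the instance,
snapshot the attached volumes) can be sketched as follows. This is only
an illustration of the ordering: `FakeCloud` and its methods are
hypothetical in-memory stand-ins, not the Nova or Cinder client APIs.
The one design point it shows is that the instance is unpaused even if
a snapshot fails partway through.

```python
from contextlib import contextmanager


class FakeCloud:
    """Hypothetical in-memory stand-in for the compute and volume services."""

    def __init__(self, volumes):
        self.volumes = list(volumes)  # IDs of volumes attached to the instance
        self.paused = False
        self.log = []                 # records the order of operations

    def pause(self):
        self.paused = True
        self.log.append("pause")

    def unpause(self):
        self.paused = False
        self.log.append("unpause")

    def snapshot_instance(self):
        self.log.append("snapshot root")
        return "snap-root"

    def snapshot_volume(self, volume_id):
        self.log.append("snapshot " + volume_id)
        return "snap-" + volume_id


@contextmanager
def paused(cloud):
    """Pause the instance and always unpause it, even if a snapshot fails."""
    cloud.pause()
    try:
        yield
    finally:
        cloud.unpause()


def snapshot_instance_and_volumes(cloud):
    """Take the root snapshot and every volume snapshot while paused."""
    with paused(cloud):
        snaps = [cloud.snapshot_instance()]
        snaps += [cloud.snapshot_volume(v) for v in cloud.volumes]
    return snaps


cloud = FakeCloud(["vol-1", "vol-2"])
snapshots = snapshot_instance_and_volumes(cloud)
print(cloud.log)
```

Because nothing here flushes guest buffers, the result is at best
crash-consistent, which matches the scope of the first milestone.
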
> The only thing missing in this scenario is snapshotting any ephemeral
> drives. There are workarounds for this such as:
> * use flavor with no ephemeral storage
> * boot from volume

True, but I think this is a fairly limiting set of constraints.
I suspect users would like to have an instance-level backup option that
works for them regardless of the number of attached volumes or whether
the boot volume is managed by Cinder.

> It is also worth mentioning that snapshotting a boot from volume instance
> will actually do most of this for you (everything but pausing the instance)
> and additionally give you an image which when booted will lead to a clone
> of all of the snapshotted volumes.
>
> So unless there is some additional feature regarding storing or restoring
> the backup, I only see one potential area for improvement inside of nova:
> Modifying the snapshot command to allow for snapshotting of ephemeral
> drives.

I don't want the createImage call to be overloaded such that it does far
more than originally intended. It was my understanding that createImage
was intended to create a template from which you can spin up instances,
and which is referenced by Glance (regardless of where it's actually
stored). If the correct approach is to store the snapshots in Cinder
and not import them back into Glance, then I think createImage would be
the wrong home for this feature.

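To make the two API options from earlier in the thread concrete, here
is a rough sketch of what the request bodies might look like. The
`createImage` action body with `name` and `metadata` is the existing
shape. The `all_volumes` key and the field layout of the `createBackup`
body are assumptions made up for illustration; neither is defined
anywhere in this proposal.

```python
import json


def create_image_action(name, all_volumes=False):
    """Build a createImage server-action body.

    "all_volumes" is a HYPOTHETICAL flag (option 1 in the proposal);
    it is not part of the existing API.
    """
    body = {"createImage": {"name": name, "metadata": {}}}
    if all_volumes:
        body["createImage"]["all_volumes"] = True
    return body


def create_backup_action(name):
    """Build a body for the separate call suggested as option 2.

    The field layout is HYPOTHETICAL and exists only for illustration.
    """
    return {"createBackup": {"name": name}}


option_1 = create_image_action("web01-snap", all_volumes=True)
option_2 = create_backup_action("web01-backup")
print(json.dumps(option_1, sort_keys=True))
print(json.dumps(option_2, sort_keys=True))
```

With option 1 a caller that omits the flag sends exactly what it sends
today, which is the "least amount of change" argument. With option 2
the template-image call stays untouched and the backup semantics can
evolve on their own.
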
The major feature I'm driving at is guest-assisted snapshots, where
mounted filesystems are quiesced and applications are informed of the
snapshot that's about to take place so that they can reach a consistent
on-disk state. I would like to take an incremental approach, so the
first set of patches would certainly fall short of this goal in terms of
consistency, but I think the final result will be a very nice feature
for Nova.

> If this is an important feature, rather than an all in one command, I
> suggest an extension to createImage which would allow you to specify the
> drive you wish to snapshot. If you could specify drive: vdb in the snapshot
> command it would allow you to snapshot all the components individually.

Snapshotting each volume individually is only half of the added value.
The other half is that the instance is paused and all attached
volumes quiesced. A snapshot of each volume is taken during this state
so that the resulting snapshots reflect the exact state of all volumes
at the time of the snapshot. We can take this a small step further by
using a guest agent to achieve application-consistent snapshots, and
I think this would be a killer feature to have.

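A toy simulation may help show why pausing matters across volumes. The
`Guest` class below is purely hypothetical: it appends every record to
two volumes and stops issuing I/O while paused. Snapshotting the
volumes one at a time lets a write land between the two snapshots;
pausing first does not.

```python
class Guest:
    """Toy guest that appends each record to two volumes in turn."""

    def __init__(self):
        self.volumes = {"vda": [], "vdb": []}
        self.paused = False

    def write(self, record):
        if self.paused:
            return False              # a paused guest issues no I/O
        for name in ("vda", "vdb"):
            self.volumes[name].append(record)
        return True


def snapshot(guest, name):
    """Return a copy of the volume's current contents."""
    return list(guest.volumes[name])


def uncoordinated(guest):
    """Snapshot volumes one at a time while the guest keeps writing."""
    first = snapshot(guest, "vda")
    guest.write("late")               # lands between the two snapshots
    second = snapshot(guest, "vdb")
    return first, second


def coordinated(guest):
    """Pause first, so no write can land between the snapshots."""
    guest.paused = True
    try:
        first = snapshot(guest, "vda")
        guest.write("late")           # rejected: the guest is paused
        second = snapshot(guest, "vdb")
    finally:
        guest.paused = False
    return first, second


g1 = Guest()
g1.write("r1")
loose_a, loose_b = uncoordinated(g1)
print(loose_a == loose_b)   # False: vdb saw a write that vda did not

g2 = Guest()
g2.write("r1")
tight_a, tight_b = coordinated(g2)
print(tight_a == tight_b)   # True: both snapshots reflect the same instant
```

This only models the pause. Flushing in-guest caches and notifying
applications, the job of the guest agent, is not represented here.
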
--
Jon