[openstack-dev] Libvirt snapshot process optimization

Joshua Harlow harlowja at yahoo-inc.com
Fri Aug 24 00:12:50 UTC 2012


I'd almost like to see the VM be shut down before the snapshot, but that's
just me.

In fact, just looking at the libvirt docs: 'suspend does not save a
persistent image of the guest's memory. For this, save is used.' So a
suspend could leave guests in some weird state, which sort of sucks. A
shutdown would at least trigger an ACPI shutdown in the VM and would
hopefully leave it in an OK state (emphasis on hopefully). I just think
that reducing the amount of time is going to be hard unless there is
hypervisor<->VM communication (i.e. signaling all the apps in the VM to
stop) or libvirt (+others) persists the memory image.
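
To make the shutdown idea concrete, here is a rough sketch (not what nova
does today, just an illustration). It assumes `dom` is a libvirt-python
domain object, or anything else exposing `shutdown()` and `isActive()`; the
function name, timeout and poll interval are made up for the example.

```python
import time


def shutdown_and_wait(dom, timeout=120.0, poll_interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Request a guest shutdown and wait until the domain is inactive.

    Returns True if the guest stopped before the timeout, else False.
    `sleep` and `clock` are injectable only to make the sketch testable.
    """
    # Asks the guest OS to shut down (typically via ACPI); the guest is
    # free to ignore the request, hence the timeout below.
    dom.shutdown()
    deadline = clock() + timeout
    while clock() < deadline:
        if not dom.isActive():
            return True
        sleep(poll_interval)
    # One last check in case the guest stopped right at the deadline.
    return not dom.isActive()
```

If this returns False the caller still has to decide what to do (give up on
the snapshot, or fall back to something harsher), which is the 'emphasis on
hopefully' part.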

My guess is suspend is trying to do what it can, which won't be 100% right
without memory state saving or some other communication happening...
Perhaps a 'save' call (or a shutdown sequence) should be used instead. It
probably isn't any faster, but at least it would be 'correct' (shared
storage state not included). There is also the question of uploading
snapshots (but that's a different question).
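
For illustration only, the 'save' variant combined with the reordering asked
about below (extract after the guest is running again) might look roughly
like this. `dom` is assumed to be a libvirt-python domain; `take_snapshot`
and `extract_snapshot` are hypothetical callables standing in for the qcow2
snapshot and the qemu-img convert step. Whether it is safe to read the image
while the guest is writing to it again is exactly the open question here, and
this sketch does not settle it.

```python
def snapshot_with_save(dom, take_snapshot, extract_snapshot):
    """Save guest memory, snapshot the disk, restart, then extract.

    `take_snapshot` and `extract_snapshot` are placeholder callables
    supplied by the caller; they are not part of libvirt.
    """
    # Stops the guest and persists its memory image under libvirt's control.
    dom.managedSave(0)
    try:
        # Guest is stopped here, so the disk is not changing underneath us.
        take_snapshot()
    finally:
        # Starting the domain restores it from the managed save image,
        # even if the snapshot step failed.
        dom.create()
    # The slow copy runs only after the guest is back up.
    extract_snapshot()
```

The guest is then down only for the save, the snapshot and the restore,
rather than for the whole convert.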

On 8/23/12 2:00 PM, "Rafi Khardalian" <rafi at metacloud.com> wrote:

>Hi all,
>
>I'm looking at the libvirt snapshot code and was wondering about the order
>and purpose of several operations.  At a high level, it looks like the VM
>being snapshotted is first suspended (managedSave), actual qcow2 snapshot
>is taken, then extraction is done (qemu-img convert) before returning the
>instance to its prior state.
>
>My question is, with snapshots being atomic, why suspend the VM?  Assuming
>there's a reason for this, why not do the qemu-img convert call after the
>VM has been resumed?  I figure there are reasons for the current order of
>operation and wanted to understand them before making changes.  The goal
>here is to reduce the amount of time a VM is unavailable while a snapshot
>is being created, as the current approach is rather disruptive for
>anything with a large root disk.
>
>The preferred approach, in my mind, is to snapshot without any downtime
>whatsoever.  Granted, this relies on the guest being in a consistent
>state, which is already the case considering a libvirt suspend doesn't
>guarantee anything at the guest OS level.
>
>Any insight would be appreciated.
>
>Thanks,
>---
>Rafi Khardalian
>Vice President, Operations | Metacloud, Inc.
>Email: rafi at metacloud.com | Tel: 855-638-2256, Ext. 2662



