[openstack-dev] Libvirt snapshot process optimization
Rafi Khardalian
rafi at metacloud.com
Thu Aug 23 21:00:47 UTC 2012
Hi all,
I'm looking at the libvirt snapshot code and was wondering about the order
and purpose of several operations. At a high level, it looks like the VM
being snapshotted is first suspended (managedSave), then the actual qcow2
snapshot is taken, and finally the extraction is done (qemu-img convert)
before returning the instance to its prior state.
My question is, with snapshots being atomic, why suspend the VM? Assuming
there's a reason for this, why not do the qemu-img convert call after the
VM has been resumed? I figure there are reasons for the current order of
operations and wanted to understand them before making changes. The goal
here is to reduce the amount of time a VM is unavailable while a snapshot
is being created, as the current approach is rather disruptive for
anything with a large root disk.
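To make the downtime argument concrete, here is a rough sketch of the two orderings. It is only a toy model, not the nova libvirt driver code: the step names, the downtime() helper, and the durations are all made up for illustration, and it assumes the convert step can safely run against the internal snapshot after the VM has resumed, which is exactly the open question.

```python
# Toy model only: step names and durations are assumptions for illustration,
# not measurements and not the actual nova libvirt driver code.

CURRENT_ORDER = ["suspend", "snapshot", "convert", "resume"]
PROPOSED_ORDER = ["suspend", "snapshot", "resume", "convert"]


def downtime(order, durations):
    """Sum the durations of every step from suspend through resume."""
    total = 0.0
    unavailable = False
    for step in order:
        if step == "suspend":
            unavailable = True
        if unavailable:
            total += durations[step]
        if step == "resume":
            unavailable = False
    return total


# Hypothetical durations in seconds for an instance with a large root disk.
durations = {"suspend": 5.0, "snapshot": 2.0, "convert": 300.0, "resume": 5.0}

current = downtime(CURRENT_ORDER, durations)
proposed = downtime(PROPOSED_ORDER, durations)
print(f"current order:  {current:.0f}s unavailable")
print(f"proposed order: {proposed:.0f}s unavailable")
```

With these invented numbers, the convert step accounts for nearly all of the time the VM is unavailable, so moving it after the resume removes most of the disruption.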
The preferred approach, in my mind, is to snapshot without any downtime
whatsoever. Granted, this relies on the guest being in a consistent
state, but that is already the case today, since a libvirt suspend doesn't
guarantee anything at the guest OS level.
Any insight would be appreciated.
Thanks,
---
Rafi Khardalian
Vice President, Operations | Metacloud, Inc.
Email: rafi at metacloud.com | Tel: 855-638-2256, Ext. 2662