[openstack-dev] Libvirt snapshot process optimization
    Rafi Khardalian 
    rafi at metacloud.com
       
    Thu Aug 23 21:00:47 UTC 2012
    
    
  
Hi all,
I'm looking at the libvirt snapshot code and was wondering about the order
and purpose of several operations.  At a high level, it looks like the VM
being snapshotted is first suspended (managedSave), the actual qcow2
snapshot is taken, and then the extraction is done (qemu-img convert)
before the instance is returned to its prior state.
My question is: with snapshots being atomic, why suspend the VM?  Assuming
there's a reason for this, why not do the qemu-img convert call after the
VM has been resumed?  I figure there are reasons for the current order of
operations and wanted to understand them before making changes.  The goal
here is to reduce the amount of time a VM is unavailable while a snapshot
is being created, as the current approach is rather disruptive for
anything with a large root disk.
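
To make the two orderings concrete, here is a rough sketch of the step
sequence as I read it, next to the reordering I'm asking about.  This is
not the actual driver code: the function name, the step strings and the
paths are made up for illustration, and the qemu-img invocations are my
understanding of what gets run.  Only the relative order matters here.

```python
from typing import List


def snapshot_steps(disk: str, snap: str, out: str,
                   resume_before_convert: bool = False) -> List[str]:
    """Return the ordered steps for one snapshot as readable strings.

    Hypothetical helper for discussion only; nothing is executed.
    """
    suspend = "libvirt: managedSave (suspend the VM)"
    resume = "libvirt: start the VM again (restore from managedSave)"
    # Create the internal qcow2 snapshot on the instance disk.
    create = f"qemu-img snapshot -c {snap} {disk}"
    # Extract the snapshot into a standalone image; this is the slow
    # part for a large root disk.
    extract = f"qemu-img convert -f qcow2 -O qcow2 -s {snap} {disk} {out}"
    # Remove the internal snapshot once it has been extracted.
    delete = f"qemu-img snapshot -d {snap} {disk}"

    if resume_before_convert:
        # Proposed order: the VM is only down while the snapshot is
        # created.  Assumption: the delete also runs after the resume.
        return [suspend, create, resume, extract, delete]
    # Current order as I understand it: the VM stays down for everything.
    return [suspend, create, extract, delete, resume]


if __name__ == "__main__":
    for flag in (False, True):
        print(f"resume_before_convert={flag}")
        for step in snapshot_steps("/path/to/disk", "snap1", "/tmp/out.qcow2",
                                   resume_before_convert=flag):
            print("  " + step)
```

Whether the second ordering is actually safe against a running guest is
exactly what I'm trying to find out.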
The preferred approach, in my mind, is to snapshot without any downtime
whatsoever.  Granted, this relies on the guest being in a consistent
state, but that is already the case today, since a libvirt suspend doesn't
guarantee anything at the guest OS level.
Any insight would be appreciated.
Thanks,
---
Rafi Khardalian
Vice President, Operations | Metacloud, Inc.
Email: rafi at metacloud.com | Tel: 855-638-2256, Ext. 2662
    
    