[openstack-dev] [Nova] [Cinder] [Tempest] Regarding deleting snapshot when instance is OFF
jordan.pittier at scality.com
Wed Jun 17 15:32:17 UTC 2015
On Tue, Jun 16, 2015 at 3:33 PM, Jordan Pittier <jordan.pittier at scality.com> wrote:
> On Thu, Apr 9, 2015 at 6:10 PM, Eric Blake <eblake at redhat.com> wrote:
>> On 04/08/2015 11:22 PM, Deepak Shetty wrote:
>> > + [Cinder] and [Tempest] in the $subject since this affects them too
>> > On Thu, Apr 9, 2015 at 4:22 AM, Eric Blake <eblake at redhat.com> wrote:
>> >> On 04/08/2015 12:01 PM, Deepak Shetty wrote:
>> >>> Questions:
>> >>> 1) Is this a valid scenario being tested? Some say yes, I am not sure,
>> >>> since the test makes sure that the instance is OFF before the snapshot
>> >>> is deleted, and this doesn't work for fs-backed drivers as they use
>> >>> hypervisor-assisted snapshots, which need the domain to be active.
>> >> Logically, it should be possible to delete snapshots when a domain is
>> >> off (qemu-img can do it, but libvirt has not yet been taught how to
>> >> manage it, in part because qemu-img is not as friendly as qemu in
>> >> having a re-connectible Unix socket monitor for tracking long-running
>> > Is there a bug/feature already opened for this ?
>> Libvirt has this bug: https://bugzilla.redhat.com/show_bug.cgi?id=987719
>> which tracks generic ability of libvirt to delete snapshots; ideally,
>> the code to manage snapshots will work for both online and persistent
>> offline guests, but it may result in splitting the work into multiple
> I can't access this bug report; it seems to be "private" and requires
> authentication.
>> > I didn't understand much of what you mean by re-connectible Unix
>> > socket :)... are you hinting that qemu-img doesn't have the ability to
>> > attach to a qemu / VM process for a long time over a Unix socket?
>> For online guest control, libvirt normally creates a Unix socket, then
>> starts qemu with its -qmp monitor pointing to that socket. That way, if
>> libvirtd goes away and then restarts, it can reconnect as a client to
>> the existing socket file, and qemu never has to know that the person on
>> the other end changed. With that QMP monitor, libvirt can query qemu's
>> current state at will, get event notifications when long-running jobs
>> have finished, and issue commands to terminate long-running jobs early,
>> even if it is a different libvirtd issuing a later command than the one
>> that started the command.
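The reconnectable monitor described above can be illustrated with a small client. This is only a minimal sketch of the QMP handshake (greeting, `qmp_capabilities`, then a `query-status` command), not how libvirt implements it; the helper names `qmp_connect`, `qmp_command` and `query_status`, and whatever socket path is passed in, are assumptions made for the example.

```python
import json
import socket


def qmp_command(monitor, name, arguments=None):
    """Send one QMP command and return its reply, skipping async events."""
    request = {"execute": name}
    if arguments:
        request["arguments"] = arguments
    monitor.write(json.dumps(request) + "\n")
    monitor.flush()
    while True:
        line = monitor.readline()
        if not line:
            raise ConnectionError("QMP peer closed the connection")
        message = json.loads(line)
        # Events carry an "event" key; replies carry "return" or "error".
        if "return" in message or "error" in message:
            return message


def qmp_connect(sock):
    """Read the QMP greeting and leave capabilities negotiation mode."""
    monitor = sock.makefile("rw", encoding="utf-8")
    greeting = json.loads(monitor.readline())
    if "QMP" not in greeting:
        raise ValueError("unexpected greeting: %r" % (greeting,))
    qmp_command(monitor, "qmp_capabilities")
    return monitor


def query_status(path):
    """Connect to the Unix socket at `path` (hypothetical) and ask for status."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        return qmp_command(qmp_connect(sock), "query-status")
```

Because the state lives in qemu and the socket file stays in place, a freshly started management process could call `query_status()` again on the same path once the previous client has disconnected, which is the property the message relies on.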
>> qemu-img, on the other hand, only has the -p option or SIGUSR1 signal
>> for outputting progress to stderr on a long-running operation (not the
>> most machine-parseable), but is not otherwise controllable. It does not
>> have a management connection through a Unix socket. I guess in thinking
>> about it a bit more, a Unix socket is not essential; as long as the old
>> libvirtd starts qemu-img in a manner that tracks its pid and collects
>> stderr reliably, then restarting libvirtd can send SIGUSR1 to the pid
>> and track the changes to stderr to estimate how far along things are.
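The pid-plus-stderr tracking described above could look roughly like the sketch below. It assumes the progress lines have the form `(12.34/100%)` and that the captured output is available as a string; the function names `request_progress` and `latest_progress` are hypothetical, and `SIGUSR1` is only available on Unix-like systems.

```python
import os
import re
import signal

# Assumed shape of a progress line, e.g. "    (12.34/100%)".
_PROGRESS = re.compile(r"\(\s*(\d+(?:\.\d+)?)/100%\)")


def request_progress(pid):
    """Ask a running qemu-img process (by pid) to report its progress."""
    os.kill(pid, signal.SIGUSR1)


def latest_progress(captured_output):
    """Return the most recent percentage found in the output, or None."""
    matches = _PROGRESS.findall(captured_output)
    return float(matches[-1]) if matches else None
```

A restarted manager that still knows the pid and the file collecting the output could call `request_progress(pid)` and then `latest_progress()` on the collected text. This only yields an estimate; unlike QMP, there is no way to cancel or otherwise control the job through this channel.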
>> Also, the idea has been proposed that qemu-img is not necessary; libvirt
>> could use qemu -M none to create a dummy machine with no CPUs and JUST
>> disk images, and then use the qemu QMP monitor as usual to perform block
>> operations on those disks by reusing the code it already has working for
>> online guests. But even this approach needs coding into libvirt.
>> Eric Blake eblake redhat com +1-919-301-3266
>> Libvirt virtualization library http://libvirt.org
> I'd like to progress on this issue, so I will spend some time on it.
> Let's recap. The issue is "deleting a Cinder snapshot that was created
> during a Nova Instance snapshot (booted from a Cinder volume) doesn't work
> when the original Nova Instance is stopped". This bug only arises when a
> Cinder driver uses the feature called "QEMU Assisted
> Snapshots"/live-snapshot (currently only GlusterFS, but soon generic NFS
> when https://blueprints.launchpad.net/cinder/+spec/nfs-snapshots gets in).
> This issue is triggered by the Tempest scenario
> "test_volume_boot_pattern". This scenario:
> [does some stuff]
> 1) Creates a Cinder volume from a Cirros image
> 2) Boots a Nova instance from the volume
> 3) Makes a snapshot of this instance (which creates a Cinder snapshot
> because the instance was booted from a volume), using the QEMU Assisted
> Snapshots feature
> [does some other stuff]
> 4) Stops the instance created in step 2, then deletes the snapshot created
> in step 3.
> The deletion of the snapshot created in step 3 fails because Nova wants
> libvirt to do a blockRebase (see
> For reference, there's a bug targeting Cinder for this :
> What I'd like to do, but I am asking for your advice first, is:
> Just before calling virt_dom.blockRebase(), check whether the domain
> is running, and if not, call "qemu-img rebase -b $rebase_base $rebase_disk"
> (this idea was brought up by Eric Blake in the previous reply).
> Is it safe to do so?
> Is it the right approach? (Given that I don't really want to wait for
> libvirt to support blockRebase on offline domains.)
> Thanks a lot !
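For readers following along, the fallback proposed in the quoted message can be sketched as below. This is an illustration only, not the proposed patch and not Nova's actual code: the function name `rebase_snapshot` and the injectable `run` parameter are assumptions, `domain` is assumed to behave like a libvirt `virDomain` (with `isActive()` and `blockRebase()`), and `rebase_disk` is assumed to be an image path that both libvirt and qemu-img accept.

```python
import subprocess


def rebase_snapshot(domain, rebase_disk, rebase_base, flags=0,
                    run=subprocess.check_call):
    """Rebase through libvirt when the domain runs, else through qemu-img."""
    if domain.isActive():
        # Online path: libvirt asks the running qemu to perform the rebase.
        return domain.blockRebase(rebase_disk, rebase_base, 0, flags)
    # Offline path: there is no qemu process, so operate on the image file.
    return run(["qemu-img", "rebase", "-b", rebase_base, rebase_disk])
```

Passing the command runner as a parameter keeps the offline branch testable without a real qemu-img binary. Whether the offline call is safe in every case is exactly the question raised in the thread; the sketch does not answer it.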
I went ahead and proposed the following patch :