[openstack-dev] [nova] [libvirt] Debugging blockRebase() - "active block copy not ready for pivot"

Kashyap Chamarthy kchamart at redhat.com
Thu Oct 6 12:58:51 UTC 2016


On Thu, Oct 06, 2016 at 01:32:39AM +0200, Kashyap Chamarthy wrote:
> TL;DR
> -----
> 
> The debug analysis of the log below, and a discussion with Eric Blake
> of upstream QEMU / libvirt, resulted in the following bug report:
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1382165 --
>   virDomainGetBlockJobInfo: Adjust job reporting based on QEMU stats & the
>   "ready" field of `query-block-jobs`

When I raised this on the libvirt mailing list[0][1], one of the upstream
libvirt devs NACKed adjusting the job reporting, calling it "deliberately
reporting false data in block info structure".  Matt Booth shared a
similar concern on #openstack-nova IRC.

Next, it turns out that the readiness of the copy job is already exposed
via the guest XML[1]:

---------------------------------------------------------------------
We expose the state of the copy job in the XML and forward the READY
event from qemu to the users.

A running copy job exposes itself in the xml as:

    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/systemrescuecd-x86-4.8.0.iso'/>
      <backingStore/>
      <mirror type='file' file='/tmp/ble.img' format='raw' job='copy'>
        <format type='raw'/>
        <source file='/tmp/ble.img'/>
      </mirror>
      [...]
    </disk>

While the ready copy job is exposed as:

    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/systemrescuecd-x86-4.8.0.iso'/>
      <backingStore/>
      <mirror type='file' file='/tmp/ble.img' format='raw' job='copy' ready='yes'>
        <format type='raw'/>
        <source file='/tmp/ble.img'/>
      </mirror>
      [...]
    </disk>


Additionally we have asynchronous events that are emitted once qemu
notifies us that the block job has reached sync state or finished.
Libvirt uses the event to switch to the ready state.

The documentation suggests that block jobs should listen to the events
and act accordingly only after receiving the event.
---------------------------------------------------------------------
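
To make the XML route concrete, here is a minimal sketch of checking the
'ready' attribute on the <mirror> element.  The helper name
mirror_ready() is hypothetical (it is not existing Nova or libvirt code),
and it assumes the domain XML string was fetched separately, e.g. via
libvirt-python's dom.XMLDesc(0):

```python
import xml.etree.ElementTree as ET


def mirror_ready(domain_xml, target_file):
    """Return True if the copy job mirroring to target_file is ready.

    domain_xml is assumed to be the live guest XML as a string.
    target_file is the destination path of the block copy.
    """
    root = ET.fromstring(domain_xml)
    for mirror in root.iter('mirror'):
        # Prefer the <source file=...> child; fall back to the 'file'
        # attribute on <mirror> itself, as both appear in the XML above.
        source = mirror.find('source')
        path = source.get('file') if source is not None else mirror.get('file')
        if path == target_file:
            # A running job has no 'ready' attribute; a ready one has
            # ready='yes'.
            return mirror.get('ready') == 'yes'
    # No matching <mirror> element: no copy job to that target.
    return False
```

This is still polling, just against a field that reflects QEMU's "ready"
state rather than the cur/end counters, so it is only a stop-gap compared
to listening for the event.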

So, Nova's is_job_complete() method & friends need to be reworked to
listen for the block job events that signal readiness, instead of
polling the job info.
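
A rough sketch of what the event-based approach could look like, not
actual Nova code.  BlockJobWaiter and register_waiter() are hypothetical
names.  The status values are copied from libvirt's
virConnectDomainEventBlockJobStatus enum so that the sketch runs without
libvirt installed; real code should use libvirt.VIR_DOMAIN_BLOCK_JOB_*.
It also assumes a libvirt event loop is already running, otherwise no
callbacks are delivered:

```python
import threading

# Values of libvirt's virConnectDomainEventBlockJobStatus enum.
BLOCK_JOB_COMPLETED = 0
BLOCK_JOB_FAILED = 1
BLOCK_JOB_CANCELED = 2
BLOCK_JOB_READY = 3


class BlockJobWaiter(object):
    """Wait for a block job on one disk to report READY (hypothetical)."""

    def __init__(self, disk):
        self.disk = disk          # target name, e.g. 'vda'
        self.status = None
        self._done = threading.Event()

    def callback(self, conn, dom, disk, job_type, status, opaque):
        # Signature used by libvirt-python for block job events.
        if disk != self.disk:
            return
        self.status = status
        self._done.set()

    def wait_ready(self, timeout=None):
        """Return True only if the job reached READY within timeout."""
        if not self._done.wait(timeout):
            return False
        return self.status == BLOCK_JOB_READY


def register_waiter(conn, dom, waiter):
    """Register the waiter for BLOCK_JOB_2 events on one domain."""
    import libvirt  # imported lazily; needs libvirt-python
    return conn.domainEventRegisterAny(
        dom, libvirt.VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2,
        waiter.callback, None)
```

In _swap_volume() terms, the waiter would have to be registered before
dev.rebase() is called, so that a READY event arriving early is not
missed, and dev.abort_job(pivot=True) would run only after wait_ready()
returns True.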

[0] https://www.redhat.com/archives/libvir-list/2016-October/msg00217.html
[1] https://www.redhat.com/archives/libvir-list/2016-October/msg00229.html
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1382165#c3
 
> 
> Details
> -------
> 
> The code in Nova that's being executed is this part in _swap_volume()
> from libvirt/driver.py.
> 
>     [...]
>     # Start copy with VIR_DOMAIN_REBASE_REUSE_EXT flag to
>     # allow writing to existing external volume file
>     dev.rebase(new_path, copy=True, reuse_ext=True)
>     
>     while not dev.is_job_complete():
>         time.sleep(0.5)
>
>     dev.abort_job(pivot=True)
>     [...]
> 

[...]

-- 
/kashyap


