[openstack-dev] libvirt race in openstack ci gate
Sean Dague
sean at dague.net
Mon May 20 18:54:46 UTC 2013
This came up in -qa when some folks were trying to debug a Quantum patch
that was failing for seemingly unrelated reasons:
http://logs.openstack.org/29184/8/gate/gate-tempest-devstack-vm-quantum/23537/
It looks like nova-compute hits a race when spinning up guests, around a
libvirt failure (or race):
http://logs.openstack.org/29184/8/gate/gate-tempest-devstack-vm-quantum/23537/logs/screen-n-cpu.txt.gz
The critical part is:
2013-05-20 15:47:13.461 DEBUG nova.openstack.common.rpc.amqp
[req-ddd6a6f2-7a52-49d3-8545-9e035aeb0134 demo demo] UNIQUE_ID is
3e22c5ab4a264091b3a53572a4e5c518. _add_unique_id
/opt/stack/new/nova/nova/openstack/common/rpc/amqp.py:337
2013-05-20 15:47:13.480 DEBUG nova.openstack.common.lockutils
[req-ddd6a6f2-7a52-49d3-8545-9e035aeb0134 demo demo] Got semaphore
"3e9b1297-caf1-4daf-8127-919b8ba68fc4" for method "do_run_instance"...
inner /opt/stack/new/nova/nova/openstack/common/lockutils.py:190

libvir: QEMU error : Domain not found: no domain with matching name
'instance-0000000b'

2013-05-20 15:47:13.485 AUDIT nova.compute.manager
[req-ddd6a6f2-7a52-49d3-8545-9e035aeb0134 demo demo] [instance:
3e9b1297-caf1-4daf-8127-919b8ba68fc4] Starting instance...
2013-05-20 15:47:13.485 DEBUG nova.openstack.common.rpc.amqp
[req-ddd6a6f2-7a52-49d3-8545-9e035aeb0134 demo demo] Making synchronous
call on conductor ... multicall
/opt/stack/new/nova/nova/openstack/common/rpc/amqp.py:586
(The libvir: line is the one I set off with line breaks to highlight it.)
After that happens the guest is left in a BUILD state and we never check
back in with libvirt. That causes a timeout while we wait for the guest
to go to ACTIVE, which in turn causes the Tempest failure.
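
To make the failure mode concrete, here is a minimal sketch of the kind of
polling loop that times out when a guest never leaves BUILD. It is not the
actual Tempest code: wait_for_status, get_status and BuildTimeout are
hypothetical names used only for illustration, and the status strings are
the ones mentioned above.

```python
import time


class BuildTimeout(Exception):
    """Raised when the guest does not reach the wanted state in time."""


def wait_for_status(get_status, wanted="ACTIVE", timeout=10.0, interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns `wanted` or `timeout` seconds pass.

    get_status is a hypothetical callable returning the guest state as a
    string (e.g. "BUILD" or "ACTIVE"). clock and sleep are injectable so
    the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == wanted:
            return status
        if clock() >= deadline:
            # Nothing ever moved the guest out of its current state, so the
            # caller only sees a timeout, not the underlying libvirt error.
            raise BuildTimeout(
                "guest stuck in %s after %.0fs" % (status, timeout))
        sleep(interval)
```

If nothing on the compute side re-checks libvirt after the "Domain not
found" error, get_status keeps returning BUILD and the only visible symptom
is the BuildTimeout raised by the waiter.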
I remember seeing these issues previously, but not for a while. Any
libvirt experts willing to weigh in on this?
-Sean
--
Sean Dague
http://dague.net