[openstack-dev] [nova] top gate bug is libvirt snapshot

Sean Dague sean at dague.net
Tue Jul 8 22:21:31 UTC 2014


On 07/08/2014 06:12 PM, Joe Gordon wrote:
> 
> 
> 
> On Tue, Jul 8, 2014 at 2:56 PM, Michael Still <mikal at stillhq.com
> <mailto:mikal at stillhq.com>> wrote:
> 
>     The associated bug says this is probably a qemu bug, so I think we
>     should rephrase that to "we need to start thinking about how to make
>     sure upstream changes don't break nova".
> 
> 
> Good point.
>  
> 
> Would running devstack-tempest on the latest upstream release of ? help.
> Not as a voting job but as a periodic (third party?) job, that we can
> hopefully identify these issues early on. I think the big question here
> is who would volunteer to help run a job like this.

The running of the job really isn't the issue.

It's the debugging of the jobs when they go wrong. Creating a new test
job and getting it lit is really < 10% of the work; sifting through the
failures and getting to the bottom of things is the hard and
time-consuming part.

The other option is to remove more concurrency from nova-compute. It's
pretty clear that this problem only seems to happen when the
snapshotting is going on at the same time guests are being created or
destroyed (possibly also a second snapshot going on).
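
To make "removing concurrency" concrete, here is a minimal sketch of
serializing guest lifecycle operations behind a single lock. It is not
nova's actual code: the names (_guest_lifecycle_lock, serialized,
run_all) are hypothetical, and it uses plain Python threads purely for
illustration.

```python
import threading
from contextlib import contextmanager

# Hypothetical host-wide lock; an assumption for illustration only.
_guest_lifecycle_lock = threading.Lock()


@contextmanager
def serialized(operation, log):
    """Allow only one guest lifecycle operation at a time."""
    with _guest_lifecycle_lock:
        log.append(("start", operation))
        try:
            yield
        finally:
            log.append(("end", operation))


def run_all(operations):
    """Run each operation in its own thread and return the event log."""
    log = []

    def worker(name):
        with serialized(name, log):
            pass  # placeholder for the snapshot / create / destroy work

    threads = [threading.Thread(target=worker, args=(op,))
               for op in operations]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

With the lock held, a snapshot can never overlap a create or destroy, at
the cost of throughput on a busy compute host.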

This is also why I find it unlikely to be a qemu bug, because qemu isn't
shared state between guests. If qemu just randomly wedged itself, that
would be much easier to detect outside of the gate. And there have been
attempts by danpb to sniff that out, and they haven't worked.

	-Sean

-- 
Sean Dague
http://dague.net
