[openstack-dev] [nova] fair standards for all hypervisor drivers

Sean Dague sean at dague.net
Wed Jul 16 14:15:40 UTC 2014

Recently the main gate updated from Ubuntu 12.04 to 14.04, and in doing
so we started executing the live snapshot code in the nova libvirt
driver, which fails about 20% of the time in the gate, as we're bringing
computes up and down while doing a snapshot. Dan Berrange did a bunch of
debugging on that and thinks it might be a qemu bug. We disabled these
code paths, so live snapshot has now been ripped out.

In January we also triggered a libvirt bug, and had to carry a private
build of libvirt for 6 weeks in order to let people merge code in OpenStack.

We never were able to switch to libvirt 1.1.1 in the gate using the
Ubuntu Cloud Archive during Icehouse development, because it has a
different set of failures that would have prevented people from merging
code.

Based on these experiences, libvirt version differences seem to be as
substantial as major hypervisor differences. There is a proposal here -
https://review.openstack.org/#/c/103923/ to hold newer versions of
libvirt to the same standard we hold xen, vmware, hyperv, docker,
ironic, etc.

I'm somewhat concerned that the -2 pile-on in this review is a double
standard for libvirt features, and for features exploiting really new
upstream features. I feel like a lot of the language being used here
about the burden of doing this testing is exactly the same as was
presented by the docker team before their driver was removed, which was
ignored by the Nova team at the time. It was also the concern raised by
the FreeBSD team, which was ignored as well, and they were told to go
land libvirt patches instead.

I'm ok with us as a project changing our mind and deciding that the test
bar needs to be taken down a notch or two because it's too burdensome to
contributors and vendors, but if we are doing that, we need to do it for
everyone. A lot of other organizations have put a ton of time and energy
into this, and are carrying a maintenance cost of running these systems
to get results back on a timely basis.

As we seem deadlocked in the review, I think the mailing list is
probably a better place for this.

If we want to reduce the standards for libvirt we should reconsider
what's being asked of 3rd party CI teams, and things like the docker
driver, as well as the A, B, C driver classification. Because clearly
libvirt 1.2.5+ isn't actually class A supported.

Anyway, discussion welcomed. My primary concern right now isn't actually
where we set the bar, but that we set the same bar for everyone.


Sean Dague

