[openstack-dev] [nova] fair standards for all hypervisor drivers

Clark Boylan clark.boylan at gmail.com
Wed Jul 16 15:12:47 UTC 2014


On Wed, Jul 16, 2014 at 7:50 AM, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Wed, Jul 16, 2014 at 04:15:40PM +0200, Sean Dague wrote:
>> Recently the main gate updated from Ubuntu 12.04 to 14.04, and in doing
>> so we started executing the livesnapshot code in the nova libvirt
>> driver. Which fails about 20% of the time in the gate, as we're bringing
>> computes up and down while doing a snapshot. Dan Berrange did a bunch of
>> debugging on that and thinks it might be a qemu bug. We disabled these code
>> paths, so live snapshot has now been ripped out.
>>
>> In January we also triggered a libvirt bug, and had to carry a private
>> build of libvirt for 6 weeks in order to let people merge code in OpenStack.
>>
>> We never were able to switch to libvirt 1.1.1 in the gate using the
>> Ubuntu Cloud Archive during Icehouse development, because it has a
>> different set of failures that would have prevented people from merging
>> code.
>>
>> Based on these experiences, libvirt version differences seem to be as
>> substantial as major hypervisor differences.
>
> I think that is a pretty dubious conclusion to draw from just a
> couple of bugs. The reason they really caused pain is that
> the CI test system was based on an old version for too long. If it
> were tracking the current upstream version of libvirt/KVM we'd have
> seen the problem much sooner & been able to resolve it during
> review of the change introducing the feature, as we do with any
> other bugs we encounter in software such as the breakage we see
> with my stuff off pypi.
>
How do you suggest we do this effectively with libvirt? In the past we
have tried to use newer versions of libvirt and they completely broke.
And the time to fix that was non-trivial. For most of our pypi
stuff we attempt to fix upstream and if that does not happen quickly
we pin (arguably we don't do this well either, see the sqlalchemy<=0.7
issues of the past).

I am worried that we would just regress to the current process, because
we have tried something similar to this before and were forced back to
it.
>
>> There is a proposal here -
>> https://review.openstack.org/#/c/103923/ to hold newer versions of
>> libvirt to the same standard we hold xen, vmware, hyperv, docker,
>> ironic, etc.
>
> That is a rather misleading statement you're making there. Libvirt is
> in fact held to *higher* standards than xen/vmware/hyperv because it
> is actually gating all commits. The 3rd party CI systems can be
> broken for days, weeks and we still happily accept code for those
> virt. drivers.
>
> AFAIK there has never been any statement that every feature added
> to xen/vmware/hyperv must be tested by the 3rd party CI system.
> All of the CI systems, for whatever driver, are currently testing
> some arbitrary subset of the overall features of that driver, and
> by no means every new feature being approved in review has coverage.
>
>> I'm somewhat concerned that the -2 pile on in this review is a double
>> standard of libvirt features, and features exploiting really new
>> upstream features. I feel like a lot of the language being used here
>> about the burden of doing this testing is exactly the same as was
>> presented by the docker team before their driver was removed, which was
>> ignored by the Nova team at the time. It was the concern by the freebsd
>> team, which was also ignored and they were told to go land libvirt
>> patches instead.
>
> As above the only double standard is that libvirt tests are all gating
> and 3rd party tests are non-gating.
>
>> If we want to reduce the standards for libvirt we should reconsider
>> what's being asked of 3rd party CI teams, and things like the docker
>> driver, as well as the A, B, C driver classification. Because clearly
>> libvirt 1.2.5+ isn't actually class A supported.
>
> AFAIK the requirement for 3rd party CI is merely that it has to exist,
> running some arbitrary version of the hypervisor in question. We've
> not said that 3rd party CI has to be covering every version or every
> feature, as is trying to be pushed on libvirt here.
>
> The "Class A", "Class B", "Class C" classifications were always only
> ever going to be a crude approximation. Unless you define them to be
> wrt the explicit version of every single deb/pypi package installed
> in the gate system (which I don't believe anyone has ever suggested)
> there is always risk that a different version of some package has a
> bug that Nova tickles.
>
> IMHO the classification we do for drivers provides an indication as
> to the quality of the *Nova* code. IOW class A indicates that we've
> thoroughly tested the Nova code and believe it to be free of bugs for
> the features we've tested. If there is a bug in a 3rd party package
> that doesn't imply that the Nova code is any less well tested or
> more buggy. Replace libvirt with mysql in your example above. A new
> version of mysql with a bug does not imply that Nova is suddenly not
> "class A" tested.
>
> IMHO it is up to the downstream vendors to run testing to ensure that
> what they give to their customers still achieves the quality level
> indicated by the tests upstream has performed on the Nova code.
>
>> Anyway, discussion welcomed. My primary concern right now isn't actually
>> where we set the bar, but that we set the same bar for everyone.
>
> As above, aside from the question of gating vs non-gating, the bar is
> already set at the same level for everyone. There has to be a CI system
> somewhere testing some arbitrary version of the software. Everyone meets
> that requirement.
>
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Clark