[openstack-dev] [nova] The same SRIOV / NFV CI failures missed a regression, why?
Jay Pipes
jaypipes at gmail.com
Fri Mar 25 19:20:00 UTC 2016
On 03/24/2016 09:35 AM, Matt Riedemann wrote:
> We have another mitaka-rc-potential bug [1] due to a regression when
> detaching SR-IOV interfaces in the libvirt driver.
>
> There were two NFV CIs that ran on the original change [2].
>
> Both failed with the same devstack setup error [3][4].
>
> So it sucks that we have a regression, it sucks that no one watched for
> those CI results before approving the change, and it really sucks in
> this case since it was specifically reported from mellanox for sriov
> which failed in [4]. But it happens.
>
> What I'd like to know is, have the CI problems been fixed? There is a
> change up to fix the regression [5] and this time the Mellanox CI check
> is passing [6]. The Intel NFV CI hasn't reported, but with the mellanox
> one also testing the suspend scenario, it's probably good enough.

From the commit message of the original patch that introduced the
regression:

"This fix was tested on a real environment containing the above type of
VMs. test_driver.test_detach_sriov_ports was slightly modified so that
the VIF from which data is sent to _detach_pci_devices will contain the
correct SRIOV values (pci_slot, vlan and hw_veb VIF type)"

I'm not sure if the above statement could ever have been true
considering the AttributeError that occurred in the bug...

In any case, I think it's pretty clear that the CI systems for NFV
and PCI have been less than reliable at functionally testing the PCI-
and NFV-specific code paths in Nova.

This isn't meant to put down the people who work on those systems -- I
know firsthand that it can be difficult to build and maintain CI
systems that report in to upstream, and I appreciate the effort that
goes into this.

But, going forward, I think we need to do something as a concerned
community.

How about this for a proposal?

1) We establish a joint lab environment that contains heterogeneous
hardware to which all interested hardware vendors must provide hardware.

2) The OpenStack Foundation and the hardware vendors each foot some
portion of the bill to hire 2 or more systems administrators to maintain
this lab environment.

3) The upstream Infrastructure team works with the hired systems
administrators to create a single CI system that can spawn functional
test jobs on the lab hardware and report results back to upstream Gerrit.

Given the will to do this, I think the benefits of more trusted testing
results for the PCI and SR-IOV/NFV areas would more than make up for the
cost.

Best,
-jay