[openstack-dev] [nova] The same SRIOV / NFV CI failures missed a regression, why?

Moshe Levi moshele at mellanox.com
Tue Mar 29 07:40:43 UTC 2016



> -----Original Message-----
> From: Jay Pipes [mailto:jaypipes at gmail.com]
> Sent: Friday, March 25, 2016 10:20 PM
> To: openstack-dev at lists.openstack.org
> Subject: Re: [openstack-dev] [nova] The same SRIOV / NFV CI failures missed a
> regression, why?
> 
> On 03/24/2016 09:35 AM, Matt Riedemann wrote:
> > We have another mitaka-rc-potential bug [1] due to a regression when
> > detaching SR-IOV interfaces in the libvirt driver.
> >
> > There were two NFV CIs that ran on the original change [2].
> >
> > Both failed with the same devstack setup error [3][4].
> >
> > So it sucks that we have a regression, it sucks that no one watched
> > for those CI results before approving the change, and it really sucks
> > in this case since it was specifically reported from mellanox for
> > sriov which failed in [4]. But it happens.
> >
> > What I'd like to know is, have the CI problems been fixed? There is a
> > change up to fix the regression [5] and this time the Mellanox CI
> > check is passing [6]. The Intel NFV CI hasn't reported, but with the
> > mellanox one also testing the suspend scenario, it's probably good enough.
> 
> From the commit message of the original patch that introduced the
> regression:
> 
> "This fix was tested on a real environment containing the above type of VMs.
> test_driver.test_detach_sriov_ports was slightly modified so that the VIF from
> which data is sent to _detach_pci_devices will contain the correct SRIOV values
> (pci_slot, vlan and hw_veb VIF type)"
> 
> I'm not sure if the above statement could ever have been true considering the
> AttributeError that occurred in the bug...
> 
> In any case, I think that it's pretty clear that the CI systems for NFV and PCI
> have been less than reliable at functionally testing the PCI and NFV-specific
> functionality in Nova.
> 
> This isn't trying to put down the people that work on those systems -- I know
> first hand that it can be difficult to build and maintain CI systems that report in
> to upstream, and I appreciate the effort that goes into this.
> 
> But, going forward, I think we need to do something as a concerned
> community.
> 
> How about this for a proposal?
> 
> 1) We establish a joint lab environment that contains heterogeneous hardware
> to which all interested hardware vendors must provide hardware.
> 
> 2) The OpenStack Foundation and the hardware vendors each foot some
> portion of the bill to hire 2 or more systems administrators to maintain this lab
> environment.
> 
> 3) The upstream Infrastructure team works with the hired system
> administrators to create a single CI system that can spawn functional test jobs
> on the lab hardware and report results back to upstream Gerrit.
> 
> Given the will to do this, I think the benefits of more trusted testing results for
> the PCI and SR-IOV/NFV areas would more than make up for the cost.
+1, I like this proposal. We can help by providing Mellanox hardware and sharing our CI knowledge.

> 
> Best,
> -jay
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


