[openstack-dev] [cinder] 3rd Party CI failures ignored, caused driver to break

John Griffith john.griffith8 at gmail.com
Thu Feb 26 20:24:52 UTC 2015


On Thu, Feb 26, 2015 at 1:10 PM, Walter A. Boring IV <walter.boring at hp.com>
wrote:

> Hey folks,
>    Today we found out that a patch[1] that was made against our lefthand
> driver caused the driver to fail.   The 3rd party CI that we have setup
> to test our driver (hp_lefthand_rest_proxy.py) caught the CI failure and
> reported it correctly[2].  The patch that broke the driver was reviewed and
> approved without a mention of 3rd Party CI failures.
>
> This is a perfect example of 3rd Party CI working perfectly and catching a
> failure, and being completely ignored by everyone
> involved in the review process for the patch
>
> I know that 3rd party CI isn't perfect, and has been ripe with false
> failures, which is one of the reasons why they aren't voting today.
> But, that being said, if patch submitters aren't even looking at the
> failures for CI when they are touching drivers that they don't maintain,
> and reviewers
> aren't looking at the CI failures, then why are we even doing 3rd party CI?
>
> Our team is partially responsible for not seeing the failure as well.  We
> should be watching the CI failures closely, but we are doing the best we
> can.  There are enough patches for Cinder ongoing at any one time, that we
> simply can't watch every single one of them for failures. We did eventually
> see that every single patchset in gerrit was now failing against our
> driver, and this is how we caught it.  Yes, it was a bit after the fact,
> but we did notice
> it and now have a new patch up that fixes it.   So, in that regard 3rd
> party CI did eventually vet out a problem that our team caught.
>
> How can we prevent this in the future?
> 1) Make 3rd party CI voting.  I don't believe this is ready yet.
>

​Agreed, look at the history on that driver (and many others) and you'll
see we are in no way ready for that.​


2) Authors and reviews need to look at 3rd party CI failures when a patch
> touches a driver.  If a failure is seen, contact the CI maintainer and work
> with them and
> see if the failure is related to the patch, if it's not obvious. In this
> case, the failure was obvious.  The import changed, and now the package
> can't find the module.
>

​If things were more "stable" yeah, I might, but the reality is as I've
pointed out we have a serious signal to noise ratio problem IMO
​


> 3) CI maintainers watch every single patchset and report -1's on reviews?
> (ouch)
> 4) ?
>

​Option 4 in my opinion is exactly what I've been doing since this
started.  I receive a notification for any change that my CI setup fails
on, and then it's up to me to go and verify if something is truly broken or
if it's my system that's messed up.  It's not perfect and it's not really a
true CI but it is a continuous test system which for me is what's most
important here.  I completely understand that's not the case for other
things.

This is where the churn of typo fixes, hacking changes etc can bight us.
Sure, while reviewing that probably could've/should've been caught.  The
problem is for me at least if we're going to do this sorts of semantic
changes that touch so many files I'm likely to be half asleep before I get
to the "cinder/volume" section.  Doesn't make it right, but kinda how it is.

Anyway, yeah it sucks, but I'd argue this worked out GREAT.  A change was
made that broke things in LHN driver, but historically nobody would've
known until a customer tried to use it.  In this case, the problem was
found in less than a day and fixed.  That's a pretty huge win in my opinion!

Thanks,
John​


>
>
>
> Here is the patch that broke the lefthand driver[1]
> Here is the reported failure in the c-vol log for the patch by our 3rd
> party CI system[2]
> Here is my new patch that fixes the lefthand driver again.[3]
>
> [1] https://review.openstack.org/#/c/145780/
> [2] http://15.126.198.151/80/145780/15/check/lefthand-
> iscsi-driver-master-client-pip-dsvm/3927e3d/logs/screen-
> c-vol.txt.gz?level=ERROR
> [3] https://review.openstack.org/#/c/159586/
>
> $0.02
> Walt
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20150226/1cb2bd8a/attachment.html>


More information about the OpenStack-dev mailing list