[openstack-dev] [Neutron] How to handle blocking bugs/changes in Neutron 3rd party CI

Dane Leblanc (leblancd) leblancd at cisco.com
Thu Aug 21 12:20:35 UTC 2014


That makes sense for setups that don’t use Zuul.

But for setups using Zuul/Jenkins, and for a vendor introducing a new plugin whose initial hardware-enabling commits haven’t been merged yet, I don’t see how we can meet the Neutron 3rd party testing requirements. The requirements and the tools just seem to be at odds in this situation.

From: Kevin Benton [mailto:blak111 at gmail.com]
Sent: Thursday, August 21, 2014 3:25 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] How to handle blocking bugs/changes in Neutron 3rd party CI

I'm not sure if this is possible with a Zuul setup, but once we identify a failure-causing commit, we change the reported job status to "(skipped)" for any patches that contain the commit but not the fix. It's a relatively straightforward way to communicate that the CI system is still operational but voting was intentionally bypassed. This logic is handled in the script that determines the test results and posts them to Gerrit.
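
A minimal sketch of the kind of decision described above, in Python. This is not the actual reporting script: the function name, the argument names, and the idea of passing in a set of commit SHAs are all assumptions made for illustration.

```python
def reported_status(patch_commits, breaking_commit, fix_commit, tests_passed):
    """Return the status string to post for one patch (hypothetical helper).

    patch_commits: set of commit SHAs contained in the patch under test.
    breaking_commit / fix_commit: SHAs identified by the CI operator.
    tests_passed: result of the test run, used only when a vote is meaningful.
    """
    has_break = breaking_commit in patch_commits
    has_fix = fix_commit in patch_commits
    if has_break and not has_fix:
        # The CI is operational, but a result here would only reflect the
        # known blocking bug, so voting is intentionally bypassed.
        return "(skipped)"
    return "SUCCESS" if tests_passed else "FAILURE"
```

A patch that contains both the breaking commit and the fix is voted on normally, as is a patch that predates the breaking commit.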

On Wed, Aug 20, 2014 at 3:27 PM, Dane Leblanc (leblancd) <leblancd at cisco.com> wrote:
Preface: I posed this problem on the #openstack-infra IRC, and they couldn't offer an easy or obvious solution, and suggested that I get some consensus from the Neutron community as to how we want to handle this situation. So I'd like to bounce this around, get some ideas, and maybe bring this up in the 3rd party CI IRC.

The challenge is this: Occasionally, a blocking bug is introduced which causes our 3rd party CI tests to consistently fail on every change set that we're testing against. We can develop a fix for the problem, but until that fix gets merged upstream, tests against all other change sets are seen to fail.

(Note that we have a similar situation whenever we introduce a completely new plugin with its associated 3rd party CI: until the plugin code, or an "enabling" subset of it, is merged upstream, typically all other commits would fail on that CI setup.)

In the past, we've tried dynamically patching the fix(es) on top of the fetched code being reviewed, but this isn't always reliable due to merge conflicts, and we've had to monkey patch DevStack to apply the fixes after cloning Neutron but before installing Neutron.

So we'd prefer to enter a "throttled" or "filtering" CI mode when we hit this situation, where we're (temporarily) only testing against commits related to our plugin/driver which contain (or have a dependency on) the fix for the blocking bug until the fix is merged.
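
The "contains or depends on the fix" check could look roughly like the following Python sketch. The function name and the `parents` mapping (change ID to the change IDs it depends on) are hypothetical; how that dependency information would actually be obtained from Gerrit or Zuul is not covered here.

```python
def depends_on_fix(change_id, fix_change_id, parents):
    """Return True if change_id is the fix or has it in its dependency chain.

    parents: dict mapping a change ID to a list of change IDs it depends on
    (a hypothetical, pre-fetched view of the dependency graph).
    """
    seen = set()
    stack = [change_id]
    while stack:
        current = stack.pop()
        if current == fix_change_id:
            return True
        if current in seen:
            continue  # guard against revisiting shared ancestors or cycles
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False
```

In throttled mode, only changes for which this returns True would be tested until the fix merges.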

In an ideal world, for the sake of transparency, we would love to have Jenkins/Zuul report back to Gerrit with a descriptive test result such as "N/A", "Not tested", or even "Aborted" for all other change sets, letting the committer know that, "Yeah, we see your review, but we're unable to test it at the moment." Zuul does have the ability to report an "Aborted" status to Gerrit, but this is sent, for example, when Zuul decides to abort change set 'N' for a review because change set 'N+1' has just been submitted, or when a Jenkins admin manually aborts a Jenkins job. Unfortunately, this type of status is not available programmatically within a Jenkins job script; the only outcomes are pass (zero RC) or fail (non-zero RC). (Note that we can't directly filter at the Zuul level in our topology, since we have one Zuul server servicing multiple 3rd party CI setups.)

As a second option, we'd like to not run any tests for the other changes, and report NOTHING to Gerrit, while continuing to run against changes related to our plugin (as required for the plugin changes to be approved). This was the favored approach discussed in the Neutron IRC on Monday. But herein lies the rub. By the time our Jenkins job script discovers that the change set being tested is not in the list of preferred/allowed change sets, the script has only two options: pass or fail. With the current Jenkins, there is no programmatic way for a Jenkins script to signal to Gearman/Zuul that the job should be aborted.
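
To make the dilemma concrete, here is a small Python sketch of the gate at the top of a job script. All names are hypothetical. The point is that `untested_result` has to be 0 or 1: the script has no third value it can return to mean "not tested".

```python
def job_exit_code(change_id, allowed_changes, run_tests, untested_result):
    """Return the exit code a job script would hand back (hypothetical helper).

    allowed_changes: set of change IDs we are willing to test right now.
    run_tests: callable returning True on success, False on failure.
    untested_result: the code to use for changes we refuse to test; the
    caller is forced to choose pass (0) or fail (non-zero).
    """
    if change_id not in allowed_changes:
        # The tests are never started, yet a pass/fail result is still produced.
        return untested_result
    return 0 if run_tests() else 1
```

Either choice for `untested_result` ends up being reported as a real test outcome, which is exactly what we are trying to avoid.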

There was supposedly a bug filed with Jenkins to allow it to interpret different exit codes from job scripts as different result values, but this hasn't made any progress.

There may be something that can be changed in Zuul to allow it to interpret result codes other than success/fail, or maybe to allow Zuul to do change ID filtering on a per-job basis for Jenkins, but this would require the infra team to make changes to Zuul.

The bottom line is that based on the current Zuul/Jenkins infrastructure, whenever our 3rd party CI is blocked by a bug, I'm struggling with the conflicting requirements:
* Continue testing against change sets for the blocking bug (or plugin related changes)
* Don't report anything to Gerrit for all other change sets, since these can't be meaningfully tested against the CI hardware

Let me know if I'm missing a solution to this. I appreciate any suggestions!

-Dane


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
Kevin Benton