[OpenStack-Infra] Rechecking OpenStack CI

Joshua Hesketh joshua.hesketh at rackspace.com
Sat Aug 16 01:49:46 UTC 2014


I completely agree that the expectation should be that all CI systems 
pass consistently on a good patch. The problem is that this isn't the 
case today. For people fighting the dozens of CI systems on nova or 
neutron this causes a real problem: issuing rechecks will eventually get 
the failing CI into a good state, but doing so can put the 1st party or 
some other CI out of a good state.

While we are battling these race conditions and simply unstable systems, 
I feel it would be good to be able to kick specific ones to help things 
move along faster. The reduced load on all CIs is an added benefit.
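
To make the idea concrete, here is a minimal sketch in Python of how a CI 
could tell "rerun everything" apart from "rerun just this one". The 
"rerun-ci <name>" syntax, the function name and the CI names are all made 
up for illustration; none of this is an agreed convention or a description 
of how any existing CI parses comments.

```python
import re

# Any comment line starting with "recheck" keeps its current meaning:
# every CI system reruns, whatever text follows the keyword.
RECHECK_ALL_RE = re.compile(r"^\s*recheck\b", re.IGNORECASE | re.MULTILINE)

# Hypothetical separate syntax that does not start with "recheck",
# so it cannot be confused with the existing ad-hoc variants.
RERUN_ONE_RE = re.compile(
    r"^\s*rerun-ci\s+(?P<name>[\w.-]+)\s*$", re.IGNORECASE | re.MULTILINE
)


def systems_to_rerun(comment, known_systems):
    """Return the set of CI system names a review comment asks to rerun."""
    known = {name.lower() for name in known_systems}
    if RECHECK_ALL_RE.search(comment):
        return known
    # Collect every explicitly named system and ignore unknown names.
    wanted = {m.group("name").lower() for m in RERUN_ONE_RE.finditer(comment)}
    return wanted & known


# Hypothetical CI names, for illustration only.
KNOWN = ["jenkins", "vendor-a-ci", "vendor-b-ci"]
```

With this split a plain "recheck" still reruns every system, while 
"rerun-ci vendor-a-ci" would only kick the one that failed.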

Cheers,
Josh

Rackspace Australia

On 8/12/14 11:17 PM, Jeremy Stanley wrote:
> On 2014-08-12 17:43:37 +1000 (+1000), Joshua Hesketh wrote:
>> Right, and even if we did change all the 3rd parties over to
>> something that doesn't start with 'recheck' there will be
>> inconsistencies between 3rd parties and 1st party.
> [...]
>
> If this is really something the various CI operators want to support
> (rather than merely something which has been cargo-culted as a
> behavior they're expected to emulate), I think we need a separate
> syntax for it which lacks the encumbrance of the currently ambiguous
> "recheck" ad-hoc language which has grown arbitrarily by convention.
>
> However, I also don't personally think that rechecking one specific
> CI makes sense, and would rather just have them all run every time
> you say "recheck" since otherwise you end up pin-and-tumbler
> lockpicking racy bugs (selectively rerererecheck each CI until you
> get a good run and then hold that result while you work through the
> rest of them in turn).
>
> I'm less worried about the small amount of resource waste in the
> upstream OpenStack project infrastructure from people getting jobs
> rerun when they recheck for some failed third-party result. The
> expectation we ultimately need to set is that every CI should pass
> pretty much all of the time on a good change and if it doesn't then
> it must be fixed (upstream CI included). Allowing you to selectively
> rerun jobs from one system reinforces the idea that it's okay to
> fail frequently as long as devs are eventually able to coax out a
> passing run.
