[openstack-dev] [gate] Automatic elastic rechecks

Daniel P. Berrange berrange at redhat.com
Fri Jul 18 14:09:34 UTC 2014


On Fri, Jul 18, 2014 at 09:06:45AM -0500, Matt Riedemann wrote:
> 
> 
> On 7/17/2014 9:01 AM, Matthew Booth wrote:
> >Elastic recheck is a great tool. It leaves me messages like this:
> >
> >===
> >I noticed jenkins failed, I think you hit bug(s):
> >
> >check-devstack-dsvm-cells: https://bugs.launchpad.net/bugs/1334550
> >gate-tempest-dsvm-large-ops: https://bugs.launchpad.net/bugs/1334550
> >
> >We don't automatically recheck or reverify, so please consider doing
> >that manually if someone hasn't already. For a code review which is not
> >yet approved, you can recheck by leaving a code review comment with just
> >the text:
> >
> >     recheck bug 1334550
> >
> >For bug details see: http://status.openstack.org/elastic-recheck/
> >===
> >
> >In an ideal world, every person seeing this would diligently check that
> >the fingerprint match was accurate before submitting a recheck request.
> >
> >In the real world, how about we just do it automatically?
> >
> >Matt
> >
> 
> We don't want automatic rechecks because then we're just piling on to races,
> because you can have jenkins failures where we have a fingerprint for one
> job failure but there is some other job failing on your patch which is an
> unrecognized failure (no e-r fingerprint query yet).  If we never force
> people to investigate the failures and write fingerprints because we're just
> always automatically rechecking things for them, we'll drop our
> categorization rates and most likely eventually fall into a locked gate once
> we hit 2-3 really nasty races hitting at the same time.

If there were multiple failures and only some were identified, it would
be reasonable to *not* automatically recheck. 
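
A minimal sketch of that rule in Python. The function name and the input shape (a mapping from each failed job to the bug numbers whose fingerprints matched it) are assumptions made for illustration, not elastic-recheck's actual code or data model:

```python
from typing import Dict, List


def should_auto_recheck(failed_jobs: Dict[str, List[int]]) -> bool:
    """Return True only if every failed job matched at least one known bug.

    failed_jobs maps a failed job name to the bug numbers whose
    fingerprints matched it; an empty list means the failure was not
    recognized. This structure is hypothetical.
    """
    if not failed_jobs:
        return False  # nothing failed, so there is nothing to recheck
    return all(bugs for bugs in failed_jobs.values())


# Both failures identified: an automatic recheck would be reasonable.
print(should_auto_recheck({
    "check-devstack-dsvm-cells": [1334550],
    "gate-tempest-dsvm-large-ops": [1334550],
}))

# One failure is unrecognized: leave it for a human to investigate.
print(should_auto_recheck({
    "check-devstack-dsvm-cells": [1334550],
    "some-other-job": [],  # hypothetical job with no fingerprint match
}))
```

With this rule, a patch that hits any unrecognized failure still needs someone to look at it, so the incentive to write new fingerprints is kept.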

Given that we have issues with the resources available to the gate, it would
also seem like a benefit to recheck only the jobs that actually failed,
i.e. if 1 job fails, don't recheck all 8 jobs, because that just wastes
resources and increases the chances of failing again, and again and again,
which wastes more resources and everyone's time.
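
To put rough numbers on the "chances of failing again" point, here is a small Python sketch. It assumes each job fails spuriously with the same probability, independently of the others; the 5% rate is made up for illustration and is not a measured gate statistic:

```python
def chance_of_any_failure(num_jobs: int, p_spurious: float) -> float:
    """Probability that at least one of num_jobs fails spuriously.

    Assumes independent failures with equal per-job probability, which
    is a simplification of how real gate races behave.
    """
    return 1.0 - (1.0 - p_spurious) ** num_jobs


p = 0.05  # hypothetical per-job spurious failure rate

print(f"rerun 1 job:  {chance_of_any_failure(1, p):.3f}")
print(f"rerun 8 jobs: {chance_of_any_failure(8, p):.3f}")
```

Under those assumptions, rerunning all 8 jobs has roughly a 1 in 3 chance of hitting some spurious failure, against 1 in 20 when rerunning only the single job that failed.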

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


