Open Stack

Mon Nov 25 16:54:51 UTC 2013

Excerpts from Robert Collins's message of 2013-11-25 01:30:11 -0800:
> On 25 November 2013 22:23, Clint Byrum <clint at fewbar.com> wrote:
> 
> > I do wonder if we would be able to commit enough resources to just run
> > two copies of the gate in parallel each time and require both to pass.
> > Doubling the odds* that we will catch an intermittent failure seems like
> > something that might be worth doubling the compute resources used by
> > the gate.
> >
> > *I suck at math. Probably isn't doubling the odds. Sounds
> > good though. ;)
> 
> We already run the code paths that were breaking 8 or more times.
> Hundreds of times in fact for some :(.
> 
> The odds of a broken path triggering after it gets through, assuming
> each time we exercise it is equally likely to show it, are roughly
> 3/times-exercised-in-landing. E.g. if we run a code path 300 times and
> it doesn't show up, then it's quite possible that it has a 1%
> incidence rate.

We don't run through the same circumstances 300 times. We may pass
through individual code paths that have a race condition 300 times, but
the circumstances are probably only right for failure in 1 or 2 of them.
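
To put rough numbers on that, here is a small sketch of the bound behind
the quoted 3/times-exercised estimate. It assumes every counted run is an
independent trial with the same failure chance; the function name and the
95% confidence level are my own choices for illustration, not anything
the gate computes.

```python
def upper_bound(clean_runs, confidence=0.95):
    """Largest per-run failure rate still consistent with seeing
    `clean_runs` passes in a row, at the given confidence level.

    Assumes independent runs with an identical failure chance.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_runs)


# 300 clean passes: bound is close to the 3/300 rule of thumb.
print(upper_bound(300))

# Only 2 of those passes had the right circumstances: bound is very loose.
print(upper_bound(2))
```

If only a couple of the 300 passes really count as trials, the clean
record tells us very little about how often the race fires.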

The 1% overall rate, then, doesn't matter so much as how often it fails
when the conditions for failure are optimal. If we can increase the
occurrences of the most likely failure conditions, then we do have a
better chance of catching the failure.
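
A back-of-the-envelope sketch of that trade-off, in Python. The split
into "conditions are right" and "fails given the conditions" is my
framing, the numbers are made up, and it assumes gate runs are
independent, which real runs sharing infrastructure may not be.

```python
def catch_probability(p_conditions, p_fail_given_conditions, runs):
    """Chance that at least one of `runs` independent gate runs fails.

    p_conditions: chance a run hits the circumstances that allow the race.
    p_fail_given_conditions: chance the race fires once they are hit.
    """
    p_fail = p_conditions * p_fail_given_conditions
    return 1.0 - (1.0 - p_fail) ** runs


# Hypothetical numbers: conditions are right in 1% of runs, and the race
# fires half the time when they are.
one_run = catch_probability(0.01, 0.5, runs=1)
two_runs = catch_probability(0.01, 0.5, runs=2)

# Same single run, but with the failure conditions made ten times likelier.
stressed = catch_probability(0.10, 0.5, runs=1)

print(one_run, two_runs, stressed)
```

With these made-up inputs, running the gate twice comes out just short of
doubling the chance of a catch, while making the failure conditions more
common helps a good deal more than the second copy does.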

[openstack-dev] Unwedging the gate
