[openstack-dev] [all] OpenStack races piling up in the gate - please stop approving patches unless they are fixing a race condition

Anita Kuno anteaya at anteaya.info
Thu Jun 5 20:21:34 UTC 2014


On 06/05/2014 03:59 PM, Derek Higgins wrote:
> On 05/06/14 13:07, Sean Dague wrote:
>> You may all have noticed things are really backed up in the gate right
>> now, and you would be correct. (Top of gate is about 30 hrs, but if you
>> do the math on ingress / egress rates the gate is probably really double
>> that in transit time right now).
>>
>> We've hit another threshold where there are so many really small races
>> in the gate that they are compounding to the point where a fix for one
>> often fails because another one kills your job. This whole situation was
>> exacerbated by the fact that while the transition from HP cloud 1.0 ->
>> 1.1 was happening and we were under capacity, the check queue grew to
>> 500 with lots of stuff being approved.
>>
>> That flush all hit the gate at once. But it also means that those jobs
>> passed in a very specific timing situation, which is different on the
>> new HP cloud nodes. And the normal statistical distribution of some jobs
>> on RAX and some on HP that shake out different races didn't happen.
>>
>> At this point we could really use help getting focus on only recheck
>> bugs. The current list of bugs is here:
>> http://status.openstack.org/elastic-recheck/
> 
> Hitting that link gives a different page when compared to navigating to
> the "Rechecks" tag from http://status.openstack.org and I can't find a
> way to navigate to the page you linked, is this intentional?
> 
> just curious, ignore me if I'm distracting from the current issues.
>
The elastic-recheck page is different from the rechecks page, so yes,
navigating from status.openstack.org takes you to rechecks, and that is
intentional: elastic-recheck is still considered a work in progress (so
wear your hard hat) rather than ready for public viewing (bring your
camera). It is more about managing expectations than anything.

Thanks,
Anita.
>>
>> Also our categorization rate is only 75% so there are probably at least
>> 2 critical bugs we don't even know about yet hiding in the failures.
>> Helping categorize here -
>> http://status.openstack.org/elastic-recheck/data/uncategorized.html
>> would be handy.
>>
>> We're coordinating changes via an etherpad here -
>> https://etherpad.openstack.org/p/gatetriage-june2014
>>
>> If you want to help, jumping in #openstack-infra would be the place to go.
>>
>> 	-Sean
>>
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> 
> 
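
The compounding effect Sean describes above (many small races, each rare
on its own, together failing a large share of runs) can be illustrated
with a rough sketch. It assumes the races are independent, and the
failure rates and job count are made-up numbers for illustration only;
none of them come from the actual gate data.

```python
def job_pass_probability(race_rates):
    """Chance that one job run hits none of the races (independence assumed)."""
    p = 1.0
    for rate in race_rates:
        p *= 1.0 - rate
    return p


def change_pass_probability(race_rates, jobs_per_change):
    """Chance that every job for one change passes in a single gate attempt."""
    return job_pass_probability(race_rates) ** jobs_per_change


# Hypothetical inputs: ten races that each fail 1% of runs,
# and eight jobs that must all pass for a change to merge.
rates = [0.01] * 10
single = job_pass_probability(rates)
change = change_pass_probability(rates, 8)

print(f"one job passes:       {single:.3f}")
print(f"all eight jobs pass:  {change:.3f}")
```

Under these assumptions a single job passes about 90% of the time, but a
change needing all eight jobs passes well under half the time, which is
why a fix for one race can itself be knocked out by the others.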



