[openstack-dev] Gate Status - Friday Edition

Sean Dague sean at dague.net
Fri Jan 24 13:31:43 UTC 2014

Things are still not good, but they are getting better.

Current Gate Stats:
 * Gate Queue Depth - 79
 * Check Queue Depth - 18
 * Top of gate entered - ?? (we did a couple of Zuul restarts, so numbers
here are inaccurate)
 * Gate Fail Categorization Rate: 73%

== Major Classes of Issues ==

The biggest class of issues causing gate resets right now is unit test
race conditions, and unit test failures currently seem to be trumping
Tempest failures in the gate.

 * Swift and Glance still have races in their unit tests in master.
 * Nova looks fixed in master; however, stable/* changes are flowing now,
and the unit test fixes have not yet been backported. That should
probably be a priority (and no more stable/* patches approved until it's
done).

See http://status.openstack.org/elastic-recheck/ for the latest sorted hit
list of things to tackle.

We also have the list of all the gate fails that are not categorized -

Help appreciated.
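
For anyone wondering how the categorization rate in the stats above relates
to the uncategorized list, here is a rough sketch in Python. The record
layout, job names, and bug labels are all made up for illustration; this is
not the elastic-recheck code or its data format, just the arithmetic behind
"what fraction of gate failures matched a known bug".

```python
def categorization_rate(failures):
    """Return the percentage of failures that matched a known bug."""
    if not failures:
        return 0.0
    categorized = sum(1 for f in failures if f["bug"] is not None)
    return 100.0 * categorized / len(failures)


def uncategorized(failures):
    """Return the job names of failures with no matching bug."""
    return [f["job"] for f in failures if f["bug"] is None]


# Hypothetical sample data: job names and bug labels are placeholders.
failures = [
    {"job": "gate-job-a", "bug": "bug-1"},
    {"job": "gate-job-b", "bug": None},
    {"job": "gate-job-c", "bug": "bug-2"},
    {"job": "gate-job-d", "bug": "bug-1"},
]

print(categorization_rate(failures))  # 75.0
print(uncategorized(failures))        # ['gate-job-b']
```

Every entry that shows up in the second list is a failure nobody has tied to
a bug yet, which is where help is most useful.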

== Changes that are Helping ==

=== Zuul Sliding Window ===

We are now rate limiting the gate queue on a sliding window model,
which is definitely helping with thrashing, and means we aren't seeing
the giant delays in getting check results spun up. Which is huge.
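
To give a feel for the idea, here is a minimal sketch of a sliding window
over the gate queue in Python. It is not the Zuul implementation: the class
name, the starting size of 20, the floor of 3, and the "grow by one on
success, halve on failure" policy are all assumptions chosen for
illustration. The point is only that changes outside the window do not
consume test nodes, and that the window shrinks when the gate resets.

```python
class GateWindow:
    """Hypothetical sliding window over the head of a gate queue."""

    def __init__(self, size=20, floor=3):
        self.size = size    # how many changes are tested at once (assumed)
        self.floor = floor  # the window never shrinks below this (assumed)

    def active(self, queue):
        """Return the changes at the head of the queue that get test nodes."""
        return queue[:self.size]

    def on_success(self):
        # Additive increase: a merged change earns one more slot.
        self.size += 1

    def on_failure(self):
        # Multiplicative decrease: a gate reset halves the window,
        # but never below the floor.
        self.size = max(self.floor, self.size // 2)


window = GateWindow(size=20, floor=3)
queue = [f"change-{n}" for n in range(79)]  # 79 = gate queue depth above

print(len(window.active(queue)))  # 20
window.on_failure()
print(len(window.active(queue)))  # 10
```

With a policy like this, a run of failures quickly cuts the number of
changes being tested speculatively, so a reset throws away far less work.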

=== New Nodes from RAX ===

In combination with the sliding window fixes, we now have plenty of
capacity. This means the wait to get a new d-g node is actually
quite small (at worst 10 - 20 minutes). All of which is good.

== Changes in Queue ==

We also have a change in queue to stop testing the Nova v3 XML API, which
should give us back 5 - 10 minutes per tempest run, making the whole
system faster.

I want to thank everyone that's been helping us get things back under
control. The whole infra team: Jim, Clark, Jeremy, Monty. Russell and
Matt Riedemann on the Nova side. Anita and Salvatore on the Neutron side.
Joe for keeping the categorization rate high enough that we can see
what's killing us. And I'm sure many many more folks that I've
forgotten. It's been a pretty wild week, so apologies if you were left out.


Sean Dague
Samsung Research America
sean at dague.net / sean.dague at samsung.com

