[openstack-qa] tempest run length - need a gate tag - call for help

Sean Dague sean at dague.net
Mon May 13 19:51:52 UTC 2013


On 05/13/2013 02:52 PM, James E. Blair wrote:
> Sean Dague <sean at dague.net> writes:
>
>> Any assistance would be good.
>>
>> Right now we really just need 'gate' attr added to basically all the
>> non skipped methods, we can prune later. Once 'gate' looks to be ~
>> full, we can flip over check and gate to use that.
>>
>> I think long term the approach we're going to need to go with is 3
>> sets of tests:
>>
>> smoke (< 10 mins)
>> gate (< 45 mins)
>> full (everything)
>>
>> All projects gate on gate
>>
>> Periodic runs of full - daily, more often?
>>
>> Tempest check runs full (but not gate), it's advisory.
>>
>> Some on demand facility for people to run full.
>>
>> At this point I'm not adding my +2 to any more tests (only approving
>> fixes to existing tests) until we get gate tag in, as I don't think we
>> should be running any longer than we currently are.
>
> We discussed this at the summit, and while running fewer tests is
> certainly one of the things we can do, I don't remember consensus that
> it was our first priority.

It's not really about running fewer tests; it's about test growth.
Tempest isn't just a tool for the gate; lots of people want to use it on
their clouds. So at some point we do need to divorce the gate set from
the whole set. Otherwise the tolerable length of the OpenStack gate
limits the content people can contribute to test their real-world clouds.

We already have instances of this: the stress tests, which are in tempest
but not in the gate. David and I had some lunchtime conversations about
this one day, and I think we were both of the mindset that:
  * not everything in tempest has to be in the gate
  * however, everything in tempest has to get run on some interval for
    bitrot reasons (which we aren't currently doing)

> We have a number of other things that we can do to reduce run-time that
> I think we agreed should be a higher priority:
>
> A) Parallelize the test runner (move to testr).
> B) Split the run into multiple jobs (XML vs JSON, etc).
> C) Focus on flakey tests so that gate resets are less of a factor
>     (reducing sensitivity to runtime).
>
> Note that work on both A and B independently facilitates C.
>
> I think the general direction we'd like to head is to run _more_ tests,
> not less.  Further, I don't think that check jobs and gate jobs should
> run different tests -- some people will learn to just ignore check jobs
> and enqueue failing jobs into the gate (as people already ignore
> non-voting jobs), resulting in more bad code landing.  It's also
> optimizing the wrong pipeline -- developers are more sensitive to slow
> check jobs than gate jobs.
>
> I got the impression that we all agreed that testr was the highest
> priority for this, and I'd still like to see that land before we move on
> to functional job splits.  Is that effort progressing?  What can we do
> to help?

A is currently stalled for lack of anyone working on it for H1. Chris 
Yeoh was the driving force in Grizzly on this, but he's hard at work on 
the Nova v3 API right now. Matt Treinish is going to pick this up in H2
if no one else steps forward.

B currently has the issue that test criteria selection kind of sucks 
because of the lack of structure in the tempest tree. I'm currently 
working on this one to get it done by H1 - 
https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:bp/tempest-repo-restructure,n,z

C is probably a later-in-cycle task, just for time reasons.

So we should be able to do B in roughly the H1 timeframe, at which point
we'll have the test suite in chunks that are easy to run in bits and pieces.

A should be ~H2 (depending).

But that's still a lot of weeks where we have test case contributors 
largely blocked because run time is an issue. Having a gate tag that we 
can annotate is just another lever. It also provides us with an excuse 
to get multi-tag support into the test cases, because developers
really do want to run things like:

./run_tests.sh -t cinder  (or something similar)

And get just the slice of tests that has a cinder interaction.
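
To make the multi-tag idea concrete, here is a rough sketch using only the
standard library's unittest. The `tags` decorator, the `select` helper and
the test names are all made up for illustration; this is not how tempest
does it today, just one way tag-based slicing could work.

```python
import unittest


def tags(*names):
    """Hypothetical decorator: attach a set of tag names to a test method."""
    def decorator(func):
        func.tags = frozenset(names) | getattr(func, "tags", frozenset())
        return func
    return decorator


class VolumeAttachTest(unittest.TestCase):
    # Illustrative test names only; the bodies are deliberately empty.
    @tags("smoke", "gate", "cinder", "nova")
    def test_attach_volume(self):
        pass

    @tags("gate", "cinder")
    def test_create_volume(self):
        pass

    @tags("nova")  # no 'gate' tag: would only run in the full set
    def test_resize_server(self):
        pass


def select(case_class, wanted):
    """Build a suite from the methods of case_class tagged with `wanted`."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for name in loader.getTestCaseNames(case_class):
        method = getattr(case_class, name)
        if wanted in getattr(method, "tags", frozenset()):
            suite.addTest(case_class(name))
    return suite


def selected_names(case_class, wanted):
    """Return just the method names that `select` would pick."""
    return [t.id().rsplit(".", 1)[-1] for t in select(case_class, wanted)]
```

With something like this, `select(VolumeAttachTest, "cinder")` would give the
cinder slice and `select(VolumeAttachTest, "gate")` the gate set, and a single
test can carry both a tier tag and any number of service tags.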

	-Sean

-- 
Sean Dague
http://dague.net


