Open Stack

Thu Aug 7 12:08:45 UTC 2014

On 08/06/2014 05:48 PM, John Griffith wrote:
> I have to agree with Duncan here.  I also don't know if I fully
> understand the limit in options.  Stress test seems like it could/should
> be different (again overlap isn't a horrible thing) and I don't see it
> as siphoning off resources so not sure of the issue.  We've become quite
> wrapped up in projects, programs and the like lately and it seems to
> hinder forward progress more than anything else.

Today we have 2 debug domains that developers have to deal with when
tests fail:

 * project level domain (unit tests)
 * cross project (Tempest)

Even 2 debug domains is considered too much for most people, as we get
people that understand one or the other, and just throw up their hands
when they are presented with a failure outside their familiar debug domain.

So if Rally was just taken in as a whole, as it exists now, it would
create a 3rd debug domain. It would include running a bunch of tests
that we run in cross project and project level domain, yet again,
written a different way. And when it fails this will be another debug
domain.

I think a 3rd debug domain isn't going to help any of the OpenStack
developers or Operators.

Moving the test payload into Tempest hopefully means getting a more
consistent model for all these tests so when things fail, there is some
common pattern people are familiar with to get to the bottom of things.
As opaque as Tempest runs feel to people, there has been substantial
effort in providing first failure dumps to get as much information about
what's wrong as possible. I agree things could be better, but you will
be starting that work all over from scratch with Rally again.

It also means we could potentially take advantage of the 20,000 Tempest
runs we do every week. We're actually generating a ton of data now that
is not being used for analysis. We're at a point in Tempest development
where, to make some data based decisions on which tests need extra
attention and which probably need to get dropped, we need this anyway.
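
As a rough illustration of the kind of analysis I mean, here is a
minimal sketch in Python. The record layout, the test names and the
10% threshold are all made up for the example; this is not how Tempest
stores or reports results today.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (run_id, test_name, passed, seconds).
# Real data would come from the stored results of many gate runs.
RESULTS = [
    ("run-1", "test_boot_server", True, 41.0),
    ("run-2", "test_boot_server", True, 39.0),
    ("run-3", "test_boot_server", False, 120.0),
    ("run-1", "test_list_flavors", True, 0.5),
    ("run-2", "test_list_flavors", True, 0.5),
    ("run-3", "test_list_flavors", True, 0.5),
]


def summarize(results):
    """Group records by test and report failure rate and timing spread."""
    by_test = defaultdict(list)
    for _run_id, name, passed, seconds in results:
        by_test[name].append((passed, seconds))

    summary = {}
    for name, rows in by_test.items():
        times = [seconds for _passed, seconds in rows]
        failures = sum(1 for passed, _seconds in rows if not passed)
        summary[name] = {
            "runs": len(rows),
            "failure_rate": failures / len(rows),
            "mean_seconds": mean(times),
            # Population standard deviation: how much timing varies per run.
            "stdev_seconds": pstdev(times),
        }
    return summary


def needs_attention(summary, max_failure_rate=0.1):
    """Return the tests whose failure rate exceeds the assumed threshold."""
    return sorted(
        name for name, stats in summary.items()
        if stats["failure_rate"] > max_failure_rate
    )


if __name__ == "__main__":
    stats = summarize(RESULTS)
    for name in needs_attention(stats):
        print(name, stats[name])
```

Even something this simple only becomes meaningful once the per run
variability is understood, which is why I'd want it living next to the
tests that produce the data.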

> I'm also not convinced that Tempest is where all things belong, in fact
> I've been thinking more and more that a good bit of what Tempest does
> today should fall more on the responsibility of the projects themselves.
>  For example functional testing of features etc, ideally I'd love to
> have more of that fall on the projects and their respective teams.  That
> might even be something as simple to start as saying "if you contribute
> a new feature, you have to also provide a link to a contribution to the
> Tempest test-suite that checks it".  Sort of like we do for unit tests,
> cross-project tracking is difficult of course, but it's a start.  The
> other idea is maybe functional test harnesses live in their respective
> projects.
> 
> Honestly I think who better to write tests for a project than the folks
> building and contributing to the project.  At some point IMO the QA team
> isn't going to scale.  I wonder if maybe we should be thinking about
> proposals for delineating responsibility and goals in terms of
> functional testing?

I 100% agree on getting some of Tempest's existing content out and into
functional tests. Honestly I imagine a Tempest that's 1/2 the # of tests
a year away. Mostly it's going to be about ensuring that projects have
the coverage before we delete the safety nets.

And I 100% agree on getting some better idea on functional boundaries.
But I think that's something we need some practical experience on first.
Setting a policy without figuring out what in practice works is
something I expect wouldn't work so well. My expectation is this is
something we're going to take a few stabs at post J3, and bring into
summit for discussion.

...

So the question is do we think there should be 2 or 3 debug domains for
developers and operators on tests? My feeling is 2 puts us in a much
better place as a community.

The question is should Tempest provide data analysis on its test runs
or should that be done in another program entirely? Doing so in
another program means that all the deficiencies of the existing data get
completely ignored (like variability per run, interactions between
tests, between tests and periodic jobs, difficulty in time accounting of
async ops) to produce some pretty pictures that miss the point, because
they aren't measuring a thing that's real.

And the final question is should Tempest have an easier to understand
starting point than a tox command, like an actual CLI for running
things? I think it's probably clear that it should. It would probably
actually make Tempest less big and scary for people.

Because I do think 'do one job and do it well' is completely consistent
with 'run tests across OpenStack projects and present that data in a
consumable way'.

The question basically is whether it's believed that collecting timing
analysis of test results is a separate concern from collecting
correctness results of test results. The Rally team would argue that
it is. I'd argue that it is not.

...

If Rally wants to stay as an ecosystem tool that's not part of the
integrated release, that's one thing. But the current contributors
really want these kinds of features baked in. Ok, let's do that with parts
that we who have spent a ton of time making this stuff work in practice
actually understand.

Rally in its current form is young, has not been run a few million
times, and so hasn't seen what happens at the edges of OpenStack. That's
fine and expected. But it also means that it hasn't absorbed the hard
lessons we've learned about the dangers of autoconfiguration (and
accidentally stopping testing function that should be working because of
config changes), the variability problem of normal distributions of test
runs, and the challenges of getting to the bottom of failures when things
go terribly wrong (which they will when run a ton).

Going full steam ahead in a parallel effort here means that all those
lessons will have to be discovered all over again. That's time wasted.

The solutions to address these in integrated projects will now come from
at least 2 different, and competing, directions. For instance, right now
Rally is basically proposing a giant amount of Rally specific
instrumentation of all of OpenStack -
https://review.openstack.org/#/c/105096/,
https://review.openstack.org/#/c/103825/

Instrumenting OpenStack is a good idea. But to keep it from turning into
a pile of mud we have to figure out how this relates to other outputs in
the system. Like logging. Does this mean we can drop request logging in
these cases? Or are we going to collect multiple versions of the same
data at different points in the pipeline so the numbers don't add up?

A structured dynamic log event mechanism would actually be incredibly
useful to OpenStack. There is a hint of how you could get there from
osprofiler today, but we should have that conversation, not just go and
merge that whole stack.
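
To give a feel for what I mean by a structured log event, here is a
minimal sketch in Python using only the standard library. It is not
osprofiler's API and not a proposal for the actual format; the function
names and the event fields (event, request_id, duration_ms, status) are
hypothetical, picked only for illustration.

```python
import json
import logging
import time


def make_event(name, request_id, **fields):
    """Build one structured event as a plain dict (field names are illustrative)."""
    event = {"event": name, "request_id": request_id, "timestamp": time.time()}
    event.update(fields)
    return event


def emit(logger, event):
    """Write the event as a single JSON line so tools can parse it later."""
    line = json.dumps(event, sort_keys=True)
    logger.info(line)
    return line


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    log = logging.getLogger("structured-events")
    emit(log, make_event("api.request.finished", "req-123",
                         duration_ms=42, status=200))
```

The point of the sketch is that request logging and timing data would
come out of the same event, rather than being collected twice at
different points in the pipeline.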

...

Because at the end of the day every effort has a cost. It's not just the
cost of the developers working on Rally, it's the coordination costs
of adding another cross project effort into the mix that every program
needs to independently coordinate with. I think we're already stretched
to our limits as a community on cross project coordination. So doing
this another time is something that doesn't feel healthy to me right now.

	-Sean

-- 
Sean Dague
http://dague.net

[openstack-dev] Which program for Rally
