[openstack-dev] Which program for Rally

Matthew Treinish mtreinish at kortar.org
Thu Aug 14 21:15:22 UTC 2014


On Wed, Aug 13, 2014 at 03:48:59PM -0600, Duncan Thomas wrote:
> On 13 August 2014 13:57, Matthew Treinish <mtreinish at kortar.org> wrote:
> > On Tue, Aug 12, 2014 at 01:45:17AM +0400, Boris Pavlovic wrote:
> >> Keystone, Glance, Cinder, Neutron and Heat are running rally performance
> >> jobs, which can be used for performance testing, benchmarking, and
> >> regression testing (already now). These jobs support in-tree plugins for
> >> all components (scenarios, load generators, benchmark context) and they
> >> can use Rally fully without any interaction with the Rally team at all.
> >> More about these jobs:
> >> https://docs.google.com/a/mirantis.com/document/d/1s93IBuyx24dM3SmPcboBp7N47RQedT8u4AJPgOHp9-A/
> >> So I really don't see anything like this in tempest (even in the
> >> foreseeable future)
> 
> > So this is actually the communication problem I mentioned before. Singling
> > out individual projects and getting them to add a rally job is not "cross
> > project" communication. (This is part of what I meant by "push using
> > Rally".) There was no larger discussion on the ML or a topic in the project
> > meeting about adding these jobs. There was no discussion about the value
> > vs. risk of adding new jobs to the gate. This is also why less than half of
> > the integrated projects have these jobs. Having asymmetry like this between
> > gating workloads on projects helps no one.
> 
> So the advantage of the approach, rather than having a massive
> cross-project discussion, is that interested projects (I've been very
> interested from a cinder core PoV) act as a test bed for other
> projects. 'Cross project' discussions rarely come to other teams; they
> rely on people to find them, whereas Boris came to us, said "I've got
> this thing you might like, try it out, tell me what you want." He took
> feedback, iterated fast and investigated bugs. It has been a genuine
> pleasure to work with him, and I feel we made progress faster than we
> would have done if he was trying to please everybody.

I'm not arguing about whether Boris was great to work with, or whether there
is value in talking directly to the dev team when setting up a new job. That
is definitely the fastest path to getting a new job up and running. But, for
something like adding a new class of dsvm job which runs on every patch, that
affects everyone, not just the project where the jobs are being added. A larger
discussion is really necessary to weigh whether such a job should be added. It
really only needs to happen once, just before the first one is added to an
integrated project.

> 
> > That being said, the reason I think osprofiler has been more accepted, and
> > its adoption into oslo is not nearly as contentious, is because it's an
> > independent library that has value on its own. You don't need to pull in a
> > monolithic stack to use it, which is a design point more consistent with
> > the rest of OpenStack.
> 
> Sorry, are you suggesting tempest isn't a giant monolithic thing?
> I was able to comprehend the rally code very quickly; that
> isn't even slightly true of tempest. Having one simple tool that does
> one thing well is exactly what rally has tried to do - tempest seems
> to want to be five different things at once (CI, installation tests,
> trademark, performance, stress testing, ...)

This is actually a common misconception about the purpose and role of Tempest.
Tempest is strictly concerned with being the integration test suite for
OpenStack, which just includes the actual tests and some methods of running
them. We attempt to do this in a manner that is independent of the environment
tempest runs in, or is run against (for example, devstack vs. a public cloud).
Yes, tempest is a large project with a lot of tests, which adds to its
complexity, but its scope is quite targeted. It's just that it grows at the
same rate OpenStack's scope grows, because tempest has coverage for all the
projects.

The methods of running the tests do include the stress test framework, but
that is mostly just a way of leveraging the large quantity of tests we
currently have in-tree to generate load. [1] (Yeah, we need to write better
user docs around this and a lot of other things.) It just lets you define
which tests to use and how to loop and distribute them over workers. [2]
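
To make that concrete, the sample config in [2] is just a JSON list of
actions. A minimal sketch of what one looks like (the field names and the
test path here are from memory and purely illustrative - [2] is the
canonical sample):

    [{
        "action": "tempest.stress.actions.unit_test.UnitTest",
        "threads": 8,
        "use_admin": false,
        "use_isolated_tenants": false,
        "kwargs": {
            "test_method": "tempest.api.volume.test_volumes_list.VolumesListTest.test_volume_list",
            "class_setup_per": "process"
        }
    }]

Each entry points the framework at an existing test method and says how many
worker threads to loop it across, which is how the in-tree tests get reused
as load generators without writing anything new.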

The trademark, CI, upgrade testing, and installation testing are just examples
of applications where tempest is being used. (Some of these are the domain of
other QA or Infra program projects, some are not.) If you look in the tempest
tree you'll see very little specifically about any of those applications.
They're all mostly accomplished by building tooling around tempest, for
example: refstack->trademark, devstack-gate->ci, grenade->upgrade, etc. Tempest
is just a building block that can be used to make all of those things. As all
of these different use cases are basically tempest's primary consumers, we do
have to take them into account when working on improving tempest, to try and
not upset our "users". But they are not explicit goals of the tempest project
by itself.

One thing did just occur to me while writing this, though: it's probably worth
investigating splitting out the stress test framework as an external
tool/project after we start work on the tempest library. [3]


<snip> 
>>
>> So firstly, I want to say I find these jobs troubling, and not just because,
>> given the nature of the gate (2nd level virt on public clouds), the
>> variability between jobs can be staggering. I can't imagine what value there
>> is in running synthetic benchmarks in this environment. It would only
>> reliably catch the most egregious of regressions. Also, from what I can
>> tell, none of these jobs actually compare the timing data to the previous
>> results; they just generate the data and make a pretty graph. The burden
>> appears to be on the user to figure out what it means, which really isn't
>> that useful. How have these jobs actually helped? IMO the real value in
>> performance testing in the gate is to capture the longer term trends in the
>> data, which is something these jobs are not doing.
>
> 
> So I put in a change to dump out the raw data from each run into a
> zipped json file so that I can start looking at the value of
> collecting this data.... As an experiment I think it is very
> worthwhile. The gate job is non-voting and apparently, at least on the
> cinder front, highly reliable. The job runs fast enough that it isn't
> slowing the gate down - we aren't running out of nodes on the gate as
> far as I can tell, so I don't understand the hostility towards it.
> We'll run it for a bit and see if it proves useful; if it doesn't, then
> we can turn it off and try something else.
> 
> I'm confused by the hostility about this gate job - it is costing us
> nothing; if it turns out to be a pain, we'll turn it off.

So, aside from the longer term issues that have already been brought up, as a
short term experiment I agree it's probably fine. However, the argument that it
costs us nothing isn't exactly accurate. During the course of a day we
routinely exhaust the max quota on nodes, and adding additional dsvm jobs just
contributes to this exhaustion. This probably isn't very noticeable when the
gate is in a healthy state (it just makes things a bit slower), but when things
aren't working it becomes a huge problem, because node churn slows down our
ability to land fixes, which just makes recovery slower. Given that we have a
finite number of gate resources, I don't really think we should be running
something that's considered an experiment on every patch. Instead it should be
in the experimental pipeline for the project so it can be run on-demand, but
doesn't constantly eat resources.
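
To be clear about the mechanics: moving a job to the experimental pipeline is
just a small shuffle in the zuul layout. A rough sketch against the layout
format in openstack-infra/config (the job name here is illustrative):

    projects:
      - name: openstack/cinder
        experimental:
          - gate-rally-dsvm-cinder

Jobs in the experimental pipeline only run when someone leaves a "check
experimental" comment on a review, so the data is still there on demand for
anyone who wants it, without consuming nodes on every patch that gets
uploaded.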

> 
> Rally as a general tool has enabled me to do things that I wouldn't
> even consider trying with tempest. There shouldn't be a problem with a
> small number of parallel efforts - that's a founding principle of
> open source in general.

So, I agree there is nothing wrong with doing things as a separate tool in a
parallel effort, which was Thierry's option #3 in the original post. The
question up for debate is whether we should adopt Rally into the QA program or
create a new program for it. My opinion on the matter, for the reasons I've
already outlined in my other posts, is no, not in its current form. I'd really
like to see collaboration here, working towards eventual integration of Rally
into the QA program. But I don't think there is anything wrong with the rally
dev team deciding to continue as they are, outside of an OpenStack program.

-Matt Treinish

[1] http://docs.openstack.org/developer/tempest/field_guide/stress.html#stress-field-guide
[2] http://git.openstack.org/cgit/openstack/tempest/tree/tempest/stress/etc/sample-unit-test.json
[3] http://specs.openstack.org/openstack/qa-specs/specs/tempest-library.html