[openstack-dev] Which program for Rally

Sean Dague sean at dague.net
Wed Aug 6 13:44:12 UTC 2014


On 08/06/2014 09:11 AM, Russell Bryant wrote:
> On 08/06/2014 06:30 AM, Thierry Carrez wrote:
>> Hi everyone,
>>
>> At the TC meeting yesterday we discussed the Rally program request and
>> incubation request. We quickly dismissed the incubation request, as
>> Rally appears to be able to live happily on top of OpenStack and would
>> benefit from having a release cycle decoupled from the OpenStack
>> "integrated release".
>>
>> That leaves the question of the program. OpenStack programs are created
>> by the Technical Committee, to bless existing efforts and teams that are
>> considered *essential* to the production of the "OpenStack" integrated
>> release and the completion of the OpenStack project mission. There are 3
>> ways to look at Rally and official programs at this point:
>>
>> 1. Rally as an essential QA tool
>> Performance testing (and especially performance regression testing) is
>> an essential QA function, and a feature that Rally provides. If the QA
>> team is happy to use Rally to fill that function, then Rally can
>> obviously be adopted by the (already-existing) QA program. That said,
>> that would put Rally under the authority of the QA PTL, and that raises
>> a few questions due to the current architecture of Rally, which is more
>> product-oriented. There needs to be further discussion between the QA
>> core team and the Rally team to see how that could work and if that
>> option would be acceptable for both sides.
>>
>> 2. Rally as an essential operator tool
>> Regular benchmarking of OpenStack deployments is a best practice for
>> cloud operators, and a feature that Rally provides. With a bit of a
>> stretch, we could consider that benchmarking is essential to the
>> completion of the OpenStack project mission. That program could one day
>> evolve to include more such "operations best practices" tools. In
>> addition to the slight stretch already mentioned, one concern here is
>> that we still want to have performance testing in QA (which is clearly
>> essential to the production of "OpenStack"). Letting Rally primarily be
>> an operational tool might make that outcome more difficult.
>>
>> 3. Let Rally be a product on top of OpenStack
>> The last option is to not have Rally in any program, and not consider it
>> *essential* to the production of the "OpenStack" integrated release or
>> the completion of the OpenStack project mission. Rally can happily exist
>> as an operator tool on top of OpenStack. It is built as a monolithic
>> product: that approach works very well for external complementary
>> solutions... Also, being more integrated in OpenStack or part of the
>> OpenStack programs might come at a cost (slicing some functionality out
>> of Rally to make it more a framework and less a product) that might not
>> be what its authors want.
>>
>> Let's explore each option to see which ones are viable, and the pros and
>> cons of each.
> 
> My feeling right now is that Rally is trying to accomplish too much at
> the start (both #1 and #2).  I would rather see the project focus on
> doing one of them as best as it can before increasing scope.
> 
> It's my opinion that #1 is the most important thing that Rally can be
> doing to help ensure the success of OpenStack, so I'd like to explore
> the "Rally as a QA tool" in more detail to start with.

I want to clarify some things. I don't think that rally in its current
form belongs in any OpenStack project. It's a giant monolithic tool,
which is apparently a design point. That's the wrong design point for an
OpenStack project.

For instance:

https://github.com/stackforge/rally/tree/master/rally/benchmark/scenarios should
all be tests in Tempest (and actually today mostly are via API tests).
There is an existing stress framework in Tempest which does the
repetitive looping that rally does on these already. This fact has been
brought up before.

https://github.com/stackforge/rally/tree/master/rally/verification/verifiers
- should be baked back into Tempest (at least on the results side,
though diving in there now it looks largely duplicative of the existing
subunit-to-HTML code).

https://github.com/stackforge/rally/blob/master/rally/db/api.py - is
largely (not entirely) what we'd like from a long term trending piece
that subunit2sql is working on. Again, this was just all thrown into the
Rally db instead of thinking about how to split it off. Also notable
here: there are some fundamental testr bugs (like worker
misallocation) which mean the data is massively dirty today. It would be
good for people to actually work on fixing those things.

The parts that should stay outside of Tempest are the setup tool
(separation of concerns is that Tempest is the load runner, not the
setup environment) and any of the SLA portions.

I think rally brings forward a good point about making Tempest easier to
run. But I think that shouldn't be done outside Tempest. Making the test
tool easier to use should be done in the tool itself. If that means
adding a tempest cmd or such, so be it. Note this was a topic for
discussion at last summit:
https://wiki.openstack.org/wiki/Summit/Juno/Etherpads#QA

> From the TC meeting, it seems that the QA group (via sdague, at least)
> has provided some feedback to Rally over the last several months.  I
> would really like to see an analysis and write-up from the QA group on
> the current state of Rally and how it may (or may not) be able to serve
> the performance QA needs.

Something that we need to figure out is whether, given where we are in
the release cycle, we want to ask the QA team to go off and do a Rally
deep dive now to try to pull it apart into the parts that make sense for
other programs to take in. There are always trade-offs.

Like the fact that right now the rally team is proposing gate jobs which
have some overlap with the existing largeops jobs. Did they start a
conversation about it? Nope. They just went off to do their thing
instead. https://review.openstack.org/#/c/112251/

So now we're going to run 2 jobs that do very similar things, with
different teams adjusting the test loads. Which I think is basically
madness.

	-Sean

-- 
Sean Dague
http://dague.net


