[openstack-dev] [tc][rally][qa] Application for a new OpenStack Program: Performance and Scalability
Jay Pipes
jaypipes at gmail.com
Mon Aug 4 16:25:15 UTC 2014
On 08/04/2014 11:21 AM, Boris Pavlovic wrote:
> Rally is quite monolithic and can't be split
I think this is one of the roots of the problem that folks like David
and Sean keep coming around to. If Rally were less monolithic, it
would be easier to say "OK, bring this piece into Tempest, have this
piece be a separate library that lives in the QA program, and have that
piece be the service endpoint that allows operators to store and
periodically measure SLA performance indicators against their cloud."
Incidentally, this is one of the problems that Scalr faced when applying
for incubation, years ago, and one of the reasons the PPB at the time
voted not to incubate Scalr: it had a monolithic design that crossed too
many different lines in terms of duplicating functionality that already
existed in a number of other projects.
> , that is why I think Rally should be a separate program (i.e.
> Rally's scope is just different from QA's scope). Also, it's not clear
> to me why collaboration is possible only within one program. In my
> opinion, collaboration and program membership are unrelated things.
Sure, it's certainly possible for collaboration to happen across
programs. I think what Sean is alluding to is the fact that the Tempest
and Rally communities have done little collaboration to date, and that
is worrying to him.
> About collaboration between the Rally & Tempest teams... The major goal
> of integrating Tempest into Rally is to make it simpler to use Tempest on
> production clouds via an OpenStack API.
Plenty of folks run Tempest without Rally against production clouds as
an acceptance test platform. I see no real benefit to arguing that Rally
is for running against production clouds and Tempest is for
non-production clouds. There just isn't much of a difference there.
That said, an Operator Tools program is actually an entirely different
concept -- with a different audience and mission from the QA program. I
think you've seen here some initial support for such a proposed Operator
Tools program.
The problem I see is that Rally is not *yet* exposing the REST service
endpoint that would make it a full-fledged Operator Tool outside the
scope of its current QA focus. Once Rally does indeed publish a REST API
that exposes resource endpoints for an operator to store a set of KPIs
associated with an SLA, and allows the operator to store the run
schedule that Rally would use to go and test such metrics, *then* would
be the appropriate time to suggest that Rally be the pilot project in
this new Operator Tools program, IMO.
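To make that concrete, here is a very rough sketch of the kind of resources
I have in mind. This is emphatically not Rally's actual API; the endpoint
names, payloads, in-memory storage, and the use of Flask are all illustrative
assumptions, just to show KPIs, SLAs, and run schedules as first-class REST
resources an operator could manage.

from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory stand-ins for whatever persistence layer the real service uses.
SLAS = {}        # sla_id -> {"name": ..., "kpis": [...]}
SCHEDULES = {}   # sla_id -> {"interval_minutes": ...}


@app.route('/v1/slas', methods=['POST'])
def create_sla():
    """Store an SLA and the KPIs (e.g. 95th percentile boot time) tied to it."""
    body = request.get_json()
    sla_id = str(len(SLAS) + 1)
    SLAS[sla_id] = {"name": body["name"], "kpis": body.get("kpis", [])}
    return jsonify({"id": sla_id}), 201


@app.route('/v1/slas/<sla_id>/schedule', methods=['PUT'])
def set_schedule(sla_id):
    """Store the run schedule used to periodically re-measure those KPIs."""
    SCHEDULES[sla_id] = request.get_json()
    return jsonify(SCHEDULES[sla_id])


if __name__ == '__main__':
    app.run()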
> This work requires a lot of collaboration between teams; as you
> already mentioned, we should work on improving duration measurement and
> tempest.conf generation. I fully agree that this belongs in Tempest.
> By the way, the Rally team is already helping with this part.
>
> In my opinion, the end result should be something like: Rally just calls
> Tempest (or a couple of scripts from Tempest) and stores the results in its
> DB, presenting Tempest functionality to the end user via an OpenStack API.
> To get this done, we should implement the following features in Tempest:
> 1) automatic tempest.conf generation; 2) production-ready cleanup - Tempest
> should be absolutely safe to run against a cloud; 3) improvements related
> to time measurement; 4) integration of OSprofiler & Tempest.
I'm sure all of those things would be welcome additions to Tempest. At
the same time, Rally contributors would do well to work on an initial
REST API endpoint that would expose the resources I denoted above.
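As a thought experiment, the "Rally calls Tempest and stores results in its
DB" flow Boris sketches above could be as small as the following. The runner
invocation ('testr run --subunit' from the Tempest tree), the SQLite table
layout, and the paths are assumptions made for illustration, not how Rally
actually implements this:

import io
import sqlite3
import subprocess

import subunit
import testtools


def run_tempest(tempest_dir):
    # Invoke the test runner in the Tempest tree and capture the raw
    # subunit stream it emits (runner command is an assumption).
    proc = subprocess.Popen(['testr', 'run', '--subunit'],
                            cwd=tempest_dir, stdout=subprocess.PIPE)
    stream, _ = proc.communicate()
    return stream


def store_results(stream_bytes, db_path):
    # Parse the subunit v2 stream and persist one row per test.
    conn = sqlite3.connect(db_path)
    conn.execute('CREATE TABLE IF NOT EXISTS results '
                 '(test_id TEXT, status TEXT, duration REAL)')

    def on_test(test):
        start, stop = test['timestamps']
        duration = (stop - start).total_seconds() if start and stop else None
        conn.execute('INSERT INTO results VALUES (?, ?, ?)',
                     (test['id'], test['status'], duration))

    source = subunit.ByteStreamToStreamResult(io.BytesIO(stream_bytes))
    sink = testtools.StreamToDict(on_test)
    sink.startTestRun()
    source.run(sink)
    sink.stopTestRun()
    conn.commit()


if __name__ == '__main__':
    # Hypothetical paths, for illustration only.
    store_results(run_tempest('/opt/stack/tempest'), 'rally_results.db')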
Best,
-jay
> So in any case I would prefer to continue the collaboration.
>
> Thoughts?
>
>
> Best regards, Boris Pavlovic
>
>
>
>
> On Mon, Aug 4, 2014 at 4:24 PM, Sean Dague <sean at dague.net> wrote:
>
> On 07/31/2014 06:55 AM, Angus Salkeld wrote:
>> On Sun, 2014-07-27 at 07:57 -0700, Sean Dague wrote:
>>> On 07/26/2014 05:51 PM, Hayes, Graham wrote:
>>>> On Tue, 2014-07-22 at 12:18 -0400, Sean Dague wrote:
>>>>> On 07/22/2014 11:58 AM, David Kranz wrote:
>>>>>> On 07/22/2014 10:44 AM, Sean Dague wrote:
>>>>>>> Honestly, I'm really not sure I see this as a different program, but
>>>>>>> rather something that should be folded into the QA program. I feel like
>>>>>>> a top level effort like this is going to lead to a lot of duplication in
>>>>>>> the data analysis that's currently going on, as well as functionality
>>>>>>> for better load driver UX.
>>>>>>>
>>>>>>> -Sean
>>>>>> +1 It will also lead to pointless discussions/arguments
>>>>>> about which activities are part of "QA" and which are part
>>>>>> of "Performance and Scalability Testing".
>>>>
>>>> I think that those discussions will still take place, it will just be on
>>>> a per repository basis, instead of a per program one.
>>>>
>>>> [snip]
>>>>
>>>>>
>>>>> Right, 100% agreed. Rally would remain with its own repo + review team,
>>>>> just like grenade.
>>>>>
>>>>> -Sean
>>>>>
>>>>
>>>> Is the concept of a separate review team not the point of a program?
>>>>
>>>> In the thread from Designate's Incubation request Thierry said [1]:
>>>>
>>>>> "Programs" just let us bless goals and teams and let them
>>>>> organize code however they want, with contribution to any
>>>>> code repo
> under that
>>>>> umbrella being considered "official" and
>>>>> ATC-status-granting.
>>>>
>>>> I do think that this is something that needs to be clarified by the TC -
>>>> Rally could not get a PTL if they were part of the QA project, but every
>>>> time we get a program request, the same discussion happens.
>>>>
>>>> I think that mission statements can be edited to fit new programs as
>>>> they occur, and that it is more important to let teams that have been
>>>> working closely together stay as a distinct group.
>>>
>>> My big concern here is that many of the things that these efforts have
>>> been doing are things we actually want much closer to the base.
>>> For instance, metrics on Tempest runs.
>>>
>>> When Rally was first created it had its own load generator. It took a
>>> ton of effort to keep the team from duplicating that and instead just
>>> use some subset of Tempest. Then when measuring showed up, we actually
>>> said that is something that would be great in Tempest, so whoever ran
>>> it, be it for Testing, Monitoring, or Performance gathering, would have
>>> access to that data. But the Rally team went off in a corner and did it
>>> otherwise. That's caused the QA team to have to go and redo this work
>>> from scratch with subunit2sql, in a way that can be consumed by multiple
>>> efforts.
>>>
>>> So I'm generally -1 to this being a separate effort on the basis that so
>>> far the team has decided to stay in their own sandbox instead of
>>> participating actively where many of us think the functions should be
>>> added. I also think this isn't like Designate, because this isn't
>>> intended to be part of the integrated release.
>>
>> From reading Boris's email it seems like rally will provide a
>> horizon panel and api to back it (for the operator to kick off
>> performance runs and view stats). So this does seem like something that
>> would be a part of the integrated release (if I am reading things correctly).
>>
>> Is the QA program happy to extend its scope to include that? QA
>> could become "Quality Assurance of upstream code and running
>> OpenStack installations". If not, we need to find some other program
>> for rally.
>
> I think that's realistically already the scope of the QA program, we
> might just need to change the governance wording.
>
> Tempest has always been intended to be run on production clouds
> (public or private) to ensure proper function. Many operators are
> doing this today as part of normal health management. And we
> continue to evolve it to be something which works well in that
> environment.
>
> All the statistics collection / analysis parts in Rally today I think
> are basically things that should be part of any Tempest installation
> / run. It's cool that Rally did a bunch of work there, but having
> that code outside of Tempest is sort of problematic, especially as
> there are huge issues with the collection of that data because of
> missing timing information in subunit. So realistically to get
> accurate results there needs to be additional events added into
> Tempest tests to build this correctly. If you stare at the raw
> results here today they have such huge accuracy problems (due to
> unaccounted-for time in setUpClass, which is a known problem) to the
> point of being misleading, and possibly actually harmful.
>
> These are things that are fixable, but hard to do outside of the
> Tempest project itself. Exporting accurate timing / stats should be
> a feature close to the test load, not something that's done
> externally with guessing and fudge factors.
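(As an aside: the "additional events" Sean mentions essentially mean emitting
explicitly timestamped status events around fixtures such as setUpClass, so
that the time gets attributed to something rather than lost. A minimal sketch
using python-subunit's v2 stream API -- the test id below is made up -- might
look like this:

import datetime
import sys

from subunit import StreamResultToBytes


def utcnow():
    return datetime.datetime.now(datetime.timezone.utc)


# Write subunit v2 events to stdout; a real runner would multiplex this
# into its normal result stream.
out = StreamResultToBytes(sys.stdout.buffer)
out.startTestRun()

# Hypothetical event pair bracketing class-level setup, so its duration is
# recorded explicitly instead of being folded into the first test's time.
out.status(test_id='tempest.api.compute.ServersTest.setUpClass',
           test_status='inprogress', timestamp=utcnow())
# ... expensive fixture work happens here ...
out.status(test_id='tempest.api.compute.ServersTest.setUpClass',
           test_status='success', timestamp=utcnow())

out.stopTestRun()

With events like these in the stream, whatever consumes it -- subunit2sql,
Rally, or anything else -- gets accurate per-phase timing for free.)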
>
> So every time I look at the docs in Rally -
> https://github.com/stackforge/rally I see largely features that
> should be coming out of the test runners themselves.
>
> -Sean
>
> -- Sean Dague http://dague.net