[openstack-dev] [tc][rally][qa] Application for a new OpenStack Program: Performance and Scalability
    marc at koderer.com 
    Tue Aug  5 06:38:30 UTC 2014
    
    
  
Hello Boris,
see below.
Quoting Boris Pavlovic <boris at pavlovic.me>:
> Jay,
>
> Thanks for the review of the proposal. Some of my comments are below.
>
>
>> I think this is one of the roots of the problem that folks like David
>> and Sean keep coming around to. If Rally were less monolithic, it
>> would be easier to say "OK, bring this piece into Tempest, have this
>> piece be a separate library and live in the QA program, and have the
>> service endpoint that allows operators to store and periodically measure
>> SLA performance indicators against their cloud."
>
> Actually, Rally was designed to be a glue service (and CLI tool) that binds
> everything together and presents a service endpoint for operators. I
> really do not understand what could be split out and moved to Tempest, or
> why. Could you elaborate by pointing at the current Rally code? Maybe
> there is a misunderstanding here. I think this should be discussed in more
> detail.
>
A good example of that is Rally's Tempest configuration module. Currently
Rally contains all the logic to configure Tempest, and for that it has its
own way of building tempest.conf out of a template [1]. If the QA team
decides to rework the Tempest configuration, Rally breaks.
[1]: https://github.com/stackforge/rally/blob/master/rally/verification/verifiers/tempest/config.ini
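
To illustrate the kind of coupling I mean, here is a minimal sketch of
template-based config generation. It is not Rally's actual code: the
template text, the option names and the render_tempest_conf() helper are
assumptions made up for this example.

```python
import configparser
from string import Template

# Hypothetical template. The sections and option names are illustrative
# only and are NOT copied from Rally's config.ini.
TEMPLATE = """\
[identity]
uri = $auth_url
username = $username

[compute]
image_ref = $image_ref
"""


def render_tempest_conf(values):
    """Fill the template and parse the result as an INI file.

    Raises KeyError if a placeholder in the template has no value.
    """
    text = Template(TEMPLATE).substitute(values)
    # interpolation=None so that '%' characters in values are kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


if __name__ == "__main__":
    conf = render_tempest_conf({
        "auth_url": "http://example.test:5000/v2.0",
        "username": "demo",
        "image_ref": "abc123",
    })
    print(conf["identity"]["uri"])
```

The template hard-codes option names that are owned by another project. As
soon as an option is renamed or moved on the Tempest side, a generator like
this silently produces a config file that no longer matches.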
[snip]
> I found the Scalr incubation discussion:
> http://eavesdrop.openstack.org/meetings/openstack-meeting/2011/openstack-meeting.2011-06-14-20.03.log.html
>
> The reasons for rejection were as follows:
> *) OpenStack shouldn't put PaaS in OpenStack core # Rally is not PaaS
> *) Duplication of functionality (actually dashboard)  # Rally doesn't
> duplicate anything
IMHO Rally duplicates at least some pieces. You can find parts of the
Tempest scenario tests in the benchmarks area, and Rally also overlaps with
the Tempest stress tests and the Tempest config.
Regards
Marc
> *) Development is done behind closed doors
> # Not about Rally
> http://stackalytics.com/?release=juno&metric=commits&project_type=All&module=rally
>
> It seems that Rally is quite a different case, and this comparison is
> misleading and irrelevant here.
>
>
>
>>> , that is why I think Rally should be a separated program (i.e.
>>> Rally scope is just different from QA scope). As well, It's not clear
>>> for me, why collaboration is possible only in case of one program? In
>>> my opinion collaboration & programs are irrelevant things.
>>
>>
>> Sure, it's certainly possible for collaboration to happen across
>> programs. I think what Sean is alluding to is the fact that the Tempest
>> and Rally communities have done little collaboration to date, and that
>> is worrying to him.
>
>
> Could you please explain this paragraph? What do you mean by "have done
> little collaboration"?
>
> We integrated Tempest in Rally:
> http://www.mirantis.com/blog/rally-openstack-tempest-testing-made-simpler/
>
> We are working on a spec in Tempest for tempest.conf generation:
> https://review.openstack.org/#/c/94473/ # probably not as fast as we would
> like
>
> We had design session:
> http://junodesignsummit.sched.org/event/2815ca60f70466197d3a81d62e1ee7e4#.U9_ugYCSz1g
>
> I am going to work on integrating OSprofiler into Tempest as soon as I get
> it into the core projects.
>
> By the way, I am really not sure how being in one program would help us
> collaborate. What does it actually change?
>
>
>
>>> About collaboration between Rally & Tempest teams... Major goal of
>>> integration Tempest in Rally is to make it simpler to use tempest on
>>> production clouds via OpenStack API.
>>
>>
>> Plenty of folks run Tempest without Rally against production clouds as
>> an acceptance test platform. I see no real benefit to arguing that Rally
>> is for running against production clouds and Tempest is for
>> non-production clouds. There just isn't much of a difference there.
>
>
> Hm, I didn't say anything about "Tempest is for non-production clouds".
> I said that the Rally team is working on making it simpler to use on
> production clouds.
>
>
>
>> The problem I see is that Rally is not *yet* exposing the REST service
>> endpoint that would make it a full-fledged Operator Tool outside the
>> scope of its current QA focus. Once Rally does indeed publish a REST API
>> that exposes resource endpoints for an operator to store a set of KPIs
>> associated with an SLA, and allows the operator to store the run
>> schedule that Rally would use to go and test such metrics, *then* would
>> be the appropriate time to suggest that Rally be the pilot project in
>> this new Operator Tools program, IMO.
>
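
To make the resources Jay describes a bit more concrete, here is a rough
sketch of the data an operator would store: a set of KPIs tied to an SLA,
plus a run schedule. All class, field and function names are hypothetical.
This is not an existing Rally API, only one possible shape for it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class KPI:
    # Hypothetical KPI: a named measurement with an upper bound.
    name: str
    threshold: float


@dataclass
class SLA:
    # Hypothetical SLA resource: KPIs plus the schedule for checking them.
    name: str
    kpis: List[KPI] = field(default_factory=list)
    # Cron-style string, stored as-is and not interpreted in this sketch.
    schedule: str = "0 * * * *"


def violations(sla: SLA, measurements: Dict[str, float]) -> List[str]:
    """Return the names of KPIs that are missing or above their threshold."""
    failed = []
    for kpi in sla.kpis:
        value = measurements.get(kpi.name)
        if value is None or value > kpi.threshold:
            failed.append(kpi.name)
    return failed


if __name__ == "__main__":
    sla = SLA("gold", [KPI("boot_server_seconds", 30.0),
                       KPI("delete_server_seconds", 10.0)])
    print(violations(sla, {"boot_server_seconds": 25.0,
                           "delete_server_seconds": 12.0}))
```

A REST endpoint would then mostly be CRUD on these two resources, with the
scheduler and the actual measurement runs living behind it.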
>
> It's really almost done. It is about 2 weeks of work.
>
>
>
>> I'm sure all of those things would be welcome additions to Tempest. At the
>> same time, Rally contributors would do well to work on an initial REST API
>> endpoint that would expose the resources I denoted above.
>
>
> As I said before, it's almost finished.
>
>
> Best regards,
> Boris Pavlovic
>
>
>
> On Mon, Aug 4, 2014 at 8:25 PM, Jay Pipes <jaypipes at gmail.com> wrote:
>
>> On 08/04/2014 11:21 AM, Boris Pavlovic wrote:
>>
>>> Rally is quite monolithic and can't be split
>>>
>>
>> I think this is one of the roots of the problem that folks like David
>> and Sean keep coming around to. If Rally were less monolithic, it
>> would be easier to say "OK, bring this piece into Tempest, have this
>> piece be a separate library and live in the QA program, and have the
>> service endpoint that allows operators to store and periodically measure
>> SLA performance indicators against their cloud."
>>
>> Incidentally, this is one of the problems that Scalr faced when applying
>> for incubation, years ago, and one of the reasons the PPB at the time voted
>> not to incubate Scalr: it had a monolithic design that crossed too many
>> different lines in terms of duplicating functionality that already existed
>> in a number of other projects.
>>
>>
>>> , that is why I think Rally should be a separated program (i.e.
>>> Rally scope is just different from QA scope). As well, It's not clear
>>> for me, why collaboration is possible only in case of one program? In
>>> my opinion collaboration & programs are irrelevant things.
>>>
>>
>> Sure, it's certainly possible for collaboration to happen across
>> programs. I think what Sean is alluding to is the fact that the Tempest
>> and Rally communities have done little collaboration to date, and that
>> is worrying to him.
>>
>>
>>> About collaboration between Rally & Tempest teams... Major goal of
>>> integration Tempest in Rally is to make it simpler to use tempest on
>>> production clouds via OpenStack API.
>>>
>>
>> Plenty of folks run Tempest without Rally against production clouds as
>> an acceptance test platform. I see no real benefit to arguing that Rally
>> is for running against production clouds and Tempest is for
>> non-production clouds. There just isn't much of a difference there.
>>
>> That said, an Operator Tools program is actually an entirely different
>> concept -- with a different audience and mission from the QA program. I
>> think you've seen here some initial support for such a proposed Operator
>> Tools program.
>>
>> The problem I see is that Rally is not *yet* exposing the REST service
>> endpoint that would make it a full-fledged Operator Tool outside the
>> scope of its current QA focus. Once Rally does indeed publish a REST API
>> that exposes resource endpoints for an operator to store a set of KPIs
>> associated with an SLA, and allows the operator to store the run
>> schedule that Rally would use to go and test such metrics, *then* would
>> be the appropriate time to suggest that Rally be the pilot project in
>> this new Operator Tools program, IMO.
>>
>>
>>> This work requires a lot of collaboration between teams. As you
>>> already mentioned, we should work on improving duration measurement and
>>> tempest.conf generation. I fully agree that this belongs in Tempest.
>>> By the way, the Rally team is already helping with this part.
>>>
>>> In my opinion, the end result should be something like: Rally just calls
>>> Tempest (or a couple of scripts from Tempest) and stores the results in
>>> its DB, presenting Tempest functionality to the end user via the
>>> OpenStack API. To get this done, we should implement the following
>>> features in Tempest:
>>> 1) Automatic tempest.conf generation
>>> 2) Production-ready cleanup: Tempest should be absolutely safe to run
>>>    against a cloud
>>> 3) Improvements related to time measurement
>>> 4) Integration of OSprofiler and Tempest
>>>
>>
>> I'm sure all of those things would be welcome additions to Tempest. At the
>> same time, Rally contributors would do well to work on an initial REST API
>> endpoint that would expose the resources I denoted above.
>>
>> Best,
>> -jay
>>
>>> So in any case I would prefer to continue collaboration..
>>>
>>> Thoughts?
>>>
>>>
>>> Best regards, Boris Pavlovic
>>>
>>>
>>>
>>>
>>> On Mon, Aug 4, 2014 at 4:24 PM, Sean Dague <sean at dague.net> wrote:
>>>
>>> On 07/31/2014 06:55 AM, Angus Salkeld wrote:
>>>
>>>> On Sun, 2014-07-27 at 07:57 -0700, Sean Dague wrote:
>>>>
>>>>> On 07/26/2014 05:51 PM, Hayes, Graham wrote:
>>>>>
>>>>>> On Tue, 2014-07-22 at 12:18 -0400, Sean Dague wrote:
>>>>>>
>>>>>>> On 07/22/2014 11:58 AM, David Kranz wrote:
>>>>>>>
>>>>>>>> On 07/22/2014 10:44 AM, Sean Dague wrote:
>>>>>>>>
>>>>>>>>> Honestly, I'm really not sure I see this as a different program, but is
>>>>>>>>> really something that should be folded into the QA program. I feel like
>>>>>>>>> a top level effort like this is going to lead to a lot of duplication in
>>>>>>>>> the data analysis that's currently going on, as well as functionality
>>>>>>>>> for better load driver UX.
>>>>>>>>>
>>>>>>>>> -Sean
>>>>>>>>>
>>>>>>>> +1 It will also lead to pointless discussions/arguments
>>>>>>>> about which activities are part of "QA" and which are part
>>>>>>>> of "Performance and Scalability Testing".
>>>>>>>>
>>>>>>>
>>>>>> I think that those discussions will still take place, it will just be on
>>>>>> a per repository basis, instead of a per program one.
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>>
>>>>>>> Right, 100% agreed. Rally would remain with its own repo + review team,
>>>>>>> just like grenade.
>>>>>>>
>>>>>>> -Sean
>>>>>>>
>>>>>>>
>>>>>> Is the concept of a separate review team not the point of a program?
>>>>>>
>>>>>> In the thread from Designate's Incubation request Thierry said [1]:
>>>>>>
>>>>>>> "Programs" just let us bless goals and teams and let them
>>>>>>> organize code however they want, with contribution to any
>>>>>>> code repo under that umbrella being considered "official" and
>>>>>>> ATC-status-granting.
>>>>>>>
>>>>>>
>>>>>> I do think that this is something that needs to be clarified by the TC -
>>>>>> Rally could not get a PTL if they were part of the QA project, but every
>>>>>> time we get a program request, the same discussion happens.
>>>>>>
>>>>>> I think that mission statements can be edited to fit new programs as
>>>>>> they occur, and that it is more important to let teams that have been
>>>>>> working closely together stay as a distinct group.
>>>>>>
>>>>>
>>>>> My big concern here is that many of the things that these efforts have
>>>>> been doing are things we actually want much closer to the base.
>>>>> For instance, metrics on Tempest runs.
>>>>>
>>>>> When Rally was first created it had its own load generator. It took a
>>>>> ton of effort to keep the team from duplicating that and instead just
>>>>> use some subset of Tempest. Then when measuring showed up, we actually
>>>>> said that is something that would be great in Tempest, so whoever ran
>>>>> it, be it for Testing, Monitoring, or Performance gathering, would have
>>>>> access to that data. But the Rally team went off in a corner and did it
>>>>> otherwise. That's caused the QA team to have to go and redo this work
>>>>> from scratch with subunit2sql, in a way that can be consumed by multiple
>>>>> efforts.
>>>>>
>>>>> So I'm generally -1 to this being a separate effort on the basis that so
>>>>> far the team has decided to stay in their own sandbox instead of
>>>>> participating actively where many of us think the functions should be
>>>>> added. I also think this isn't like Designate, because this isn't
>>>>> intended to be part of the integrated release.
>>>>>
>>>>
>>>> From reading Boris's email it seems like Rally will provide a
>>>> Horizon panel and an API to back it (for the operator to kick off
>>>> performance runs and view stats). So this does seem like something that
>>>> would be a part of the integrated release (if I am reading things
>>>> correctly).
>>>>
>>>> Is the QA program happy to extend their scope to include that? QA
>>>> could become "Quality Assurance of upstream code and running
>>>> OpenStack installations". If not we need to find some other program
>>>> for Rally.
>>>>
>>>
>>> I think that's realistically already the scope of the QA program, we
>>>  might just need to change the governance wording.
>>>
>>> Tempest has always been intended to be run on production clouds
>>> (public or private) to ensure proper function. Many operators are
>>> doing this today as part of normal health management. And we
>>> continue to evolve it to be something which works well in that
>>> environment.
>>>
>>> All the statistics collection / analysis parts in Rally today I think
>>> are basically things that should be part of any Tempest installation
>>> / run. It's cool that Rally did a bunch of work there, but having
>>> that code outside of Tempest is sort of problematic, especially as
>>> there are huge issues with the collection of that data because of
>>> missing timing information in subunit. So realistically to get
>>> accurate results there needs to be additional events added into
>>> Tempest tests to build this correctly. If you stare at the raw
>>> results here today they have such huge accuracy problems (due to
>>> unaccounted for time in setupClass, which is a known problem) to the
>>> point of being misleading, and possibly actually harmful.
>>>
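
The setupClass effect Sean mentions can be reproduced with plain unittest.
The sketch below is an illustration under assumptions: it uses neither
Tempest nor subunit, and the TimedResult and ExampleTest names are made up.
It only shows that per-test start/stop timestamps do not cover class-level
setup time.

```python
import time
import unittest


class TimedResult(unittest.TestResult):
    """Record a duration per test from startTest/stopTest events only."""

    def __init__(self):
        super().__init__()
        self.durations = {}
        self._started = {}

    def startTest(self, test):
        super().startTest(test)
        self._started[test.id()] = time.monotonic()

    def stopTest(self, test):
        self.durations[test.id()] = time.monotonic() - self._started[test.id()]
        super().stopTest(test)


class ExampleTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Stands in for expensive class-level resource creation.
        time.sleep(0.05)

    def test_fast(self):
        self.assertTrue(True)


def measure():
    """Return (wall clock time, sum of per-test durations, result)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = TimedResult()
    start = time.monotonic()
    suite.run(result)
    wall = time.monotonic() - start
    return wall, sum(result.durations.values()), result


if __name__ == "__main__":
    wall, per_test, _ = measure()
    print("unaccounted seconds: %.3f" % (wall - per_test))
```

The suite runs setUpClass before the first startTest event, so the 50 ms
spent there shows up in the wall clock time but in no individual test
duration. Anything computed from the per-test numbers alone inherits that
gap.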
>>> These are things that are fixable, but hard to do outside of the
>>> Tempest project itself. Exporting accurate timing / stats should be
>>> a feature close to the test load, not something that's done
>>> externally with guessing and fudge factors.
>>>
>>> So every time I look at the docs in Rally -
>>> https://github.com/stackforge/rally I see largely features that
>>> should be coming out of the test runners themselves.
>>>
>>> -Sean
>>>
>>> -- Sean Dague http://dague.net
>>>
>>>
>>> _______________________________________________ OpenStack-dev
>>> mailing list OpenStack-dev at lists.openstack.org
>>> <mailto:OpenStack-dev at lists.openstack.org>
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>