[openstack-dev] Which program for Rally
sean at dague.net
Thu Aug 7 12:36:00 UTC 2014
On 08/07/2014 07:31 AM, Rohan Kanade wrote:
> Date: Wed, 06 Aug 2014 09:44:12 -0400
> From: Sean Dague <sean at dague.net <mailto:sean at dague.net>>
> To: openstack-dev at lists.openstack.org
> <mailto:openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] Which program for Rally
> Message-ID: <53E2312C.8000309 at dague.net
> <mailto:53E2312C.8000309 at dague.net>>
> Content-Type: text/plain; charset=utf-8
> Like the fact that right now the rally team is proposing gate jobs which
> have some overlap to the existing largeops jobs. Did they start a
> conversation about it? Nope. They just went off to do their thing
> instead. https://review.openstack.org/#/c/112251/
> Hi Sean,
> Appreciate your analysis
> Here is a comparison of the tempest largeops job and similar in Rally.
> What large-ops job provides as of now:
> Running hard-coded, pre-configured benchmarks (in gates) that are taken
> from the tempest repo, e.g. "run 100 VMs in one request". The end result
> is a +1 or -1, which doesn't really reflect much in terms of performance
> stats and regressions.
That's true, it's a very coarse-grained benchmark. It's specifically
designed that way to catch and block regressions.
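The coarse-grained check described here reduces to a single threshold comparison with a binary verdict. A minimal sketch of that idea (the threshold and measured values are invented numbers, not taken from the actual largeops job):

```python
# Minimal sketch of a largeops-style gate check: one hard threshold,
# one binary verdict. Threshold and measured values are invented.
REGRESSION_THRESHOLD_SECONDS = 300  # assumed acceptable wall-clock budget

def gate_verdict(measured_seconds):
    """+1 if the run stayed within budget, -1 otherwise."""
    return +1 if measured_seconds <= REGRESSION_THRESHOLD_SECONDS else -1

print(gate_verdict(250))  # 1  -> within budget, change may merge
print(gate_verdict(420))  # -1 -> regression exceeds the threshold
```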
> What Rally job provides:
> (example in glance)
> 1) Projects can specify which benchmarks to run.
> 2) Projects can specify passing conditions and inputs to benchmarks
> (e.g. no benchmark iteration failed, and the average iteration duration
> is less than X)
> 3) Projects can create any number of benchmarks inside their source tree
> (so they don't need to merge anything to rally)
> 4) Users get automated reports of all benchmarks.
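As a concrete illustration of points 1-3, a per-project Rally task file might look like the sketch below. The scenario name, arguments, and SLA criteria here are illustrative assumptions, not taken from the glance job itself:

```json
{
    "NovaServers.boot_and_delete_server": [
        {
            "args": {
                "flavor": {"name": "m1.tiny"},
                "image": {"name": "cirros"}
            },
            "runner": {
                "type": "constant",
                "times": 100,
                "concurrency": 10
            },
            "sla": {
                "failure_rate": {"max": 0},
                "max_seconds_per_iteration": 60
            }
        }
    ]
}
```

Because a file like this lives in the project's own source tree, the project controls both the benchmark inputs and the pass/fail conditions without merging anything into rally.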
This is pretty, but I'm not sure how it provides me with information
about whether or not that was a good change to merge. There is a reason
that we reduced this to a decision in largeops to block a change when we
knew the regression exceeded a threshold we were comfortable with in a
very specific context.
> 5) Users can easily install Rally (with this script) and test
> benchmarks locally, using the same benchmark configuration as in the
> gate
> 6) Rally jobs (benchmarks) give you the ability to check for SLAs in
> the gate itself, which helps immensely in gauging the impact of a
> proposed change on the current code in terms of performance and SLA
> Basically, with the Rally job, one can benchmark changes and compare
> them against master in the gate.
> Using the approach below:
> 1) Push patch set 1, which changes the rally benchmark configuration
> and probably adds some benchmarks.
> Get base results.
> 2) Push patch set 2, which includes point 1 plus the changes that fix
> the issue.
> Get new results.
> 3) Compare the results, and if the new results are better, push patch
> set 3 that removes the changes in the task, and merge it.
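The compare-in-gate workflow above amounts to diffing two result sets. A minimal sketch of that comparison follows; the durations, the result shape, and the 5% tolerance are assumptions for illustration, and Rally's actual report format is richer than this:

```python
# Toy comparison of two benchmark runs, in the spirit of the
# patch-set workflow above. Durations are per-iteration seconds;
# the numbers and the 5% tolerance are illustrative assumptions.

def average(durations):
    return sum(durations) / len(durations)

def compare_runs(baseline, candidate, tolerance=0.05):
    """Return (verdict, relative_change) for candidate vs. baseline."""
    base_avg = average(baseline)
    cand_avg = average(candidate)
    change = (cand_avg - base_avg) / base_avg
    # Negative change means the candidate run was faster.
    if change < -tolerance:
        verdict = "better"
    elif change > tolerance:
        verdict = "worse"
    else:
        verdict = "no significant change"
    return verdict, change

baseline = [10.2, 9.8, 10.5, 10.1]   # patch set 1 ("base results")
candidate = [8.9, 9.1, 8.7, 9.0]     # patch set 2 ("new results")
verdict, change = compare_runs(baseline, candidate)
print(verdict, round(change * 100, 1))  # better -12.1
```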
How does Rally account for node variability in the cloud (both between
nodes, between clouds, between times of day)?
How does it provide the user with comparison between local runs and gate
runs to know if things got better or worse?
Getting numbers is step 1, but making those numbers something that can
actually be believed to be impacted directly by your change, rather than
by unrelated factors, is what really matters.
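The believability point can be made concrete: when run-to-run variation is large, a difference in averages smaller than the noise tells you nothing about the change under test. A toy sketch of such a sanity check (the overlap heuristic and all numbers here are assumptions, far cruder than a proper statistical test over many runs):

```python
import statistics

def is_signal(baseline, candidate):
    """Crude check: is the difference in means larger than the
    combined run-to-run spread? A real gate would need a proper
    statistical test over many runs on comparable nodes."""
    diff = abs(statistics.mean(candidate) - statistics.mean(baseline))
    spread = statistics.stdev(baseline) + statistics.stdev(candidate)
    return diff > spread

# Same 1-second average shift, very different noise levels (made-up data):
quiet_base, quiet_cand = [10.0, 10.1, 9.9], [9.0, 9.1, 8.9]
noisy_base, noisy_cand = [8.0, 12.0, 10.0], [7.0, 11.0, 9.0]

print(is_signal(quiet_base, quiet_cand))  # True: shift dwarfs the noise
print(is_signal(noisy_base, noisy_cand))  # False: shift lost in the noise
```

On shared cloud nodes with varying neighbors and times of day, most benchmark deltas look like the noisy case, which is exactly why raw gate numbers need this kind of scrutiny before blocking a change.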
We've had turbo-hipster providing feedback in the gate for a while on
benchmark data for the db migrations, and its false-negative rate is
actually quite high for many of these same reasons. Adding more false
failures in the gate is something I think we should avoid.