Open Stack

Fri Jun 3 20:12:55 UTC 2016

On Fri, Jun 03, 2016 at 01:53:16PM -0600, Matt Fischer wrote:
> On Fri, Jun 3, 2016 at 1:35 PM, Lance Bragstad <lbragstad at gmail.com> wrote:
> 
> > Hey all,
> >
> > I have been curious about the impact of providing performance feedback as part
> > of the review process. From what I understand, keystone used to have a
> > performance job that would run against proposed patches (I've only heard
> > about it so someone else will have to keep me honest about its timeframe),
> > but it sounds like it wasn't valued.
> >
> > I think revisiting this topic is valuable, but it raises a series of
> > questions.
> >
> > Initially it probably only makes sense to test a reasonable set of
> > defaults. What do we want these defaults to be? Should they be determined
> > by DevStack, openstack-ansible, or something else?
> >
> > What do the performance test criteria look like and where do they live?
> > Do they just consist of running tempest?
> >
> 
> Keystone especially has some calls that are used 1000x or more relative to
> others and so I'd be more concerned about them. For me this is token
> validation #1 and token creation #2. Tempest checks them of course but
> might be too coarse? There are token benchmarks like the ones Dolph and I
> use, but they don't mimic a real workflow.  Something to consider.
> 
> 
> 
> >
> > From a contributor and reviewer perspective, it would be nice to have the
> > ability to compare performance results across patch sets. I understand that
> > keeping all performance results for every patch for an extended period of
> > time is unrealistic. Maybe we take a daily performance snapshot against
> > master and use that to map performance patterns over time?
> >
> 
> Having some time series data captured would be super useful. Could we have
> daily charts stored indefinitely?

We are already doing this to a certain extent with results from the gate using
subunit2sql and openstack-health. I pointed Lance to this on IRC as an example:

http://status.openstack.org/openstack-health/#/test/tempest.api.identity.v3.test_tokens.TokensV3Test.test_create_token

That shows all the execution times for tempest's V3 test_create_token for
all runs in the gate and periodic queues (resampled to the hour). This is all
done automatically for everything that emits subunit (not just tempest jobs).
We're storing 6 months of data in the DB right now.
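
To make "resampled to the hour" concrete, here is a minimal sketch of the idea
in plain Python. It is not how openstack-health or subunit2sql implement it;
the function name resample_hourly and the sample run times are made up for
illustration, and averaging per hour is an assumption about the aggregation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def resample_hourly(samples):
    """Average run times per hour bucket.

    samples: iterable of (datetime, seconds) pairs.
    Returns a list of (hour_start, mean_seconds) sorted by hour.
    """
    buckets = defaultdict(list)
    for stamp, seconds in samples:
        # Truncate each timestamp to the start of its hour.
        hour = stamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(seconds)
    return sorted((hour, mean(vals)) for hour, vals in buckets.items())


# Hypothetical run times, not real gate data.
runs = [
    (datetime(2016, 6, 3, 18, 5), 1.0),
    (datetime(2016, 6, 3, 18, 40), 2.0),
    (datetime(2016, 6, 3, 19, 15), 4.0),
]
hourly = resample_hourly(runs)
```

Bucketing like this smooths out individual runs, which is what makes the
long-term trend readable on a graph.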

FWIW, I've written some blog posts about this and how to interact with it:

http://blog.kortar.org/?p=212

and

http://blog.kortar.org/?p=279

(although some of the info is a bit dated)

The issue with doing this in the gate, though, is that it's inherently noisy,
given that we're running everything in guests on multiple different public
clouds. It's impossible to get enough consistency in results to do any useful
benchmarking when looking at a single change (or even a small group of
changes). Good examples of this are some of tempest's scenario tests, like:

http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_volume_boot_pattern.TestVolumeBootPatternV2.test_volume_boot_pattern

which shows a normal variance of about 60sec between runs.
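
As a rough sketch of why a single run tells you so little, one simple check is
whether a new timing falls outside the spread of earlier runs. The helper name
outside_noise, the threshold of 2 standard deviations, and the baseline numbers
are all assumptions for illustration, not anything the gate tooling does.

```python
from statistics import mean, stdev


def outside_noise(baseline, candidate, k=2.0):
    """Return True if candidate is more than k standard deviations
    away from the mean of the baseline run times."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    center = mean(baseline)
    spread = stdev(baseline)
    return abs(candidate - center) > k * spread


# Hypothetical run times in seconds, spanning roughly 60 seconds.
baseline = [100.0, 130.0, 160.0, 115.0, 145.0]

slower_by_20 = outside_noise(baseline, 150.0)  # 20s above the mean
slower_by_70 = outside_noise(baseline, 200.0)  # 70s above the mean
```

With that much spread in the baseline, a run 20 seconds slower than average is
indistinguishable from noise; only a much larger shift stands out.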

-Matt Treinish

> 
> 
> 
> >
> > Have any other projects implemented a similar workflow?
> >
> > I'm open to suggestions and discussions because I can't imagine there
> > aren't other folks out there interested in these kinds of pre-merge data
> > points.
> >
> > Thanks!
> >
> > Lance
> >


[openstack-dev] [keystone][all] Incorporating performance feedback into the review process
