[openstack-dev] [keystone][all] Incorporating performance feedback into the review process

Morgan Fainberg morgan.fainberg at gmail.com
Fri Jun 10 22:58:42 UTC 2016

On Fri, Jun 10, 2016 at 3:26 PM, Lance Bragstad <lbragstad at gmail.com> wrote:

>    1. I care about performance. I just believe that a big hurdle has been
>    finding infrastructure that allows us to run performance tests in a
>    consistent manner. Dedicated infrastructure plays a big role in this,
>    which is hard (if not impossible) to obtain in the gate - making the gate
>    a suboptimal place for performance testing. Consistency is also an issue
>    because the gate is comprised of resources donated from several different
>    providers. Matt lays this out pretty well in his reply above. This sounds
>    like a TODO to hook rally into the keystone-performance/ansible pipeline,
>    then we would have rally and keystone running on bare metal.
>

This was one of the BIGGEST reasons rally was not given much credence in
keystone. The wild variations made the rally data mostly noise. We couldn't
even tell whether data from similar nodes (same provider/same AZ) was
available. At the time the gate job was enabled, that made it a best-guess
effort to decide "is this an issue with a node being slow, or the patch".
This is also why I wouldn't support re-enabling rally as an in-infra
gate/check job. The data was extremely difficult to consume as a developer
because I'd have to either run rally locally myself (fine, but why waste
infra resources then?) or try to correlate data across different patches
and different AZ providers. It's great to see this being addressed here.
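
To make the noise problem concrete, here is a minimal sketch (not anything the rally job actually ran). The timing values, the function names, and the 2x-standard-deviation threshold are all made up for illustration. It only shows why a small slowdown is invisible against a noisy baseline but visible against a consistent one:

```python
import statistics


def noise_ratio(samples):
    """Coefficient of variation: sample standard deviation / mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def regression_visible(baseline, slowdown_ms, threshold=2.0):
    """True if a slowdown of `slowdown_ms` exceeds `threshold` times the
    run-to-run standard deviation of the baseline timings."""
    return slowdown_ms > threshold * statistics.stdev(baseline)


# Hypothetical timings in milliseconds for the same operation.
mixed_nodes = [40.0, 95.0, 55.0, 120.0, 70.0]  # nodes from varying providers
dedicated = [50.0, 51.0, 49.0, 50.5, 49.5]     # one consistent host

# A hypothetical 6 ms slowdown introduced by a patch.
print(regression_visible(mixed_nodes, 6.0))
print(regression_visible(dedicated, 6.0))
```

With numbers like these, the slowdown is lost in the spread of the mixed nodes, while the dedicated host makes it stand out.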

>    2. See response to #5.
>    3. What were the changes made to keystone that caused rally to fail?
>    If you have some links I'd be curious to revisit them and improve them if I
>    can.
>

When there were failures, the Rally team did not look at them, and they were
not performance failures at the time; rally could not be set up or run at
all.

>    4. Blocked because changes weren't reviewed? As far as I know
>    OSProfiler is in keystone's default pipeline.
>

OSProfiler etc. had security concerns and issues, and the patches were
basically left in "review state" after being given a clear "do X to have it
approved". I want to point out that once the performance team came back and
addressed the issues we landed support for OSProfiler, and it is in
keystone. It is not enabled by default (profiling should be opt in, and I
stand by that), but you are correct that we landed it.

>    5. It doesn't look like there are any open patches for rally
>    integration with keystone [0]. The closed ones have either been
>    merged [1][2][3][4] or abandoned [5][6][7][8] because they were
>    work-in-progress or unattended.
>
> I'm only looking for this bot to leave a comment. I don't intend on it
> being a voting job any time soon, it's just providing a datapoint for
> patches that we suspect to have an impact on performance. It's running on
> dedicated hardware, but only from a single service provider - so mileage
> may vary depending on where and how you run keystone. But, it does take us
> a step in the right direction. People don't have to use it if they don't
> want to.
>

I'm super happy to see a consistent report leaving data about performance,
specifically in a consistent environment that isn't going to vary massively
between runs (hopefully). Long term, I'd also like to see this (if it
doesn't already) track a delta-over-time of keystone performance on merged
patches, so we can see the timeline of performance.
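
As a rough illustration of what a delta-over-time report could compute, here is a small sketch. The commit ids, the latency numbers, and the function names are hypothetical; this is not how the keystone-performance bot is implemented. It takes one measurement per merged patch, in merge order, and reports the change against the previous one:

```python
def deltas_over_time(history):
    """history: list of (commit_id, mean_latency_ms) in merge order.

    Returns a list of (commit_id, delta_ms) where delta_ms is the change
    relative to the previously merged commit.
    """
    return [
        (commit, round(latency - prev_latency, 3))
        for (_, prev_latency), (commit, latency) in zip(history, history[1:])
    ]


def largest_regression(history):
    """Return the (commit_id, delta_ms) pair with the biggest slowdown,
    or None if there are fewer than two measurements."""
    deltas = deltas_over_time(history)
    return max(deltas, key=lambda item: item[1]) if deltas else None


# Hypothetical measurements taken on the same dedicated hardware.
history = [
    ("aaa111", 50.0),
    ("bbb222", 50.4),
    ("ccc333", 58.9),
    ("ddd444", 59.1),
]

print(deltas_over_time(history))
print(largest_regression(history))  # ('ccc333', 8.5)
```

A timeline like this only means something if every measurement comes from the same consistent environment, which is the point of running on dedicated hardware.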

> Thanks for the feedback!
> [0] https://review.openstack.org/#/q/project:openstack/keystone+message:%22%255E%2540rally%2540%22
> [1] https://review.openstack.org/#/c/240251/
> [2] https://review.openstack.org/#/c/188457/
> [3] https://review.openstack.org/#/c/188352/
> [4] https://review.openstack.org/#/c/90405/
> [5] https://review.openstack.org/#/c/301367/
> [6] https://review.openstack.org/#/c/188479/
> [7] https://review.openstack.org/#/c/98836/
> [8] https://review.openstack.org/#/c/91677/
Great work, Lance!


> On Fri, Jun 10, 2016 at 4:26 PM, Boris Pavlovic <boris at pavlovic.me> wrote:
>> Lance,
>> It is an amazing effort, and I wish you good luck with the Keystone team.
>> However, I faced some issues when I started a similar effort
>> about 3 years ago with Rally. Here are some points that are going to be
>> very useful for you:
>>    1. I think that the Keystone team doesn't care about performance &
>>    scalability at all
>>    2. The Keystone team ignored/discarded all help from the Rally team to
>>    make this effort successful
>>    3. When the Rally job started failing because of performance issues
>>    introduced in Keystone, they decided to remove the job
>>    4. They blocked work on OSProfiler almost forever, so we are blind and
>>    can't see where the issue is in the code
>>    5. They didn't help to develop any Rally plugin or even review the
>>    Rally test cases that we proposed to them
>> Best regards,
>> Boris Pavlovic
>> On Mon, Jun 6, 2016 at 10:45 AM, Clint Byrum <clint at fewbar.com> wrote:
>>> Excerpts from Brant Knudson's message of 2016-06-03 15:16:20 -0500:
>>> > On Fri, Jun 3, 2016 at 2:35 PM, Lance Bragstad <lbragstad at gmail.com> wrote:
>>> >
>>> > > Hey all,
>>> > >
>>> > > I have been curious about the impact of providing performance
>>> > > feedback as part of the review process. From what I understand,
>>> > > keystone used to have a performance job that would run against
>>> > > proposed patches (I've only heard about it so someone else will have
>>> > > to keep me honest about its timeframe), but it sounds like it wasn't
>>> > > valued.
>>> > >
>>> > >
>>> > We had a job running rally for a year (I think) that nobody ever
>>> > looked at so we decided it was a waste and stopped running it.
>>> >
>>> > > I think revisiting this topic is valuable, but it raises a series of
>>> > > questions.
>>> > >
>>> > > Initially it probably only makes sense to test a reasonable set of
>>> > > defaults. What do we want these defaults to be? Should they be
>>> > > determined by DevStack, openstack-ansible, or something else?
>>> > >
>>> > >
>>> > A performance test is going to depend on the environment (the
>>> > machines, disks, network, etc.), the existing data (tokens,
>>> > revocations, users, etc.), and the config (fernet, uuid, caching,
>>> > etc.). If these aren't consistent between runs then the results are
>>> > not going to be usable. (This is the problem with running rally on
>>> > infra hardware.) If the data isn't realistic (1000s of tokens, etc.)
>>> > then the results are going to be at best not useful or at worst
>>> > misleading.
>>> >
>>> That's why I started the counter-inspection spec:
>>> http://specs.openstack.org/openstack/qa-specs/specs/devstack/counter-inspection.html
>>> It just tries to count operations, and graph those. I've, unfortunately,
>>> been pulled off to other things of late, but I do intend to loop back
>>> and hit this hard over the next few months to try and get those graphs.
>>> What we'd get initially is just graphs of how many messages we push
>>> through RabbitMQ, and how many rows/queries/transactions we push through
>>> mysql. We may also want to add counters like how many API requests
>>> happened, and how many retries happen inside the code itself.
>>> There's a _TON_ we can do now to ensure that we know what the trends are
>>> when something gets "slow", so we can look for a gradual "death by 1000
>>> papercuts" trend or a hockey stick that can be tied to a particular
>>> commit.
>>>
>>> > > What do the performance test criteria look like and where do they
>>> > > live? Does it just consist of running tempest?
>>> > >
>>> > >
>>> > I don't think tempest is going to give us the numbers that we're
>>> > looking for for performance. I've seen a few scripts and have my own
>>> > for testing performance of token validation, token creation, user
>>> > creation, etc. which I think will do the exact tests we want and we
>>> > can get the results formatted however we like.
>>> >
>>> Agreed that tempest will only give a limited view. Ideally one would
>>> also test things like "after we've booted 1000 vms, do we end up reading
>>> 1000 more rows, or 1000 * 1000 more rows?"
>>>
>>> > > From a contributor and reviewer perspective, it would be nice to
>>> > > have the ability to compare performance results across patch sets.
>>> > > I understand that keeping all performance results for every patch
>>> > > for an extended period of time is unrealistic. Maybe we take a daily
>>> > > performance snapshot against master and use that to map performance
>>> > > patterns over time?
>>> > >
>>> > >
>>> > Where are you planning to store the results?
>>> >
>>> Infra has a graphite/statsd cluster which is made for collecting metrics
>>> on tests. It might need to be expanded a bit, but it should be
>>> relatively cheap to do so given the benefit of having some of these
>>> numbers.
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev