[Openstack-operators] nova-api response time

Vladimir Prokofev v at prokofev.me
Fri May 5 14:00:39 UTC 2017


Writing things down always helps.
So I dug a little further, and seems that keystone is the weakest link.
It appears that it takes 500-600ms to generate a token, similar time to
validate it. This means that first time application accesses the API
request will run at least 1000ms, or even more. Second time will be
significantly faster because you no longer need to request a token, and
validation is done via memcached.
But for some reason cached token expires very fast, in a matter of
minutes(I didn't actually measured exact time), so at some point validation
will once again take half a second.
If your app written in a bad way, i.e. for every request it creates a new
auth session, this means that it will run very very slowly, as every
request will take half a second to get a token.

So now it's a question of determining why keystone is so slow? Btw, I use
keystone 10.0.0 with fernet tokens and apache WSGI(5 processes, 1 thread)
on three controller nodes behind haproxy.
I found this article[1] about keystone benchmarking, and it seems that it
can run significantly faster.

So once again:
 - what are your average times for token creation/validation?
 - is this procedure computationally extensive and requires faster
CPU(fernet implies encryption/decription)?
 - appreciate a link on a good doc on how to check and increases memcached
caching timers.

By the way, I think this topic now should be named "openstack API response
times in general", rather than strictly tie to nova-api.

[1]
https://docs.openstack.org/developer/rally/overview/stories/keystone/authenticate.html

2017-05-05 13:38 GMT+03:00 Vladimir Prokofev <v at prokofev.me>:

> Hello Ops.
>
> So I had a feeling that my Openstack API is running a bit slow, like it
> takes a couple of seconds to get a full list of all instances in Horizon
> for example, which is kind of a lot for a basic MySQL select-and-present
> job. This got me wondering to how can I measure execution times and find
> out how long it takes for any part of the API request to execute.
>
> So I got a first REST client I could find(it happens to be a Google Chrome
> application called ARC(Advanced REST Client)), decided how and what I want
> to measure and got to it.
>
> I'm doing a basic /servers request to public nova-api endpoint using
> X-Auth-Token I got earlier from keystone. Project that I pull has only
> single instance in it.
> From clients perspective there're 4 major parts of request: connection
> time, send time, wait time, and receive time.
> Connection time, send time, and receive time are all negligible, as they
> take about 5-10ms combined. The major part is wait time, which can take
> from 130ms to 2500ms.
> Wait time is directly correlated to the execution time that I can see in
> nova-api log, with about 5-10ms added lag, which is expected and perfectly
> normal as I understand it.
>
> First run after some time is always the slowest one. and takes from 900ms
> to 2500ms. If I do the same request in the next few minutes it will be
> faster - 130 to 200ms. My guess is that there's some caching
> involved(memcached?).
>
> So my questions are:
>  - what execution times do you usually see in your environment for
> nova-api, or other APIs for that matter? I have a feeling that even 130ms
> is very slow for such a simple task;
>  - I assumed that first request is so slow because it pulls data from
> MySQL which resides on slow storage(3 node Galera cluster, each node is a
> KVM domain based on 2 SATA 7200 HDD in software RAID1), so I enabled slow
> query logs for MySQL with 500ms threshold, but I see nothing, even when
> wait time > 2000ms. Seems MySQL has nothing to do with it;
>  - one more idea is that it takes a lot of time for keystone to validate a
> token. With expired token I get 401 response in 500-600ms first time, and
> 30-40ms next time I do a request;
>  - another idea about slow storage is that nova-api has to write
> something(a log for example), before sending response, but I don't know how
> to test this exactly, rather that just transfer my controller nodes on an
> SSD;
>  - is there a way I can profile nova-api, and see exactly which method
> takes the most time?
>
> I understand that I didn't exactly described my setup. This is due to the
> fact that I'm more interested in learning how to profile API components,
> rather than just plainly fix my setup(e.g. make it faster). In the end as I
> see it it's either a storage related issue(fixed by transferring
> controllers and database to a fast storage like SSD), or computing power
> issue(fixed by placing controllers on a dedicated bare-metal with fast
> CPU/memory). I have a desire not to guess, but to know exactly which is it
> :)
> I will describe setup in detail if such request is made.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-operators/attachments/20170505/2635ea84/attachment.html>


More information about the OpenStack-operators mailing list