[Openstack-operators] nova-api response time

Matt Riedemann mriedemos at gmail.com
Mon May 8 02:58:23 UTC 2017


On 5/5/2017 9:00 AM, Vladimir Prokofev wrote:
> Writing things down always helps.
> So I dug a little further, and seems that keystone is the weakest link.
> It appears that it takes 500-600ms to generate a token, similar time to
> validate it. This means that first time application accesses the API
> request will run at least 1000ms, or even more. Second time will be
> significantly faster because you no longer need to request a token, and
> validation is done via memcached.
> But for some reason cached token expires very fast, in a matter of
> minutes(I didn't actually measured exact time), so at some point
> validation will once again take half a second.
> If your app written in a bad way, i.e. for every request it creates a
> new auth session, this means that it will run very very slowly, as every
> request will take half a second to get a token.
>
> So now it's a question of determining why keystone is so slow? Btw, I
> use keystone 10.0.0 with fernet tokens and apache WSGI(5 processes, 1
> thread) on three controller nodes behind haproxy.
> I found this article[1] about keystone benchmarking, and it seems that
> it can run significantly faster.
>
> So once again:
>  - what are your average times for token creation/validation?
>  - is this procedure computationally extensive and requires faster
> CPU(fernet implies encryption/decription)?
>  - appreciate a link on a good doc on how to check and increases
> memcached caching timers.
>
> By the way, I think this topic now should be named "openstack API
> response times in general", rather than strictly tie to nova-api.
>
> [1] https://docs.openstack.org/developer/rally/overview/stories/keystone/authenticate.html
>
> 2017-05-05 13:38 GMT+03:00 Vladimir Prokofev <v at prokofev.me
> <mailto:v at prokofev.me>>:
>
>     Hello Ops.
>
>     So I had a feeling that my Openstack API is running a bit slow, like
>     it takes a couple of seconds to get a full list of all instances in
>     Horizon for example, which is kind of a lot for a basic MySQL
>     select-and-present job. This got me wondering to how can I measure
>     execution times and find out how long it takes for any part of the
>     API request to execute.
>
>     So I got a first REST client I could find(it happens to be a Google
>     Chrome application called ARC(Advanced REST Client)), decided how
>     and what I want to measure and got to it.
>
>     I'm doing a basic /servers request to public nova-api endpoint using
>     X-Auth-Token I got earlier from keystone. Project that I pull has
>     only single instance in it.
>     From clients perspective there're 4 major parts of request:
>     connection time, send time, wait time, and receive time.
>     Connection time, send time, and receive time are all negligible, as
>     they take about 5-10ms combined. The major part is wait time, which
>     can take from 130ms to 2500ms.
>     Wait time is directly correlated to the execution time that I can
>     see in nova-api log, with about 5-10ms added lag, which is expected
>     and perfectly normal as I understand it.
>
>     First run after some time is always the slowest one. and takes from
>     900ms to 2500ms. If I do the same request in the next few minutes it
>     will be faster - 130 to 200ms. My guess is that there's some caching
>     involved(memcached?).
>
>     So my questions are:
>      - what execution times do you usually see in your environment for
>     nova-api, or other APIs for that matter? I have a feeling that even
>     130ms is very slow for such a simple task;
>      - I assumed that first request is so slow because it pulls data
>     from MySQL which resides on slow storage(3 node Galera cluster, each
>     node is a KVM domain based on 2 SATA 7200 HDD in software RAID1), so
>     I enabled slow query logs for MySQL with 500ms threshold, but I see
>     nothing, even when wait time > 2000ms. Seems MySQL has nothing to do
>     with it;
>      - one more idea is that it takes a lot of time for keystone to
>     validate a token. With expired token I get 401 response in 500-600ms
>     first time, and 30-40ms next time I do a request;
>      - another idea about slow storage is that nova-api has to write
>     something(a log for example), before sending response, but I don't
>     know how to test this exactly, rather that just transfer my
>     controller nodes on an SSD;
>      - is there a way I can profile nova-api, and see exactly which
>     method takes the most time?
>
>     I understand that I didn't exactly described my setup. This is due
>     to the fact that I'm more interested in learning how to profile API
>     components, rather than just plainly fix my setup(e.g. make it
>     faster). In the end as I see it it's either a storage related
>     issue(fixed by transferring controllers and database to a fast
>     storage like SSD), or computing power issue(fixed by placing
>     controllers on a dedicated bare-metal with fast CPU/memory). I have
>     a desire not to guess, but to know exactly which is it :)
>     I will describe setup in detail if such request is made.
>
>
>
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>

I can't answer your questions about Keystone and tokens but you asked a 
question about ways (tools) to profile the time spent in a nova-api 
request to list servers, and Nova has supported the osprofiler tool [1] 
since the Newton release. That might help you out with profiling.

[1] https://docs.openstack.org/developer/osprofiler/

-- 

Thanks,

Matt



More information about the OpenStack-operators mailing list