[Openstack] [Keystone] performance issues after havana upgrade

Morgan Fainberg m at metacloud.com
Sun Jan 12 03:57:24 UTC 2014


Sounds good!  Just remember that prior to the fix I posted there, each token in the user’s index incurred a round-trip to memcached to validate that the token wasn’t expired.  This change means there are significantly fewer trips from keystone to memcached.
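
To make the difference concrete, here is a rough Python sketch of the two approaches. It is not the code from the review: `CountingCache`, `prune_index_old`, `prune_index_new`, and the `usertokens-`/`token-` key names are all hypothetical, and the cache is an in-memory stand-in rather than a real memcached client. It only counts how many cache calls each strategy needs to prune one user's token index.

```python
import time


class CountingCache:
    """In-memory stand-in for a memcached client that counts round trips."""

    def __init__(self):
        self._data = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._data.get(key)

    def set(self, key, value):
        self.round_trips += 1
        self._data[key] = value


def prune_index_old(cache, user_key, now):
    """Index holds bare token ids; each id costs one get to check expiry."""
    token_ids = cache.get(user_key) or []
    live = []
    for token_id in token_ids:
        token = cache.get("token-" + token_id)
        if token is not None and token["expires"] > now:
            live.append(token_id)
    cache.set(user_key, live)
    return live


def prune_index_new(cache, user_key, now):
    """Index holds (token_id, expires) pairs; expiry is checked locally."""
    entries = cache.get(user_key) or []
    live = [(tid, exp) for tid, exp in entries if exp > now]
    cache.set(user_key, live)
    return live


def demo(num_tokens=100):
    """Return (old_round_trips, new_round_trips) for one index prune."""
    now = time.time()
    old, new = CountingCache(), CountingCache()
    ids = [str(i) for i in range(num_tokens)]
    for tid in ids:
        old.set("token-" + tid, {"expires": now + 3600})
    old.set("usertokens-jon", ids)
    new.set("usertokens-jon", [(tid, now + 3600) for tid in ids])
    # Reset the counters so only the pruning step is measured.
    old.round_trips = new.round_trips = 0
    prune_index_old(old, "usertokens-jon", now)
    prune_index_new(new, "usertokens-jon", now)
    return old.round_trips, new.round_trips
```

In this sketch the old approach costs one cache call per token plus two for the index itself, while the new one costs two calls regardless of how many tokens the user holds. The trade-off is that each index entry is larger, which is why fewer tokens fit in a single per-user index.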

If this doesn’t 100% solve the issue, we should start digging further into what is going on, but I am confident this will (at the very least) help a reasonable amount.

—Morgan 

On January 11, 2014 at 19:04:59, Jonathan Proulx (jon at jonproulx.com) wrote:

On Sat, Jan 11, 2014 at 8:24 PM, Morgan Fainberg <m at metacloud.com> wrote:  
> Hi Jon,  
>  
> I have published a patch set that I hope will help to address this issue:  
> https://review.openstack.org/#/c/66149/ . If you need this in another  
> format, please let me know.  

That's a fine format. Also, I love patches that only touch one file (plus tests).  

> The only caveat is that you should expect the maximum number of tokens  
> per-user to drop by as much as 50% due to storing extra data about the token  
> (avoiding the need to ask memcache about each token on issuance of a new  
> one). I still recommend keeping the expires time down to ~3600 seconds or  
> so. I also recommend doing a flush (like you did before) when deploying  
> this change.  
>  
> Let me know if this helps (or if you have any issues with it). Feel free to  
> respond via email or comment on the review. Disclaimer: I have not  
> performed functional performance tests on this code, just some initial  
> cleanup and change of logic that should help minimize external calls.  

I'll try it out tomorrow and let you know. The memcache item counts at  
failure are 1/32nd to 1/16th what you were suggesting as an upper  
bound, so I'm not sure this is quite the issue I'm having, but I hope  
it is.  

By juggling a variety of things I've managed to stabilize things at a  
poor but functional stage. I have keystone running behind apache with  
12 processes set, though far fewer than that seem to actually run at  
any given time. I needed to bring the ttl down to 600s, which makes  
the dashboard hard to use :) It's difficult to draw conclusions from  
this comparison, but right after flushing and restarting, 'nova list'  
on a small tenant takes about 1.5sec to return; at the stable state  
(about 1 ttl later) this same operation takes on the order of 30sec  
(25-40). At this point memcache only reports just over 1k items,  
which doesn't seem like a lot to me (at 3600s ttl it gets up to about  
2.5k but things stop working, so I'd guess it locks up around 1800s, but  
I've not fully explored the range).  

Thanks,  
-Jon  