[openstack-dev] [Keystone][Oslo] Caching tokens in auth token middleware

Jay Pipes jaypipes at gmail.com
Fri Mar 1 22:59:30 UTC 2013


On 03/01/2013 01:18 PM, Vishvananda Ishaya wrote:
> Hi Everyone,
> 
> So I've been doing some profiling of api calls against devstack and
> I've discovered that a significant portion of time spent is in the
> auth_token middleware validating the PKI token. There is code to turn
> on caching of the token if memcache is enabled, but this seems like
> overkill in most cases. We should be caching the token in memory by
> default. Fortunately, nova has some nifty code that will use an
> in-memory cache if memcached isn't available.

We gave up on PKI in Folsom after weeks of trouble with it:

* Unstable -- Endpoints would stay up, but after around 24 hours
(sometimes sooner) the endpoint would stop working properly: the
service user would suddenly be returned a 401 when trying to validate a
token. Restarting the endpoint with a service nova-api restart gets rid
of the 401 Unauthorized for a few hours, and then it happens again.

* Unable to use memcache with PKI. The PKI token was longer than
memcached's maximum key length (250 bytes) and resulted in errors on
every request. The solution was to hash the CMS token and use the hash
as the memcache key, but unfortunately that fix wasn't backported to
Folsom Keystone -- partly, I think, because the auth_token middleware
was split out into keystoneclient during Grizzly.
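
For reference, the workaround amounts to something like this (a minimal
sketch, not the actual keystoneclient patch):

    import hashlib

    MEMCACHE_MAX_KEY_LENGTH = 250

    def cache_key_for(token):
        # PKI/CMS tokens run to several KB, far over memcached's
        # 250-byte key limit, so reduce them to a fixed-length digest.
        # Short (e.g. UUID) tokens can be used as-is.
        if len(token) > MEMCACHE_MAX_KEY_LENGTH:
            digest = hashlib.md5(token.encode('utf-8')).hexdigest()
            return 'tokens/%s' % digest
        return 'tokens/%s' % token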

In any case, the above two things make PKI unusable in Folsom.

We fell back on UUID tokens -- the default in Folsom. Unfortunately,
there are serious performance issues with this approach as well. Every
single request to an endpoint results in multiple requests to Keystone,
which bogs down the system.

In addition to the obvious roundtrip issues, with just 26 users in a
test cloud, after 3 weeks there were over 300K records in the tokens
table on a VERY lightly used cloud. Not good. Luckily, we use
multi-master MySQL replication (Galera) with excellent write rates
spread across four cluster nodes, but this scale of writes for such a
small test cluster is worrying to say the least.

Although not related to PKI, I've also noticed that the decision to use
a denormalized schema in the users table -- with the "extra" column
storing a JSON-encoded blob of data including the user's default tenant
and enabled flag -- is a horrible performance problem. I hope that v3
Keystone has corrected these issues in the SQL driver.
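
To illustrate the problem (hypothetical code, but this is the shape of
it, assuming a SQLAlchemy-style session and a User model with an
"extra" text column): anything that filters on a field buried in the
blob has to fetch rows and decode JSON client-side instead of letting
MySQL use an indexed column:

    import json

    def list_enabled_users(session):
        # 'enabled' lives inside the JSON 'extra' blob, so there is no
        # WHERE enabled = 1 that an index could serve; every row comes
        # back and gets decoded in Python.
        enabled = []
        for user in session.query(User).all():
            extra = json.loads(user.extra or '{}')
            if extra.get('enabled', True):
                enabled.append(user)
        return enabled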

> 
> 1) Shim the code into the wsgi stack using the configuration options designed for swift:
> 
> https://review.openstack.org/23236
> 
> This is my least favorite option since changing paste config is a
> pain for deployers and it doesn't help any of the other projects.

Meh, whether you add options to a config file or to a paste INI file,
it's the same pain for deployers :) But I generally agree with you.
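
For anyone who hasn't seen the swift-style approach: swift runs a cache
middleware early in the pipeline and hands the cache object to
downstream middleware through the WSGI environ, roughly like this
(illustrative sketch; see the review above for the real wiring):

    class CacheMiddleware(object):
        """Inject a shared cache into the WSGI environ so downstream
        middleware (e.g. auth_token) can reuse it instead of building
        its own connection."""

        def __init__(self, app, cache):
            self.app = app
            self.cache = cache

        def __call__(self, environ, start_response):
            # swift uses the 'swift.cache' environ key for this
            environ.setdefault('swift.cache', self.cache)
            return self.app(environ, start_response)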

> 2) Copy the code into keystoneclient:
> 
> https://review.openstack.org/23307
> 
> 3) Move memorycache into oslo and sync it to nova and keystoneclient:
> 
> https://review.openstack.org/23306
> https://review.openstack.org/23308
> https://review.openstack.org/23309
> 
> I think 3) is the right long term move, but I'm not sure if this is
> appropriate considering how close we are to the grizzly release, so
> if we want to do 2) immediately and postpone 3) until H, that is fine
> with me.

Well, I think 3) is the right thing to do in any case, and can be done
in oslo regardless of Nova's RC status.
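
The nova code in question is essentially a factory that returns a real
memcache client when servers are configured and a dict-backed stand-in
otherwise. From memory, it looks roughly like this (a sketch, not the
exact module):

    import time

    _now = time.time  # alias; the 'time' kwarg below shadows the module

    class FakeMemcacheClient(object):
        """In-process stand-in mimicking the bits of the
        python-memcached client API that auth_token needs."""

        def __init__(self):
            self.cache = {}

        def get(self, key):
            timeout, value = self.cache.get(key, (0, None))
            if timeout == 0 or _now() < timeout:
                return value
            del self.cache[key]
            return None

        def set(self, key, value, time=0, min_compress_len=0):
            # Same signature as python-memcached; time=0 never expires.
            timeout = _now() + time if time else 0
            self.cache[key] = (timeout, value)
            return True

    def get_client(memcached_servers=None):
        if memcached_servers:
            import memcache
            return memcache.Client(memcached_servers, debug=0)
        return FakeMemcacheClient()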

Not sure that 2) is really all that useful. If you are in any serious
production environment, you're going to be using memcached anyway.

Best,
-jay

> Thoughts?
> 
> Vish