[openstack-dev] [Keystone] Best way to do something MySQL-specific?

Adam Young ayoung at redhat.com
Thu Jul 11 02:17:24 UTC 2013

On 07/09/2013 07:33 PM, Jay Pipes wrote:
> On 07/08/2013 05:18 PM, Sean Dague wrote:
>> On 07/01/2013 01:35 PM, Clint Byrum wrote:
>>> The way the new keystone-manage command "token_flush" works right now
>>> is quite broken by MySQL and InnoDB's gap locking behavior:
>>> https://bugs.launchpad.net/1188378
>>> Presumably other SQL databases like PostgreSQL will have similar 
>>> problems
>>> with doing massive deletes, but I am less familiar with them.
>>> I am trying to solve this in keystone, and my first attempt is here:
>>> https://review.openstack.org/#/c/32044/
>>> However, MySQL does not support using "LIMIT" in a sub-query that
>>> is feeding an IN() clause, so that approach will not work. Likewise,
>>> sqlalchemy does not support the MySQL specific extension to DELETE 
>>> which
>>> allows it to have a LIMIT clause.
>>> Now, I can do some hacky things, like just deleting all of the expired
>>> tokens from the oldest single second, but that could also potentially
>>> be millions of tokens, and thus, millions of gaps to lock.
>>> So, there is just not one way to work for all databases, and we have to
>>> have a special mode for MySQL.
>>> I was wondering if anybody has suggestions and/or examples of how to do
>>> that with sqlalchemy.
>> Honestly, my answer is typically to ask Jay, he understands a lot of the
>> ways to get SQLA to do the right thing in mysql.
> LOL, /me blushes.
> In this case, I'd propose something like this, which should work fine 
> for any database:
> cutoff = timeutils.utcnow() - 60  # one minute ago...
> # DELETE in 500 record chunks
> q = session.query(
>         TokenModel.id).filter(
>             TokenModel.expires < cutoff)).limit(500)
> while True:
>     results = q.all()
>     if len(results):
>         ids_to_delete = [r[0] for r in results]
>         session.query(TokenModel).filter(
>             TokenModel.id.in_(ids_to_delete)).delete()
>     else:
>         break
> Code not tested, use with caution, YMMV, etc etc...

It seems to me that it would still have the problem described in the 
original post.  Even if you are only deleteing 500 at atime, all of the 
tokens from the original query will be locked...but I guess that the 
"add new" behavior will only have to check against a subset of the 
tokens in the database.

Are we really generating so many tokens that deleting the expired tokens 
once per second is prohibitive?  That points to something else being wrong.

> Best,
> -jay
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

More information about the OpenStack-dev mailing list