[openstack-dev] [Keystone] Best way to do something MySQL-specific?
ayoung at redhat.com
Thu Jul 11 03:26:31 UTC 2013
On 07/10/2013 11:11 PM, Clint Byrum wrote:
> Excerpts from Adam Young's message of 2013-07-10 19:17:24 -0700:
>> On 07/09/2013 07:33 PM, Jay Pipes wrote:
>>> On 07/08/2013 05:18 PM, Sean Dague wrote:
>>>> On 07/01/2013 01:35 PM, Clint Byrum wrote:
>>>>> The way the new keystone-manage command "token_flush" works right now
>>>>> is quite broken by MySQL and InnoDB's gap locking behavior.
>>>>> Presumably other SQL databases like PostgreSQL will have similar
>>>>> problems with massive deletes, but I am less familiar with them.
>>>>> I am trying to solve this in keystone, and my first attempt is here:
>>>>> However, MySQL does not support using "LIMIT" in a sub-query that
>>>>> is feeding an IN() clause, so that approach will not work. Likewise,
>>>>> sqlalchemy does not support the MySQL-specific extension to DELETE
>>>>> that allows it to have a LIMIT clause.
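
For anyone following along, the two DELETE shapes in question look
roughly like this. Untested sketch: the DSN is made up, and the token
table/column names are assumptions, not taken from keystone's schema:

    import datetime

    from sqlalchemy import create_engine, text

    engine = create_engine('mysql://keystone:secret@localhost/keystone')
    cutoff = datetime.datetime.utcnow() - datetime.timedelta(minutes=1)

    # Portable form, which MySQL rejects: it does not support LIMIT
    # inside a sub-query feeding an IN() clause.
    engine.execute(text(
        "DELETE FROM token WHERE id IN "
        "(SELECT id FROM token WHERE expires < :cutoff LIMIT 500)"),
        cutoff=cutoff)

    # MySQL-specific form that works, but sqlalchemy's expression
    # language has no way to render DELETE ... LIMIT.
    engine.execute(text(
        "DELETE FROM token WHERE expires < :cutoff LIMIT 500"),
        cutoff=cutoff)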
>>>>> Now, I can do some hacky things, like just deleting all of the expired
>>>>> tokens from the oldest single second, but that could also potentially
>>>>> be millions of tokens, and thus, millions of gaps to lock.
>>>>> So, there is just not one way to work for all databases, and we have to
>>>>> have a special mode for MySQL.
>>>>> I was wondering if anybody has suggestions and/or examples of how to do
>>>>> that with sqlalchemy.
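
The "oldest single second" hack mentioned above would come out
something like this. Untested sketch; session and TokenModel are
keystone's, the rest is my assumption:

    import datetime

    from sqlalchemy import func

    now = datetime.datetime.utcnow()
    # Find the oldest expiry among already-expired tokens.
    oldest = session.query(func.min(TokenModel.expires)).filter(
        TokenModel.expires < now).scalar()
    if oldest is not None:
        # Delete just that one second's worth of expired tokens,
        # which, as noted above, could still be millions of rows.
        session.query(TokenModel).filter(
            TokenModel.expires < oldest + datetime.timedelta(seconds=1)
        ).delete(synchronize_session=False)
        session.commit()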
>>>> Honestly, my answer is typically to ask Jay, he understands a lot of the
>>>> ways to get SQLA to do the right thing in mysql.
>>> LOL, /me blushes.
>>> In this case, I'd propose something like this, which should work fine
>>> for any database:
>>> import datetime
>>>
>>> cutoff = timeutils.utcnow() - datetime.timedelta(seconds=60)  # one minute ago...
>>> # DELETE in 500-record chunks
>>> q = session.query(TokenModel.id).filter(
>>>     TokenModel.expires < cutoff).limit(500)
>>> while True:
>>>     results = q.all()
>>>     if not results:
>>>         break
>>>     ids_to_delete = [r.id for r in results]
>>>     session.query(TokenModel).filter(
>>>         TokenModel.id.in_(ids_to_delete)).delete(
>>>         synchronize_session=False)
>>>     session.commit()
>>> Code not tested, use with caution, YMMV, etc etc...
>> It seems to me that it would still have the problem described in the
>> original post. Even if you are only deleting 500 at a time, all of the
>> tokens from the original query will be locked...but I guess that the
>> "add new" behavior will only have to check against a subset of the
>> tokens in the database.
> Because it is only a read, only the index being used must be locked,
> and only the gaps in the range that were seen. So token.expires <= cutoff
> will be locked. Nothing creating new tokens will run into this.
> Of course, the locks can be avoided altogether with READ COMMITTED or
> READ UNCOMMITTED. The problem only manifests in a SELECT when using
> REPEATABLE READ.
Is this something we can implement? I realize it would be MySQL-specific.
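Something like this, maybe (untested, the DSN is made up, and it
assumes a reasonably recent sqlalchemy); the isolation level can be set
on the engine or per-connection:

    from sqlalchemy import create_engine

    # Engine-wide: every connection runs READ COMMITTED, avoiding the
    # next-key/gap locks InnoDB takes under REPEATABLE READ.
    engine = create_engine(
        'mysql://keystone:secret@localhost/keystone',
        isolation_level='READ COMMITTED')

    # Or per-connection, so only the token flush runs at the weaker level:
    conn = engine.connect().execution_options(
        isolation_level='READ COMMITTED')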
>> Are we really generating so many tokens that deleting the expired tokens
>> once per second is prohibitive? That points to something else being wrong.
> In a proof-of-concept deployment with somewhat constant load testing
> I have seen an average of 20k - 40k tokens per hour expire. It seems to
> me that tokens are simply not being reused, as that is about 10 -
> 20 times the number of actual operations done in that hour.
Yeah, that is why we are trying to push python-keyring as a client-side
token caching solution. They should be reused.
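The caching itself is simple. A rough sketch (the service name and key
layout here are made up, not python-keyring's or keystoneclient's
conventions):

    import keyring

    SERVICE = 'openstack-keystone'

    def get_cached_token(username):
        # Reuse a previously issued token instead of asking keystone
        # for a new one on every operation.
        return keyring.get_password(SERVICE, username)

    def cache_token(username, token_id):
        keyring.set_password(SERVICE, username, token_id)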
> Running token_flush in a loop with sleep 1 would indeed work here. But
> if I ever stop running that in a loop, even for 15 minutes, then the
> tokens will back up and I'll be stuck with a massive delete again.
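
For reference, that loop amounts to something like this (a hypothetical
wrapper, not something keystone ships):

    import subprocess
    import time

    # Flush every second so no single pass ever covers a large range
    # of expired tokens.
    while True:
        subprocess.check_call(['keystone-manage', 'token_flush'])
        time.sleep(1)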
> I will test Jay's solution; it is closest to what I originally attempted
> but does not suffer from MySQL's limitations, and it will likely be a
> single method that works reasonably well for all databases.