[openstack-dev] [Keystone] Best way to do something MySQL-specific?

Adam Young ayoung at redhat.com
Thu Jul 11 03:26:31 UTC 2013


On 07/10/2013 11:11 PM, Clint Byrum wrote:
> Excerpts from Adam Young's message of 2013-07-10 19:17:24 -0700:
>> On 07/09/2013 07:33 PM, Jay Pipes wrote:
>>> On 07/08/2013 05:18 PM, Sean Dague wrote:
>>>> On 07/01/2013 01:35 PM, Clint Byrum wrote:
>>>>> The way the new keystone-manage command "token_flush" works right now
>>>>> is quite broken by MySQL and InnoDB's gap locking behavior:
>>>>>
>>>>> https://bugs.launchpad.net/1188378
>>>>>
>>>>> Presumably other SQL databases like PostgreSQL will have similar
>>>>> problems
>>>>> with doing massive deletes, but I am less familiar with them.
>>>>>
>>>>> I am trying to solve this in keystone, and my first attempt is here:
>>>>>
>>>>> https://review.openstack.org/#/c/32044/
>>>>>
>>>>> However, MySQL does not support using "LIMIT" in a sub-query that
>>>>> is feeding an IN() clause, so that approach will not work. Likewise,
>>>>> sqlalchemy does not support the MySQL specific extension to DELETE
>>>>> which
>>>>> allows it to have a LIMIT clause.
>>>>>
>>>>> Now, I can do some hacky things, like just deleting all of the expired
>>>>> tokens from the oldest single second, but that could also potentially
>>>>> be millions of tokens, and thus, millions of gaps to lock.
>>>>>
>>>>> So, there is just not one way to work for all databases, and we have to
>>>>> have a special mode for MySQL.
>>>>>
>>>>> I was wondering if anybody has suggestions and/or examples of how to do
>>>>> that with sqlalchemy.
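For concreteness, here is a rough sketch of the two shapes under discussion:
the subquery-feeding-IN() delete, which MySQL rejects when the subquery
carries a LIMIT, and MySQL's own DELETE ... LIMIT extension, which
SQLAlchemy's query API does not expose. TokenModel, the token table name,
and the cutoff variable are illustrative placeholders, not the actual
Keystone code:

    from sqlalchemy import text

    # 1) LIMIT inside a subquery feeding IN() -- MySQL refuses to execute
    #    this form, which is why the review above does not work as-is.
    subq = session.query(TokenModel.id).filter(
        TokenModel.expires < cutoff).limit(500)
    session.query(TokenModel).filter(
        TokenModel.id.in_(subq)).delete(synchronize_session=False)

    # 2) The MySQL-only DELETE ... LIMIT, which would have to be issued as
    #    raw SQL because SQLAlchemy's DELETE construct has no LIMIT clause.
    session.execute(
        text("DELETE FROM token WHERE expires < :cutoff LIMIT 500"),
        {"cutoff": cutoff})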
>>>> Honestly, my answer is typically to ask Jay, he understands a lot of the
>>>> ways to get SQLA to do the right thing in mysql.
>>> LOL, /me blushes.
>>>
>>> In this case, I'd propose something like this, which should work fine
>>> for any database:
>>>
>>> import datetime
>>>
>>> cutoff = timeutils.utcnow() - datetime.timedelta(minutes=1)  # one minute ago...
>>>
>>> # DELETE in 500 record chunks
>>> q = session.query(
>>>         TokenModel.id).filter(
>>>             TokenModel.expires < cutoff).limit(500)
>>> while True:
>>>     results = q.all()
>>>     if len(results):
>>>         ids_to_delete = [r[0] for r in results]
>>>         session.query(TokenModel).filter(
>>>             TokenModel.id.in_(ids_to_delete)).delete(
>>>                 synchronize_session=False)
>>>     else:
>>>         break
>>>
>>> Code not tested, use with caution, YMMV, etc etc...
>>
>> It seems to me that it would still have the problem described in the
>> original post.  Even if you are only deleting 500 at a time, all of the
>> tokens from the original query will be locked... but I guess that the
>> "add new" behavior will only have to check against a subset of the
>> tokens in the database.
>>
> Because it is only a read, only the index being used must be locked,
> and only the gaps in the range that were seen. So token.expires <= cutoff
> will be locked. Nothing creating new tokens will run into this.
>
> Of course, the locks can be avoided altogether with READ COMMITTED or
> READ UNCOMMITTED. The problem only manifests in a SELECT when using
> REPEATABLE READ.

Is this something we can implement?  I realize it would be MySQL-specific.
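If it were, a minimal sketch with SQLAlchemy might look like the following:
run the purge on a connection whose isolation level is dropped to READ
COMMITTED, so the range SELECT takes no gap locks. The engine URL and
session wiring here are illustrative, not how Keystone actually builds
its sessions:

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    engine = create_engine("mysql://keystone:secret@localhost/keystone")
    Session = sessionmaker(bind=engine)

    # Lower the isolation level for just this connection; InnoDB defaults
    # to REPEATABLE READ, which is what causes the gap locking.
    conn = engine.connect().execution_options(
        isolation_level="READ COMMITTED")
    session = Session(bind=conn)
    try:
        # ... run the chunked delete from the snippet above here ...
        session.commit()
    finally:
        conn.close()

READ COMMITTED is also valid for PostgreSQL (where it is already the
default), so the option itself would not have to be MySQL-only.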

>
>> Are we really generating so many tokens that deleting the expired tokens
>> once per second is prohibitive?  That points to something else being wrong.
>>
> In a proof of concept deployment with somewhat constant load testing
> I have seen an average of 20k - 40k tokens per hour expire. Seems to
> me that tokens are simply not being reused, as that is about 10 -
> 20 times the number of actual operations done in that hour.
Yeah, that is why we are trying to push python-keyring as a client-side
token caching solution. Tokens should be reused.
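Just to illustrate the client-side idea, the bare keyring library can cache
a token between runs along these lines; the service name and the
authentication helper are hypothetical, and the real integration lives in
python-keystoneclient:

    import keyring

    SERVICE = "keystone-demo"   # hypothetical keyring service name
    USERNAME = "demo-user"

    token = keyring.get_password(SERVICE, USERNAME)
    if token is None:
        # authenticate_and_get_token() is a stand-in for a real auth call.
        token = authenticate_and_get_token()
        keyring.set_password(SERVICE, USERNAME, token)
    # Reuse `token` for subsequent requests instead of minting a new one.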



>
> Running flush_tokens in a loop with sleep 1 would indeed work here. But
> if I ever stop running that in a loop, even for 15 minutes, then the
> tokens will back up and I'll be stuck with a massive delete again.
>
> I will test Jay's solution, it is closest to what I originally attempted
> but does not suffer from MySQL's limitations and will likely be a single
> method that works reasonably well for all dbs.
>