Open Stack

Wed Jul 22 19:41:37 UTC 2015

This indicates that the bottleneck is not strictly the db; it is also related to the way we match. This means we need to spend some serious cycles on improving both the stored record(s) for revocation events and the matching algorithm.


> On Jul 22, 2015, at 11:51, Matt Fischer <matt at mattfischer.com> wrote:
> 
> Dolph,
> 
> Per our IRC discussion, I was unable to see any performance improvement here, although not calling DELETE so often will reduce the number of deadlocks when we're under heavy load, especially given the globally replicated DB we use.
> 
> 
> 
>> On Tue, Jul 21, 2015 at 5:26 PM, Dolph Mathews <dolph.mathews at gmail.com> wrote:
>> Well, you might be in luck! Morgan Fainberg actually implemented an improvement that was apparently documented by Adam Young way back in March: 
>> 
>>   https://bugs.launchpad.net/keystone/+bug/1287757
>> 
>> There's a link to the stable/kilo backport in comment #2 - I'd be eager to hear how it performs for you!
>> 
>>> On Tue, Jul 21, 2015 at 5:58 PM, Matt Fischer <matt at mattfischer.com> wrote:
>>> Dolph,
>>> 
>>> Excuse the delayed reply; I was waiting for a brilliant solution from someone. Without one, I'd personally prefer the cronjob, as it seems to be the type of thing cron was designed for. That will be a painful change since people now rely on this behavior, so I don't know if it's feasible. I will be setting up monitoring for the revocation count, alerting me if it crosses probably 500 or so. If the problem gets worse, then I think a custom no-op or SQL driver is the next step.
>>> 
>>> Thanks.
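
A minimal sketch of the monitoring check described in this message, for illustration only. The table name `revocation_event` comes from the DELETE statement logged earlier in the thread; the function names and the way the alert is reported are hypothetical, and the demo uses an in-memory SQLite table rather than a real keystone database.

```python
import sqlite3

ALERT_THRESHOLD = 500  # the "probably 500 or so" figure from the message


def count_revocation_events(conn):
    """Return the number of rows in the revocation_event table."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM revocation_event")
    return cur.fetchone()[0]


def check_revocation_events(conn, threshold=ALERT_THRESHOLD):
    """Return (count, should_alert) for the given DB-API connection."""
    count = count_revocation_events(conn)
    return count, count > threshold


if __name__ == "__main__":
    # Demo only: a real deployment would pass a connection to the
    # keystone database instead of this throwaway SQLite table.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE revocation_event (revoked_at TEXT)")
    demo.executemany(
        "INSERT INTO revocation_event VALUES (?)",
        [("2015-07-15 18:55:41",)] * 600,
    )
    count, alert = check_revocation_events(demo)
    print(count, alert)
```

The check only reads the table, so it should not add to the DELETE contention discussed in the thread; how the alert is delivered is left to whatever monitoring system is in use.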
>>> 
>>> 
>>>> On Wed, Jul 15, 2015 at 4:00 PM, Dolph Mathews <dolph.mathews at gmail.com> wrote:
>>>> 
>>>> 
>>>>> On Wed, Jul 15, 2015 at 4:51 PM, Matt Fischer <matt at mattfischer.com> wrote:
>>>>> I'm having some issues with keystone revocation events. The bottom line is that due to the way keystone handles the clean-up of these events[1], having more than a few leads to:
>>>>> 
>>>>>  - bad performance: up to 2x slower token validation with about 600 events, based on my perf measurements.
>>>>>  - database deadlocks, which cause API calls to fail and seem more likely as the number of events grows.
>>>>> 
>>>>> I am seeing this behavior in code from trunk on June 11 using Fernet tokens, but the token backend does not seem to make a difference.
>>>>> 
>>>>> Here's what happens to the db in terms of deadlock:
>>>>> 2015-07-15 21:25:41.082 31800 TRACE keystone.common.wsgi DBDeadlock: (OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'DELETE FROM revocation_event WHERE revocation_event.revoked_at < %s' (datetime.datetime(2015, 7, 15, 18, 55, 41, 55186),)
>>>>> 
>>>>> When this starts happening, I just go truncate the table, but this is not ideal. If [1] is really true, then the design is not great: it sounds like keystone is doing a revocation event clean-up on every token validation call. Reading and deleting/locking from my db cluster is not something I want to do on every validate call.
>>>> 
>>>> Unfortunately, that's *exactly* what keystone is doing. Adam and I had a conversation about this problem in Vancouver which directly resulted in opening the bug referenced on the operator list:
>>>> 
>>>>   https://bugs.launchpad.net/keystone/+bug/1456797
>>>> 
>>>> Neither of us remembered the actual implemented behavior, which is what you've run into and Deepti verified in the bug's comments.
>>>>  
>>>>> 
>>>>> So, can I turn off token revocation for now? I didn't see an obvious no-op driver.
>>>> 
>>>> Not sure how, other than writing your own no-op driver, or perhaps an extended driver that doesn't try to clean the table on every read?
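
One way to picture "a driver that doesn't try to clean the table on every read" is a wrapper that lets the cleanup run at most once per interval. This is a sketch under stated assumptions, not keystone code: `ThrottledPruner`, `maybe_prune`, and the `prune` callable are hypothetical names, and `prune` stands in for whatever actually deletes expired revocation events.

```python
import time


class ThrottledPruner:
    """Run an expensive cleanup callable at most once per interval.

    Hypothetical helper, not part of keystone.
    """

    def __init__(self, prune, min_interval=300.0, clock=time.monotonic):
        self._prune = prune                # callable that deletes expired events
        self._min_interval = min_interval  # seconds between cleanup runs
        self._clock = clock                # injectable so tests can fake time
        self._last_run = None

    def maybe_prune(self):
        """Call prune() if the interval has elapsed; return True if it ran."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._min_interval:
            return False
        self._last_run = now
        self._prune()
        return True
```

A read path would call `maybe_prune()` where it currently cleans up unconditionally. The throttle is per process, so several API workers would each still prune on their own schedule; it reduces the number of DELETE statements rather than eliminating concurrent ones.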
>>>>  
>>>>> And in the long run, can this be fixed? I'd rather do almost anything else, including writing a cronjob, than what happens now.
>>>> 
>>>> If anyone has a better solution than the current one, one that's also better than requiring a cron job on something like keystone-manage revocation_flush, I'd love to hear it.
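
For comparison, the cron-job alternative could look roughly like the sketch below: a standalone purge that issues the same kind of DELETE seen in the deadlock trace, but on a schedule instead of on every validation. The function name and the `max_age` retention window are assumptions for illustration, and the `?` placeholder is SQLite's parameter style (MySQL drivers typically use `%s`).

```python
import datetime
import sqlite3


def purge_expired_events(conn, now, max_age):
    """Delete revocation events older than now - max_age; return rows removed.

    Hypothetical helper: conn is any DB-API connection using the qmark
    parameter style, and revoked_at is assumed to be stored as ISO text.
    """
    cutoff = now - max_age
    cur = conn.cursor()
    cur.execute(
        "DELETE FROM revocation_event WHERE revoked_at < ?",
        (cutoff.isoformat(sep=" "),),
    )
    conn.commit()
    return cur.rowcount
```

Run from cron on a single node, this keeps the cleanup out of the token validation path entirely, at the cost of making operators responsible for scheduling it.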
>>>> 
>>>>> 
>>>>> [1] - http://lists.openstack.org/pipermail/openstack-operators/2015-June/007210.html
>>>>> 
>>>>> __________________________________________________________________________
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

[openstack-dev] [keystone] token revocation woes