[openstack-dev] [keystone] token revocation woes
Lance Bragstad
lbragstad at gmail.com
Thu Jul 23 14:45:31 UTC 2015
On Wed, Jul 22, 2015 at 10:06 PM, Adam Young <ayoung at redhat.com> wrote:
> On 07/22/2015 05:39 PM, Adam Young wrote:
>
> On 07/22/2015 03:41 PM, Morgan Fainberg wrote:
>
> This is an indicator that the bottleneck is not the db strictly speaking,
> but also related to the way we match. This means we need to spend some
> serious cycles on improving both the stored record(s) for revocation events
> and the matching algorithm.
>
>
> The simplest approach to revocation checking is to do a linear search
> through the events. I think the old version of the code that did that is
> in a code review, and I will pull it out.
>
> If we remove the tree, then the matching will have to run through each of
> the records and see if there is a match; the test will be linear with the
> number of records (slightly shorter if a token is actually revoked).
>
>
> This was the original, linear search version of the code.
>
>
> https://review.openstack.org/#/c/55908/50/keystone/contrib/revoke/model.py,cm
>
>
>
What initially landed for Revocation Events was the tree structure, right?
We didn't land a linear approach prior to that and then switch to the tree,
did we?
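
For anyone following along, here is a minimal sketch of what a linear revocation check looks like in principle. It is not the code from the review linked above; the `RevocationEvent` and `Token` classes, their fields, and the wildcard-on-None rule are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class RevocationEvent:
    # Hypothetical, simplified event: a None field acts as a wildcard.
    revoked_at: datetime
    user_id: Optional[str] = None
    project_id: Optional[str] = None


@dataclass(frozen=True)
class Token:
    # Hypothetical, simplified token data.
    issued_at: datetime
    user_id: str
    project_id: str


def matches(event: RevocationEvent, token: Token) -> bool:
    """Return True if this single event revokes the token."""
    if token.issued_at > event.revoked_at:
        return False  # token was issued after the revocation happened
    if event.user_id is not None and event.user_id != token.user_id:
        return False
    if event.project_id is not None and event.project_id != token.project_id:
        return False
    return True


def is_revoked(events: Iterable[RevocationEvent], token: Token) -> bool:
    """Linear scan over all events; stops early on the first match."""
    return any(matches(event, token) for event in events)
```

The cost grows with the number of stored events for a token that is still valid, and is slightly lower for a revoked token because the scan stops at the first matching event.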
>
> On Jul 22, 2015, at 11:51, Matt Fischer <matt at mattfischer.com> wrote:
>
> Dolph,
>
> Per our IRC discussion, I was unable to see any performance improvement
> here, although not calling DELETE so often will reduce the number of
> deadlocks when we're under heavy load, especially given the globally
> replicated DB we use.
>
>
>
> On Tue, Jul 21, 2015 at 5:26 PM, Dolph Mathews <dolph.mathews at gmail.com>
> wrote:
>
>> Well, you might be in luck! Morgan Fainberg actually implemented an
>> improvement that was apparently documented by Adam Young way back in
>> March:
>>
>> https://bugs.launchpad.net/keystone/+bug/1287757
>>
>> There's a link to the stable/kilo backport in comment #2 - I'd be eager
>> to hear how it performs for you!
>>
>> On Tue, Jul 21, 2015 at 5:58 PM, Matt Fischer <matt at mattfischer.com>
>> wrote:
>>
>>> Dolph,
>>>
>>> Excuse the delayed reply; I was waiting for a brilliant solution from
>>> someone. Without one, I'd personally prefer the cron job, as it seems to be
>>> the type of thing cron was designed for. That will be a painful change, as
>>> people now rely on this behavior, so I don't know if it's feasible. I will be
>>> setting up monitoring for the revocation count, with an alert if it
>>> crosses probably 500 or so. If the problem gets worse, then I think a custom
>>> no-op or SQL driver is the next step.
>>>
>>> Thanks.
>>>
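
The revocation-count alert Matt describes could be sketched roughly as below. This is only an illustration: it uses an in-memory SQLite database as a stand-in for the real cluster, the `revocation_event` table name comes from the DELETE statement quoted further down, and the helper names and the `id` column are assumptions.

```python
import sqlite3

ALERT_THRESHOLD = 500  # the "500 or so" figure mentioned above


def make_demo_db(event_count: int) -> sqlite3.Connection:
    """Build an in-memory stand-in for the revocation_event table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revocation_event (id INTEGER PRIMARY KEY, revoked_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO revocation_event (revoked_at) VALUES (?)",
        [("2015-07-15T18:55:41",)] * event_count,
    )
    conn.commit()
    return conn


def revocation_count(conn: sqlite3.Connection) -> int:
    """Count the rows currently stored in revocation_event."""
    (count,) = conn.execute("SELECT COUNT(*) FROM revocation_event").fetchone()
    return count


def should_alert(conn: sqlite3.Connection, threshold: int = ALERT_THRESHOLD) -> bool:
    """Return True once the event count exceeds the threshold."""
    return revocation_count(conn) > threshold
```

A monitoring check would call `should_alert` against the production database on whatever schedule the monitoring system uses.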
>>>
>>> On Wed, Jul 15, 2015 at 4:00 PM, Dolph Mathews <dolph.mathews at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jul 15, 2015 at 4:51 PM, Matt Fischer <matt at mattfischer.com>
>>>> wrote:
>>>>
>>>>> I'm having some issues with keystone revocation events. The bottom
>>>>> line is that due to the way keystone handles the clean-up of these
>>>>> events[1], having more than a few leads to:
>>>>>
>>>>> - bad performance: token validation is up to 2x slower with about 600
>>>>> events, based on my perf measurements.
>>>>> - database deadlocks, which cause API calls to fail and seem more likely
>>>>> with more events
>>>>>
>>>>> I am seeing this behavior in code from trunk on June 11 using Fernet
>>>>> tokens, but the token backend does not seem to make a difference.
>>>>>
>>>>> Here's what happens to the db in terms of deadlock:
>>>>> 2015-07-15 21:25:41.082 31800 TRACE keystone.common.wsgi DBDeadlock:
>>>>> (OperationalError) (1213, 'Deadlock found when trying to get lock; try
>>>>> restarting transaction') 'DELETE FROM revocation_event WHERE
>>>>> revocation_event.revoked_at < %s' (datetime.datetime(2015, 7, 15, 18, 55,
>>>>> 41, 55186),)
>>>>>
>>>>> When this starts happening, I just go truncate the table, but this
>>>>> is not ideal. If [1] is really true, then the design is not great: it sounds
>>>>> like keystone is doing a revocation event clean-up on every token
>>>>> validation call. Reading and deleting/locking from my db cluster is not
>>>>> something I want to do on every validate call.
>>>>>
>>>>
>>>> Unfortunately, that's *exactly* what keystone is doing. Adam and I
>>>> had a conversation about this problem in Vancouver which directly resulted
>>>> in opening the bug referenced on the operator list:
>>>>
>>>> https://bugs.launchpad.net/keystone/+bug/1456797
>>>>
>>>> Neither of us remembered the actual implemented behavior, which is
>>>> what you've run into and Deepti verified in the bug's comments.
>>>>
>>>>
>>>>>
>>>>> So, can I turn off token revocation for now? I didn't see an obvious
>>>>> no-op driver.
>>>>>
>>>>
>>>> Not sure how, other than writing your own no-op driver, or perhaps an
>>>> extended driver that doesn't try to clean the table on every read?
>>>>
>>>>
>>>>> And in the long run, can this be fixed? I'd rather do almost anything
>>>>> else, including writing a cron job, than what happens now.
>>>>>
>>>>
>>>> If anyone has a better solution than the current one, that's also
>>>> better than requiring a cron job on something like keystone-manage
>>>> revocation_flush, I'd love to hear it.
>>>>
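
To make the cron-style alternative concrete, here is a rough sketch of an out-of-band purge that runs the same kind of DELETE as the one in the deadlock trace, but from a scheduled job and in small batches rather than on every validation. It is not keystone code: SQLite stands in for the real database, the `id` column, the function name, and the batch size are assumptions, and whether batching helps on a globally replicated cluster is untested here.

```python
import sqlite3
from datetime import datetime


def purge_expired_events(
    conn: sqlite3.Connection, cutoff: datetime, batch_size: int = 100
) -> int:
    """Delete events revoked before `cutoff` in batches; return rows deleted."""
    # Timestamps are assumed to be stored as fixed-width ISO 8601 strings,
    # so string comparison matches chronological order.
    cutoff_text = cutoff.isoformat(timespec="microseconds")
    total = 0
    while True:
        cursor = conn.execute(
            "DELETE FROM revocation_event WHERE id IN ("
            " SELECT id FROM revocation_event WHERE revoked_at < ? LIMIT ?)",
            (cutoff_text, batch_size),
        )
        conn.commit()  # commit per batch to keep each transaction short
        if cursor.rowcount == 0:
            return total
        total += cursor.rowcount
```

The validation path would then only read from the table, and the job could be scheduled as often as the expiry window requires.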
>>>>
>>>>> [1] -
>>>>> http://lists.openstack.org/pipermail/openstack-operators/2015-June/007210.html
>>>>>
>>>>>
>>>>> __________________________________________________________________________
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe:
>>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>>
>>>>>