[openstack-dev] [keystone] token revocation woes
Adam Young
ayoung at redhat.com
Thu Jul 23 15:23:59 UTC 2015
On 07/23/2015 10:45 AM, Lance Bragstad wrote:
>
> On Wed, Jul 22, 2015 at 10:06 PM, Adam Young <ayoung at redhat.com> wrote:
>
> On 07/22/2015 05:39 PM, Adam Young wrote:
>> On 07/22/2015 03:41 PM, Morgan Fainberg wrote:
>>> This is an indicator that the bottleneck is not strictly the db,
>>> but is also related to the way we match. This means we need to
>>> spend some serious cycles on improving both the stored record(s)
>>> for revocation events and the matching algorithm.
>>
>> The simplest approach to revocation checking is to do a linear
>> search through the events. I think the old version of the code
>> that did that is in a code review, and I will pull it out.
>>
>> If we remove the tree, then the matching will have to run through
>> each of the records and see if there is a match; the cost will
>> be linear in the number of records (slightly shorter if a token
>> is actually revoked).
>
> This was the original, linear-search version of the code.
>
> https://review.openstack.org/#/c/55908/50/keystone/contrib/revoke/model.py,cm
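To save people clicking through: the linear matcher boils down to
roughly the following. This is a sketch with hypothetical field names,
not the exact code from that review:

    def is_revoked(token, events):
        # Any matching event means the token is revoked; cost is
        # linear in the number of events.
        return any(matches(event, token) for event in events)

    def matches(event, token):
        # An event matches only if every attribute it specifies
        # agrees with the token; unset attributes act as wildcards.
        for attr in ('user_id', 'project_id', 'domain_id'):
            value = getattr(event, attr, None)
            if value is not None and value != token.get(attr):
                return False
        # The event only revokes tokens issued before its cutoff.
        return token['issued_at'] <= event.revoked_at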
>
> What initially landed for Revocation Events was the tree-structure,
> right? We didn't land a linear approach prior to that and then switch
> to the tree, did we?
Correct. The tree version made it into the review stack at about
revision 60 or so.
It might be useful to think about a hybrid approach. For example, it
makes no sense to do a linear search for revoke-by-userid events: that
will be the most common kind of revocation, and also the one that will
benefit the most from a speed-up. A binary search would probably make
more sense there. It does mean the events would have to be kept
sorted, though, and the sorting might be more costly than the search
itself, depending on how often each is performed.
The one ordering the events should always maintain is by the "revoked
before" time, and the cutoff point in that ordering can be found via a
binary search.
>>>
>>> On Jul 22, 2015, at 11:51, Matt Fischer <matt at mattfischer.com> wrote:
>>>
>>>> Dolph,
>>>>
>>>> Per our IRC discussion, I was unable to see any performance
>>>> improvement here, although not calling DELETE so often should
>>>> reduce the number of deadlocks we hit under heavy load,
>>>> especially given the globally replicated DB we use.
>>>>
>>>> On Tue, Jul 21, 2015 at 5:26 PM, Dolph Mathews <dolph.mathews at gmail.com> wrote:
>>>>
>>>> Well, you might be in luck! Morgan Fainberg actually
>>>> implemented an improvement that was apparently documented
>>>> by Adam Young way back in March:
>>>>
>>>> https://bugs.launchpad.net/keystone/+bug/1287757
>>>>
>>>> There's a link to the stable/kilo backport in comment #2 -
>>>> I'd be eager to hear how it performs for you!
>>>>
>>>> On Tue, Jul 21, 2015 at 5:58 PM, Matt Fischer <matt at mattfischer.com> wrote:
>>>>
>>>> Dolph,
>>>>
>>>> Excuse the delayed reply; I was waiting for a brilliant
>>>> solution from someone. Without one, I'd personally prefer
>>>> the cron job, as it seems to be exactly the kind of thing
>>>> cron was designed for. That would be a painful change,
>>>> though, since people now rely on the current behavior, so I
>>>> don't know if it's feasible. I will be setting up monitoring
>>>> on the revocation count, alerting me if it crosses roughly
>>>> 500. If the problem gets worse, then I think a custom no-op
>>>> or SQL driver is the next step.
>>>>
>>>> Thanks.
>>>>
>>>> On Wed, Jul 15, 2015 at 4:00 PM, Dolph Mathews <dolph.mathews at gmail.com> wrote:
>>>>
>>>> On Wed, Jul 15, 2015 at 4:51 PM, Matt Fischer <matt at mattfischer.com> wrote:
>>>>
>>>> I'm having some issues with keystone revocation
>>>> events. The bottom line is that, due to the way
>>>> keystone handles the clean-up of these
>>>> events[1], having more than a few leads to:
>>>>
>>>> - bad performance: up to 2x slower token
>>>> validation with about 600 events, based on my
>>>> perf measurements.
>>>> - database deadlocks, which cause API calls to
>>>> fail, and which seem to become more likely as
>>>> the number of events grows.
>>>>
>>>> I am seeing this behavior in code from trunk as
>>>> of June 11 using Fernet tokens, but the token
>>>> backend does not seem to make a difference.
>>>>
>>>> Here's what happens to the db in terms of deadlock:
>>>> 2015-07-15 21:25:41.082 31800 TRACE
>>>> keystone.common.wsgi DBDeadlock:
>>>> (OperationalError) (1213, 'Deadlock found when
>>>> trying to get lock; try restarting
>>>> transaction') 'DELETE FROM revocation_event
>>>> WHERE revocation_event.revoked_at < %s'
>>>> (datetime.datetime(2015, 7, 15, 18, 55, 41,
>>>> 55186),)
>>>>
>>>> When this starts happening, I just go truncate
>>>> the table, but this is not ideal. If [1] is
>>>> really true, then the design is not great: it
>>>> sounds like keystone is doing a revocation-event
>>>> clean-up on every token validation call.
>>>> Reading and deleting/locking from my db cluster
>>>> is not something I want to do on every validate
>>>> call.
>>>>
>>>>
>>>> Unfortunately, that's *exactly* what keystone is
>>>> doing. Adam and I had a conversation about this
>>>> problem in Vancouver which directly resulted in
>>>> opening the bug referenced on the operator list:
>>>>
>>>> https://bugs.launchpad.net/keystone/+bug/1456797
>>>>
>>>> Neither of us remembered the actual implemented
>>>> behavior, which is what you've run into, and
>>>> which Deepti verified in the bug's comments.
>>>>
>>>>
>>>> So, can I turn off token revocation for now? I
>>>> didn't see an obvious no-op driver.
>>>>
>>>>
>>>> Not sure how, other than writing your own no-op
>>>> driver, or perhaps an extended driver that doesn't
>>>> try to clean the table on every read?
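
A no-op driver could be as small as something like this (a sketch
against the revoke driver interface as I remember it; the import path
and method names may differ in your release):

    from keystone.contrib.revoke.core import Driver

    class NoopRevoke(Driver):
        """A revocation driver that stores and returns nothing."""

        def list_events(self, last_fetch=None):
            # No events means no token ever matches a revocation.
            return []

        def revoke(self, event):
            # Drop the revocation on the floor.
            pass

You would then point the [revoke] driver option in keystone.conf at
it. Obviously this trades away revocation entirely, so it is only a
stopgap.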
>>>>
>>>> And in the long run, can this be fixed? I'd
>>>> rather do almost anything else, including
>>>> writing a cron job, than live with what happens now.
>>>>
>>>>
>>>> If anyone has a better solution than the current
>>>> one, that's also better than requiring a cron job
>>>> running something like keystone-manage
>>>> revocation_flush, I'd love to hear it.
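
For the record, that cron job could be as simple as an /etc/cron.d
entry along these lines (the interval is arbitrary, and
revocation_flush is the hypothetical command named above):

    # Flush expired revocation events every 30 minutes instead of on
    # every validation call.
    */30 * * * * keystone keystone-manage revocation_flush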
>>>>
>>>>
>>>> [1] -
>>>> http://lists.openstack.org/pipermail/openstack-operators/2015-June/007210.html
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev