[openstack-dev] [keystone]PKI token VS Fernet token
lbragstad at gmail.com
Sat Feb 25 19:07:58 UTC 2017
On Sat, Feb 25, 2017 at 12:47 AM, Clint Byrum <clint at fewbar.com> wrote:
> Excerpts from joehuang's message of 2017-02-25 04:09:45 +0000:
> > Hello, Matt,
> > Thank you for your reply. As you mentioned, async replication should
> > work for slowly changing data. My concern is the impact of replication
> > delay, for example (though the chance of it happening is quite low):
> > 1) A new user/group/role is added in RegionOne. Before it is replicated
> > to RegionTwo, the new user begins to access a RegionTwo service. Because
> > the data has not arrived yet, the user's request to RegionTwo may be
> > rejected because token validation fails in the local Keystone.
> I think this is entirely acceptable. You can even check with your
> monitoring system to find out what the current replication lag is to
> each region, and notify the user of how long it may take.
> > 2) Token revocation. If we remove the user's role in RegionOne, the
> > token becomes invalid in RegionOne immediately, but until the removal
> > is replicated to RegionTwo, the user can still use the token to access
> > services in RegionTwo, although only for a very short time.
> > Can someone evaluate whether that security risk is acceptable?
> The simple answer is that the window between a revocation event being
> created, and being ubiquitous, is whatever the maximum replication lag
> is between regions. So if you usually have 5 seconds of replication lag,
> it will be 5 seconds. If you have a really write-heavy day, and you
> suddenly have 5 minutes of replication lag, it will be 5 minutes.
> The complicated component is that in async replication, reducing
> replication lag is expensive. You don't have many options here. Reducing
> writes on the master is one of them, but that isn't easy! Another is
> filtering out tables on slaves so that you only replicate the tables
> that you will be reading. But if there are lots of replication events,
> that doesn't help.
This is a good point and something that was much more prevalent with UUID
tokens. We still write *all* the data from a UUID token to the database,
which includes the user, project, scope, possibly the service catalog,
etc... When validating a UUID token, it would be pulled from the database
and returned to the user. The information in the UUID token wasn't
confirmed at validation time. For example, if you authenticated for a UUID
token scoped to a project with the `admin` role, the role and project
information persisted in the database would reflect that. If your `admin`
role assignment was removed from the project and you validated the token,
the token reference in the database would still contain `admin` scope on
the project. At the time, the approach to fixing this was to create a
revocation event that would match specific attributes of that token (i.e.
the `admin` role on that specific project). As a result, the token
validation process would pull the token from the backend, then pass it to
the revocation API and ask if the token was revoked based on any
pre-existing revocation events.
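
As a rough illustration of that matching step, here is a minimal Python
sketch. The `RevocationEvent` class, its fields, and `is_revoked` are
hypothetical names for this example, not keystone's actual revocation API.
The only idea it shows is that every attribute set on an event must match
the stored token, and the token must have been issued before the event.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RevocationEvent:
    # Hypothetical event shape: a field left as None matches any value.
    issued_before: datetime
    user_id: Optional[str] = None
    project_id: Optional[str] = None
    role_id: Optional[str] = None


def is_revoked(token: dict, events: list) -> bool:
    """Return True if any event matches the stored token reference."""
    for event in events:
        if token["issued_at"] >= event.issued_before:
            continue  # token was issued after the event was created
        if event.user_id is not None and event.user_id != token["user_id"]:
            continue
        if (event.project_id is not None
                and event.project_id != token["project_id"]):
            continue
        if event.role_id is not None and event.role_id not in token["role_ids"]:
            continue
        return True
    return False


# A stored UUID token reference that still carries the `admin` role.
stored_token = {
    "user_id": "u1",
    "project_id": "p1",
    "role_ids": ["admin"],
    "issued_at": datetime(2017, 2, 25, 12, 0, tzinfo=timezone.utc),
}
# Event created when the `admin` assignment on p1 was removed.
events = [
    RevocationEvent(
        issued_before=datetime(2017, 2, 25, 13, 0, tzinfo=timezone.utc),
        project_id="p1",
        role_id="admin",
    )
]
print(is_revoked(stored_token, events))
```

Every role removal adds another event to that list, and every validation
has to compare the token against all of them, which is why the number of
stored events matters for both validation time and replication traffic.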
The fernet approach to solving this was fundamentally different because we
didn't have a token reference to pull from the backend that represented the
authorization context at authentication time (which we did have with UUID).
Instead, what we can do at validation time is decrypt the token and ask the
assignment API for role assignments given a user and project, and raise
a 401 if that user has no roles on the project. So, by rebuilding the
authorization context at validation time, we no longer need to rely on
revocation events to enforce role revocation (but we do need them to
enforce revocation for other things with fernet). The tradeoff is that
performance degrades if you're using fernet without caching because we have
to rebuild all of that information, instead of just returning a reference
from the database. This led to us making significant improvements to our
caching implementation in keystone so that we can improve token validation
time overall, especially for fernet. As of the last release, UUID tokens are
validated exactly the same way as fernet tokens. Our team also made some
improvements to listing and comparing token references in the revocation API
(thanks to Richard, Clint, and Ron for driving a lot of that work!).
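
To make the difference concrete, here is a minimal Python sketch of
validation that rebuilds the authorization context instead of reading a
stored token reference. Decrypting the token is left out; `payload` stands
for the already-decrypted contents. The in-memory `ASSIGNMENTS` table,
`list_roles`, `Unauthorized`, and `validate_token` are assumptions made for
this example and do not reflect keystone's actual payload format or
assignment API.

```python
class Unauthorized(Exception):
    """Stands in for an HTTP 401 response."""


# Hypothetical assignment backend: (user_id, project_id) -> set of role names.
ASSIGNMENTS = {("u1", "p1"): {"admin"}}


def list_roles(user_id, project_id):
    """Look up the user's current roles on the project."""
    return set(ASSIGNMENTS.get((user_id, project_id), set()))


def validate_token(payload):
    """Rebuild the authorization context from current role assignments.

    `payload` is the already-decrypted token content; it carries only
    identifiers, not roles.
    """
    roles = list_roles(payload["user_id"], payload["project_id"])
    if not roles:
        raise Unauthorized("user has no roles on the project")
    return {
        "user_id": payload["user_id"],
        "project_id": payload["project_id"],
        "roles": sorted(roles),
    }


payload = {"user_id": "u1", "project_id": "p1"}
print(validate_token(payload)["roles"])

# Removing the assignment takes effect on the next validation,
# with no revocation event involved.
del ASSIGNMENTS[("u1", "p1")]
try:
    validate_token(payload)
except Unauthorized as exc:
    print("401:", exc)
```

Note that every call goes back to the assignment backend, which is the
cost that caching is meant to absorb.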
Since both token formats rebuild the authorization context at validation
time, we can remove some revocation events that are no longer needed. This
means we won't be storing as many revocation events on role removal from
domains and projects. Instead we will only rely on the revocation API to
invalidate tokens for cases like specific token revocation or password
changes (the new design of validation does role assignment enforcement for
us automatically). This should reduce the amount of data being replicated
due to massive amounts of revocation events.
We do still have some more work to do on this front, but I can dig into it
and see what's left.
> One decent option is to switch to semi-sync replication:
> That will at least make sure your writes aren't acknowledged until the
> binlogs have been transferred everywhere. But if your master can take
> writes a lot faster than your slaves, you may never catch up applying, no
> matter how fast the binlogs are transferred.
> The key is to evaluate your requirements and think through these
> solutions. Good luck! :)