[openstack-dev] [Keystone] Best way to do something MySQL-specific?

Clint Byrum clint at fewbar.com
Tue Jul 9 15:21:04 UTC 2013

Excerpts from Adam Young's message of 2013-07-09 06:19:19 -0700:
> On 07/08/2013 11:06 PM, Clint Byrum wrote:
> > Excerpts from Adam Young's message of 2013-07-08 17:32:30 -0700:
> >> On 07/08/2013 04:35 PM, Clint Byrum wrote:
> >>> Excerpts from Adam Young's message of 2013-07-08 13:18:55 -0700:
> >>>> On 07/01/2013 01:35 PM, Clint Byrum wrote:
> >>>>> The way the new keystone-manage command "token_flush" works right now
> >>>>> is quite broken by MySQL and InnoDB's gap locking behavior:
> >>>>>
> >>>>> https://bugs.launchpad.net/1188378
> >>>>>
> >>>>> Presumably other SQL databases like PostgreSQL will have similar problems
> >>>>> with doing massive deletes, but I am less familiar with them.
> >>>>>
> >>>>> I am trying to solve this in keystone, and my first attempt is here:
> >>>>>
> >>>>> https://review.openstack.org/#/c/32044/
> >>>>>
> >>>>> However, MySQL does not support using "LIMIT" in a sub-query that
> >>>>> is feeding an IN() clause, so that approach will not work. Likewise,
> >>>>> sqlalchemy does not support the MySQL specific extension to DELETE which
> >>>>> allows it to have a LIMIT clause.
> >>>>>
> >>>>> Now, I can do some hacky things, like just deleting all of the expired
> >>>>> tokens from the oldest single second, but that could also potentially
> >>>>> be millions of tokens, and thus, millions of gaps to lock.
> >>>>>
> >>>>> So, there is just not one way to work for all databases, and we have to
> >>>>> have a special mode for MySQL.
> >>>>>
> >>>>> I was wondering if anybody has suggestions and/or examples of how to do
> >>>>> that with sqlalchemy.
> >>>>>
> >>>>> _______________________________________________
> >>>>> OpenStack-dev mailing list
> >>>>> OpenStack-dev at lists.openstack.org
> >>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >>>> In general, if you have millions of roles, you need a real database.  I
> >>>> would not try to work through SQL Alchemy for this. Instead, you
> >>>> probably just want to make sure that the token_flush is run fairly
> >>>> regularly on your database.
> >>>>
> >>> I'm not sure I understand you.
> >>>
> >>> * I am asking about millions of tokens, not roles.
> >> Heh, I mean Rows, and somehow type roles.
> >>
> >>> * I am asking about MySQL.. presumably a "real" database.
> >>    I have to admit I am a bit of a Postgresql Bigot. I don't really
> >> consider MySQL a real database, althought it has improved a lot over the
> >> years.  I am not up to speed on"InnoDB's gap locking behavior" but it is
> >> not something I would expect to be a problem in Postgresql.
> >>
> > You may want to update your snark! MySQL has its warts which are pretty
> > easy to take shots at, but it has been a "real" ACID compliant database
> > for well over a decade.
> My snark is more recently generated than that.  There are plenty of 
> places where MySQL has fallen down.
> InnoDB support was not mandatory, and without it, MySQL was not really 
> ACID compliant.  Using InnoDB was troublesome enough that the RHEL 6 
> version of MySQL defaults to MyISAM.

Its not so much that it was troublesome to use InnoDB as it was that
people were used to MyISAM's few mis-features (fulltext and delayed
inserts mostly) and needed a long warning (run up to 5.5) that the
default was changing. Before 5.5 was released, Drizzle _completely
removed_ MyISAM because it causes way more problems than it solves.

> >
> > Please have a read at how InnoDB's isolation level handling works
> > [1]. You can compare it to Postgres [2]. InnoDB's default isolation
> > level, repeatable read, has a locking behavior that is particularly
> > ugly for indexes like 'token.valid'. Put simply, you can't add more
> > valid tokens until you're done deleting any valid expired tokens (if
> > you mention valid in the delete, which flush_tokens does not).
> >
> > This is because token.valid == 1 may end up locked during the entire
> > DELETE. The same could be true for expires too, even the primary key,
> > though that is typically not locked with ranges.
> >
> > Yes, Postgres's behaviors are much more desirable here, as inserts almost
> > go unmentioned in the postgres description of transaction isolation. But
> > there are well documented ways of working around this weirdness. I had
> > forgotten that one way is to use READ UNCOMMITTED. That should avoid
> > any of the large range locks and work with MySQL, Postgres, and Sqlite.
> I think the real solution is for us to stop writing tokens to the 
> Database.  If we are using PKI tokens, there is no need for a database 
> entry for valid tokens, only for revoked tokens.
> That makes this whole thing into a non-problem.

Yes please! Is this something that is in development, or at least recorded
in a bug that someone like me can grab? I would love for that to be the
default behavior.

Of course, that would not solve the issue for those who want to audit
tokens (is this a large contingent of users?). In fact the flush tokens
command would be bad for those users as well, whereas pt-archiver's
nibbling approach that copies rows out of the table into another database
would work quite nicely.

More information about the OpenStack-dev mailing list