[openstack-dev] [Keystone] Best way to do something MySQL-specific?

Jamie Lennox jlennox at redhat.com
Tue Jul 9 03:34:11 UTC 2013


On Mon, 2013-07-08 at 21:55 -0400, Adam Young wrote:
> 
> Tokens are, for the most part, immutable.  Once they are written, they
> don't change except if they get revoked.  This is a fairly rare
> occurrence, but it does happen.
> 
> Deleting tokens based on age should be fairly straight forward, and
> locks should not need to be held for a significant amount of time.
> 
> My guess, however, is that the problem is SQL Alchemy:
> 
>         # filter + bulk delete; synchronize_session=False skips the
>         # in-session bookkeeping and emits a single DELETE statement
>         query = session.query(TokenModel)
>         query = query.filter(TokenModel.expires < timeutils.utcnow())
>         query.delete(synchronize_session=False)
> 
> If it is doing a fetch and then the delete, then the rows would be
> held for a short period of time.
> 
> Direct SQL might be a better approach:  prepare a statement:
> 
> "delete from token where expires < $1"

SQLAlchemy already generates this statement.

> and then bind and execute in one command.
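The bind-and-execute idea above can be sketched with the stdlib sqlite3 module (the table name and timestamps here are illustrative, not Keystone's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE token (id TEXT PRIMARY KEY, expires TEXT)")
conn.executemany(
    "INSERT INTO token VALUES (?, ?)",
    [("a", "2013-07-08 02:00:00"),   # already expired
     ("b", "2013-07-10 02:00:00")],  # still valid
)

# Bind the cutoff and delete in one statement: the database scans and
# removes matching rows itself, with no client-side fetch/iterate step.
cur = conn.execute("DELETE FROM token WHERE expires < ?",
                   ("2013-07-09 03:34:11",))
conn.commit()
```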
> 
> However, it seems to me that the conflict detection is the problem.  I
> don't know if there is a way to state "ignore any future queries that
> would match these criteria."  It does seem to me that even doing this
> degree of conflict detection somewhat violates the principle of
> isolation.
> 
> There might be an approach using table partitioning as well, where you
> only write to partition one, and delete from partition 2, and then
> swap.
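The partition-swap idea can be sketched with stdlib sqlite3; since sqlite3 has no native partitioning, two plain tables stand in for the two partitions, and table names are illustrative rather than Keystone's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Emulate the two-partition rotation with two tables: new tokens land
# in token_active, token_old is dropped wholesale, then the roles swap.
conn.execute("CREATE TABLE token_active (id TEXT, expires TEXT)")
conn.execute("CREATE TABLE token_old (id TEXT, expires TEXT)")
conn.execute("INSERT INTO token_active VALUES ('fresh', '2013-07-10')")
conn.execute("INSERT INTO token_old VALUES ('stale', '2013-01-01')")

conn.executescript("""
    DROP TABLE token_old;                         -- bulk delete, no per-row locks
    ALTER TABLE token_active RENAME TO token_old; -- these rows age out next swap
    CREATE TABLE token_active (id TEXT, expires TEXT);
""")
```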
> 
> On 07/08/2013 09:13 PM, Robert Collins wrote:
> 
> > On 9 July 2013 12:32, Adam Young <ayoung at redhat.com> wrote:
> >  
> >                 * I am asking about MySQL.. presumably a "real"
> >                 database.
> >          I have to admit I am a bit of a PostgreSQL bigot. I don't
> >         really consider MySQL a real database, although it has
> >         improved a lot over the years.  I am not up to speed
> >         on "InnoDB's gap locking behavior", but it is not something
> >         I would expect to be a problem in PostgreSQL.
> > 
> > PostgreSQL has similar but different characteristics, particularly
> > the latest iteration of isolation behaviour, where locks are held on
> > *the result of a query*, not on 'specific rows returned' - the
> > difference being that adding a new row that matches the query for
> > rows to delete would encounter a conflict. You also need to delete small
> > numbers of rows at a time, though the reason in the plumbing is
> > different. There are some nasty interlocks you can cause with very
> > large deletes and autovacuum too - if you trigger deadlock detection
> > it still takes /minutes/ to detect and cleanup, whereas we want
> > sub-second liveness.
> >  
> >         once every second would be strange indeed.  I would think
> >         maybe once every five minutes or so.  Schedule your clean-up
> >         in accordance with your deployment and usage.
> > 
> > 5m intervals exacerbate the issue until it's solved. If the cleanup
> > deletes no more than (say) 1000 rows per iteration, it could run
> > every 5 minutes but, each time it runs, keep going until the db is
> > cleaned.
> >  
> >         Deleting a chunk of tokens in bulk would be preferable to
> >         doing client-side iteration; I can't see how that would not
> >         be the case.
> > 
> > Right, so I think Clint prefers that too; the question is how to get
> > SQLAlchemy to output the appropriate SQL for PostgreSQL and MySQL,
> > which is different.
> > 

I'm not experienced with large databases but I wrote the token_flush so
I'm interested. What am I missing that we can't just add a --limit
parameter to the command-line tool, so that "keystone-manage token_flush
--limit=1000" deletes (as was mentioned) 1000 rows per invocation?

SQLAlchemy will be able to generate a nested query like
"delete from token where id in (select id from token limit 1000)"
for all databases. What sort of hit is a nested query like that? Then
you tweak the limit and the frequency, but I would suggest that 1000
every 5 minutes should result in a net negative token count and
wouldn't come close to hitting the locks.
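The batched nested-query deletion can be sketched with stdlib sqlite3 (the schema, cutoff, and a subquery restricted to expired rows are illustrative assumptions; a real token_flush would run each batch as its own transaction against MySQL or PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE token (id INTEGER PRIMARY KEY, expires TEXT)")
conn.executemany("INSERT INTO token (expires) VALUES (?)",
                 [("2013-01-01",)] * 2500)       # 2500 expired tokens

# Each pass deletes at most 1000 expired rows; looping until rowcount
# hits zero clears the backlog while keeping each transaction short.
deleted = 0
while True:
    cur = conn.execute(
        "DELETE FROM token WHERE id IN "
        "(SELECT id FROM token WHERE expires < ? LIMIT 1000)",
        ("2013-07-09",))
    conn.commit()
    if cur.rowcount == 0:
        break
    deleted += cur.rowcount
```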

It will take time to clear the backlog, but this is supposed to be a
maintenance utility.


> > 
> > -Rob
> >  
> > 
> > -- 
> > Robert Collins <rbtcollins at hp.com>
> > Distinguished Technologist
> > HP Cloud Services 
> > 
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 





