[openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE

Joe Gordon joe.gordon0 at gmail.com
Tue May 20 03:14:56 UTC 2014


On Mon, May 19, 2014 at 10:30 AM, Jay Pipes <jaypipes at gmail.com> wrote:

> Stackers,
>
> On Friday in Atlanta, I had the pleasure of moderating the database
> session at the Ops Meetup track. We had lots of good discussions and heard
> important feedback from operators on DB topics.
>
> For the record, I would not bring this point up so publicly unless I
> believed it was a serious problem affecting a large segment of users. When
> doing an informal survey of the users/operators in the room at the start of
> the session, out of approximately 200 people in the room, only a single
> person was using PostgreSQL, about a dozen were using standard MySQL
> master/slave replication, and the rest were using MySQL Galera clustering.
> So, this is a real issue for a large segment of the operators -- or at
> least the ones at the session. :)
>
> Peter Boros, from Percona, was able to provide some insight on MySQL
> Galera topics, and one issue came up that is likely the cause of a lot of
> heartache for operators who use MySQL Galera (or Percona XtraDB Cluster).
>
> We were discussing whether people had seen deadlock issues [1] when using
> MySQL Galera in their deployment, and were brainstorming on why deadlocks
> might be seen. I had suggested that perhaps Nova's use of autoincrementing
> primary keys may have been the cause. Peter pretty quickly dispatched that
> notion, saying that Galera automatically handles autoincrementing keys
> using managed innodb_autoincrement_increment and
> innodb_autoincrement_offset config options.
>
> I think at that point I mentioned that there were a number of places that
> were using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's
> the with_lockmode('update') modification of the query object). Peter
> promptly said that was a problem. MySQL Galera does not support SELECT ...
> FOR UPDATE, since it has no concept of cross-node locking of records and
> results are non-deterministic.
>
> So... what to do?
>
> For starters, some information on the use of with_lockmode() in Nova and
> Neutron...
>
> Within Nova, there are actually only a few places where
> with_lockmode('update') is used. Unfortunately, the use of
> with_lockmode('update') is in the quota code, which tends to wrap largish
> blocks of code within the Nova compute execution code.
>
> Within Neutron, however, the use of with_lockmode('update') is all over
> the place. There are 44 separate uses of it in 11 different files.
>
> We have a number of options:
>
> 1) Stop using MySQL Galera for databases of projects that contain
> with_lockmode('update')
>
> 2) Put a big old warning in the docs somewhere about the problem of
> potential deadlocks or odd behaviour with Galera in these projects
>

I think we should do this for Icehouse regardless of what other paths we
take.


>
> 3) For Nova and Neutron, remove the use of with_lockmode('update') and
> instead use a coarse-grained file lock or a distributed lock manager for
> those areas where we need deterministic reads or quiescence.
>
> 4) For the Nova db quota driver, refactor the driver to either use a
> non-locking method for reservation and quota queries or move the driver out
> into its own projects (or use something like Climate and make sure that
> Climate uses a non-blocking algorithm for those queries...)
>
> Thoughts?
>
> -jay
>
> [1] http://lists.openstack.org/pipermail/openstack/2014-May/007202.html
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20140519/f8984975/attachment.html>


More information about the OpenStack-dev mailing list