[openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE

Joshua Harlow harlowja at yahoo-inc.com
Mon May 19 22:42:34 UTC 2014


Please let's not disregard zookeeper (ZAB-based), etcd, or other raft[1]-based projects.

I'd also rather not bet on concoord (which seems to be maintained as a university project by two people at Cornell).

Even though zookeeper is Java, the constructs provided by kazoo[3] should be directly usable (tooz[2] provides nice helpers for these, and imho we should consider improving tooz if it doesn't fit).

In my opinion we should focus on maturing tooz and giving it provably correct drivers for lock management (integrating zookeeper, etcd, or other implementations at that level), and let deployers decide what they are comfortable using (I'd be OK with zookeeper, as it's already well used here at Yahoo). A minimal kazoo lock sketch follows the links below.

[1] http://raftconsensus.github.io/
[2] https://github.com/stackforge/tooz
[3] https://github.com/python-zk/kazoo/tree/master/kazoo/recipe
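
For illustration, here is a minimal sketch of the kazoo Lock recipe mentioned above; the connection string, lock path, and identifier are placeholders, not anything from an existing deployment:

    from kazoo.client import KazooClient

    # Connect to the ZooKeeper ensemble (hosts are illustrative).
    client = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181')
    client.start()

    # The Lock recipe serializes the critical section across hosts,
    # not just across processes on a single machine.
    lock = client.Lock('/openstack/locks/example', identifier='worker-1')
    with lock:
        pass  # do the work that must not run concurrently

    client.stop()

tooz wraps the same primitive behind a driver-neutral API, which is the level at which deployers could pick their own backend.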

From: Salvatore Orlando <sorlando at nicira.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Monday, May 19, 2014 at 3:26 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE

Some comments inline.

Salvatore


On 19 May 2014 20:32, sridhar basam <sridhar.basam at gmail.com> wrote:



On Mon, May 19, 2014 at 1:30 PM, Jay Pipes <jaypipes at gmail.com> wrote:
Stackers,

On Friday in Atlanta, I had the pleasure of moderating the database session at the Ops Meetup track. We had lots of good discussions and heard important feedback from operators on DB topics.

For the record, I would not bring this point up so publicly unless I believed it was a serious problem affecting a large segment of users. When doing an informal survey of the users/operators in the room at the start of the session, out of approximately 200 people in the room, only a single person was using PostgreSQL, about a dozen were using standard MySQL master/slave replication, and the rest were using MySQL Galera clustering. So, this is a real issue for a large segment of the operators -- or at least the ones at the session. :)


We are one of the operators that use Galera to replicate our MySQL databases. We used to see deadlock issues when we had multiple MySQL writers in our cluster. As a workaround, we run our haproxy configuration for the MySQL VIP in active-standby mode.

I seem to recall that a lot of the deadlocks happened through Neutron. When we go through our Icehouse testing, we will redo our multi-master MySQL setup and provide feedback on the issues we see.

The SELECT ... FOR UPDATE issue is going to be a non-trivial one for Neutron as well; some components, like IPAM, rely on it heavily.
However, Neutron is a lot more susceptible to deadlock problems than Nova because it does not currently implement a retry mechanism.
This is something that should be added during the Juno release cycle, regardless of the other enhancements currently being planned, such as task-oriented operations. A sketch of what such a retry wrapper could look like follows.
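
For concreteness, a minimal sketch of a deadlock-retry wrapper; DeadlockError here is a stand-in for whatever exception the DB layer actually raises for MySQL error 1213:

    import functools
    import random
    import time

    class DeadlockError(Exception):
        pass  # stand-in for the driver's deadlock exception

    def retry_on_deadlock(max_retries=5, base_delay=0.1):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for attempt in range(max_retries):
                    try:
                        return func(*args, **kwargs)
                    except DeadlockError:
                        if attempt == max_retries - 1:
                            raise
                        # Back off with jitter so concurrent writers do
                        # not retry in lockstep and deadlock again.
                        time.sleep(base_delay * 2 ** attempt * random.random())
            return wrapper
        return decorator

The whole transaction has to be re-run from the top, which is why the wrapper belongs around the transactional function rather than around individual statements.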

thanks,
         Sridhar


Peter Boros, from Percona, was able to provide some insight on MySQL Galera topics, and one issue came up that is likely the cause of a lot of heartache for operators who use MySQL Galera (or Percona XtraDB Cluster).

We were discussing whether people had seen deadlock issues [1] when using MySQL Galera in their deployment, and were brainstorming on why deadlocks might be seen. I had suggested that perhaps Nova's use of autoincrementing primary keys may have been the cause. Peter pretty quickly dispatched that notion, saying that Galera handles autoincrementing keys automatically via the managed auto_increment_increment and auto_increment_offset config options.
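
A quick illustration of how those two options keep per-node key ranges disjoint; the three-node cluster and the counts are made up for the example:

    # With auto_increment_increment=3, each node strides by the cluster
    # size; auto_increment_offset staggers the starting points, so the
    # nodes can never hand out the same key.
    def node_keys(offset, increment=3, count=4):
        return [offset + i * increment for i in range(count)]

    for offset in (1, 2, 3):
        print('node %d: %s' % (offset, node_keys(offset)))
    # node 1: [1, 4, 7, 10]
    # node 2: [2, 5, 8, 11]
    # node 3: [3, 6, 9, 12]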

I think at that point I mentioned that there were a number of places that were using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's the with_lockmode('update') modification of the query object). Peter promptly said that was a problem: MySQL Galera does not support SELECT ... FOR UPDATE, since it has no concept of cross-node record locking, and so the results are non-deterministic.
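
To make the pattern concrete, here is a stripped-down sketch of what such code looks like; the model, table, and connection string are illustrative, not lifted from Nova:

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()

    class QuotaUsage(Base):
        __tablename__ = 'quota_usages'
        id = Column(Integer, primary_key=True)
        resource = Column(String(255))
        in_use = Column(Integer)

    engine = create_engine('mysql://nova:secret@dbhost/nova')
    session = sessionmaker(bind=engine)()

    # Emits: SELECT ... FROM quota_usages WHERE resource = %s FOR UPDATE
    # On a single InnoDB node this blocks concurrent writers until commit;
    # on Galera the row lock is local to the node that ran the statement,
    # so two nodes can both "hold" it, and one transaction gets aborted
    # with a deadlock error at certification time.
    usage = (session.query(QuotaUsage).
             filter_by(resource='instances').
             with_lockmode('update').
             one())
    usage.in_use += 1
    session.commit()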

So... what to do?

For starters, some information on the use of with_lockmode() in Nova and Neutron...

Within Nova, there are actually only a few places where with_lockmode('update') is used. Unfortunately, those uses are in the quota code, which tends to wrap largish blocks of the Nova compute execution code.

Within Neutron, however, the use of with_lockmode('update') is all over the place. There are 44 separate uses of it in 11 different files.


I will report on this in a separate thread, so that we can have an assessment of where locking statements are used and why.

We have a number of options:

I think option 0 should be to rework/redesign the code, where possible, to avoid DB-level locking altogether.


1) Stop using MySQL Galera for databases of projects that contain with_lockmode('update')

This looks hideous, but I am afraid it is what anyone wishing to deploy Icehouse should consider doing.


2) Put a big old warning in the docs somewhere about the problem of potential deadlocks or odd behaviour with Galera in these projects

3) For Nova and Neutron, remove the use of with_lockmode('update') and instead use a coarse-grained file lock or a distributed lock manager for those areas where we need deterministic reads or quiescence (a minimal file-lock sketch follows the reply below).

We had an attempt at implementing a sort of distributed lock for neutron: https://review.openstack.org/#/c/34695/
Beyond the implementation reservations on that patch, one thing worth noting, probably needless to say, is that distributed coordination should never be taken lightly.
Only once all the non-locking solutions have been ruled out should distributed coordination among processes be considered. In that case I think it might be better to use some off-the-shelf software rather than working out a home-grown solution (I certainly do not see room for a new project here).
On a side note, I'm rather ignorant about Python frameworks for distributed coordination... concoord? Is zookeeper something that should be ruled out because of language restrictions?
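
On the coarse-grained file lock idea from option 3, a minimal sketch, assuming all contending workers run on the same host (cross-host serialization would still need a DLM; the lock path is illustrative):

    import fcntl
    from contextlib import contextmanager

    @contextmanager
    def file_lock(path='/var/lock/quota.lock'):
        with open(path, 'w') as f:
            fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is free
            try:
                yield
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)

    # with file_lock():
    #     ...  # read-modify-write the rows without SELECT ... FOR UPDATE

The obvious cost is that it serializes every worker on the host behind a single lock, which is why it only makes sense for short critical sections.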



4) For the Nova db quota driver, refactor the driver to either use a non-locking method for reservation and quota queries or move the driver out into its own project (or use something like Climate and make sure that Climate uses a non-blocking algorithm for those queries...)
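
As an illustration of the non-locking approach, here is a sketch of a compare-and-swap style guarded UPDATE; the table and column names are made up. The WHERE clause re-checks the quota invariant, so if a concurrent writer got there first, zero rows match and the caller retries or reports over-quota instead of ever holding a row lock across a round trip:

    from sqlalchemy import text

    def reserve(session, resource, amount, limit):
        # The quota check and the increment happen in a single atomic
        # UPDATE; there is no SELECT ... FOR UPDATE anywhere.
        result = session.execute(
            text("UPDATE quota_usages "
                 "SET in_use = in_use + :amount "
                 "WHERE resource = :resource "
                 "AND in_use + :amount <= :limit"),
            {'amount': amount, 'resource': resource, 'limit': limit})
        session.commit()
        return result.rowcount == 1  # False -> lost the race or over quota

On Galera a cross-node conflict on such an UPDATE still surfaces as a deadlock error at commit, but unlike the FOR UPDATE case it is always safe to simply retry it.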

Thoughts?

-jay

[1] http://lists.openstack.org/pipermail/openstack/2014-May/007202.html

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

