[openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE

Peter Boros peter.boros at percona.com
Tue May 20 13:51:15 UTC 2014


Hi,

Also it would be nice to confirm that really SELECT FOR UPDATES are
causing the deadlocks. Since these are row lock waits in a single node
case, with a slow log from a single node, pt-query-digest can help to
determine this.

pt-query-digest /path/to/slow.log --order-by InnoDB_rec_lock_wait:sum
> digest-rec_lock_waits.txt

It will show what statements waited for locks most, these will most
likely be the ones causing the deadlock issues in case of multi-node
writing.

On Tue, May 20, 2014 at 2:27 PM, Peter Boros <peter.boros at percona.com> wrote:
> Hi,
>
> I would like to shed some additional light on this for those who were
> not there. So, SELECT ... FOR UPDATE does lock on a single node, as it
> is pointed out earlier in this thread, a simple solution is to write
> only one node at a time. Haproxy can be set up with both backends, see
> this blog post for example.
> http://www.mysqlperformanceblog.com/2012/06/20/percona-xtradb-cluster-reference-architecture-with-haproxy/
>
> In a nutshell, and with a bit of an oversimplification, galera
> replicates in write sets. A write set is practically a row based
> binary log event + some metadata which is good for 2 things: you can
> take a look at 2 write sets and tell if they are conflicting or not,
> and you can take a look at a writeset and a database, and tell if the
> write set is applicable to the database. At the time of commit, the
> transaction is transferred to all the other cluster nodes in parallel.
> On the remote node, the new transaction is compared to each other
> transaction waiting in the queue to be applied, and it's checked if
> it's applicable to the database. If the transaction if not
> conflicting, and it's applicable, it's queued, and the node signals
> back that the commit can proceed. There is a nice drawing about this
> here:
>
> http://www.percona.com/doc/percona-xtradb-cluster/5.6/features/multimaster-replication.html
>
> So, because of this, the locks of SELECT FOR UPDATE won't replicate.
> Between nodes, galera uses optimistic locking. This means that we
> assume that during the certification process (described above), there
> will be no conflicts. If there are conflicts, the transaction is
> rolled back on the originating node, and this is when you receive the
> error message in question. A failed transaction is something which can
> happen any time with any database engine with any "interesting
> feature", and when a transaction failed, the application should now
> what to do with it. In case of galera, a conflict in a single node
> case was a wait on row locks, in case of galera replication, this will
> be a rollback. A rollback is a much more expensive operation (data has
> to be copied back from undo), so if there are lots of failures like
> this, performance will suffer.
> So, this is not a deadlock in the classical sense. Yet, InnoDB can
> roll back a transaction any time because of a deadlock (any database
> engine can do that, including PostgreSQL), and the application should
> be able to handle this.
>
> As it was noted earlier, writing to a single node only at a time is a
> good solution for avoiding this. With multiple nodes written, storage
> engine level writes will still happen on every node, because every
> node has the whole data set. Writing on multiple nodes can be
> beneficial because parsing SQL is much more expensive than just
> applying a row based binary log event, so you can see some performance
> improvement if all nodes are written.
>
> I would discourage using any type of multi-master replication without
> understanding how conflict resolution works in case of the chosen
> solution. In case of galera, if row locks were replicated over the
> network, it would act the same way as a single server, but it would be
> really slow. If SELECT FOR UPDATE is only used to achieve consistent
> reads (read your own writes), that can be achieved with
> wsrep_causal_reads. I am happy to help to avoid SELECT FOR UPDATE if
> somebody can tell me the use cases.
>
> On Tue, May 20, 2014 at 10:53 AM, Julien Danjou <julien at danjou.info> wrote:
>> On Mon, May 19 2014, Jay Pipes wrote:
>>
>>> I think at that point I mentioned that there were a number of places that
>>> were using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's
>>> the with_lockmode('update') modification of the query object). Peter
>>> promptly said that was a problem. MySQL Galera does not support SELECT ...
>>> FOR UPDATE, since it has no concept of cross-node locking of records and
>>> results are non-deterministic.
>>
>> So you send a command that's not supported and the whole software
>> deadlocks? Is there a bug number about that or something? I cannot
>> understand how this can be possible and considered as something normal
>> (that's the feeling I have reading your mail, I may be wrong).
>>
>>> We have a number of options:
>>>
>>> 1) Stop using MySQL Galera for databases of projects that contain
>>> with_lockmode('update')
>>>
>>> 2) Put a big old warning in the docs somewhere about the problem of
>>> potential deadlocks or odd behaviour with Galera in these projects
>>>
>>> 3) For Nova and Neutron, remove the use of with_lockmode('update') and
>>> instead use a coarse-grained file lock or a distributed lock manager for
>>> those areas where we need deterministic reads or quiescence.
>>>
>>> 4) For the Nova db quota driver, refactor the driver to either use a
>>> non-locking method for reservation and quota queries or move the driver out
>>> into its own projects (or use something like Climate and make sure that
>>> Climate uses a non-blocking algorithm for those queries...)
>>>
>>> Thoughts?
>>
>> 5) Stop leveling down our development, and rely and leverage a powerful
>> RDBMS that provides interesting feature, such as PostgreSQL.
>>
>> Sorry, had to say it, but it's pissing me off to see the low quality of
>> the work that is done around SQL in OpenStack.
>>
>> --
>> Julien Danjou
>> /* Free Software hacker
>>    http://julien.danjou.info */
>
>
>
> --
> Peter Boros, Consultant, Percona
> Telephone: +1 888 401 3401 ext 546
> Emergency: +1 888 401 3401 ext 911
> Skype: percona.pboros



-- 
Peter Boros, Consultant, Percona
Telephone: +1 888 401 3401 ext 546
Emergency: +1 888 401 3401 ext 911
Skype: percona.pboros



More information about the OpenStack-dev mailing list