[openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE

Peter Boros peter.boros at percona.com
Tue May 20 12:27:26 UTC 2014


Hi,

I would like to shed some additional light on this for those who were
not there. So, SELECT ... FOR UPDATE does lock on a single node, as it
is pointed out earlier in this thread, a simple solution is to write
only one node at a time. Haproxy can be set up with both backends, see
this blog post for example.
http://www.mysqlperformanceblog.com/2012/06/20/percona-xtradb-cluster-reference-architecture-with-haproxy/

In a nutshell, and with a bit of an oversimplification, galera
replicates in write sets. A write set is practically a row based
binary log event + some metadata which is good for 2 things: you can
take a look at 2 write sets and tell if they are conflicting or not,
and you can take a look at a writeset and a database, and tell if the
write set is applicable to the database. At the time of commit, the
transaction is transferred to all the other cluster nodes in parallel.
On the remote node, the new transaction is compared to each other
transaction waiting in the queue to be applied, and it's checked if
it's applicable to the database. If the transaction if not
conflicting, and it's applicable, it's queued, and the node signals
back that the commit can proceed. There is a nice drawing about this
here:

http://www.percona.com/doc/percona-xtradb-cluster/5.6/features/multimaster-replication.html

So, because of this, the locks of SELECT FOR UPDATE won't replicate.
Between nodes, galera uses optimistic locking. This means that we
assume that during the certification process (described above), there
will be no conflicts. If there are conflicts, the transaction is
rolled back on the originating node, and this is when you receive the
error message in question. A failed transaction is something which can
happen any time with any database engine with any "interesting
feature", and when a transaction failed, the application should now
what to do with it. In case of galera, a conflict in a single node
case was a wait on row locks, in case of galera replication, this will
be a rollback. A rollback is a much more expensive operation (data has
to be copied back from undo), so if there are lots of failures like
this, performance will suffer.
So, this is not a deadlock in the classical sense. Yet, InnoDB can
roll back a transaction any time because of a deadlock (any database
engine can do that, including PostgreSQL), and the application should
be able to handle this.

As it was noted earlier, writing to a single node only at a time is a
good solution for avoiding this. With multiple nodes written, storage
engine level writes will still happen on every node, because every
node has the whole data set. Writing on multiple nodes can be
beneficial because parsing SQL is much more expensive than just
applying a row based binary log event, so you can see some performance
improvement if all nodes are written.

I would discourage using any type of multi-master replication without
understanding how conflict resolution works in case of the chosen
solution. In case of galera, if row locks were replicated over the
network, it would act the same way as a single server, but it would be
really slow. If SELECT FOR UPDATE is only used to achieve consistent
reads (read your own writes), that can be achieved with
wsrep_causal_reads. I am happy to help to avoid SELECT FOR UPDATE if
somebody can tell me the use cases.

On Tue, May 20, 2014 at 10:53 AM, Julien Danjou <julien at danjou.info> wrote:
> On Mon, May 19 2014, Jay Pipes wrote:
>
>> I think at that point I mentioned that there were a number of places that
>> were using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's
>> the with_lockmode('update') modification of the query object). Peter
>> promptly said that was a problem. MySQL Galera does not support SELECT ...
>> FOR UPDATE, since it has no concept of cross-node locking of records and
>> results are non-deterministic.
>
> So you send a command that's not supported and the whole software
> deadlocks? Is there a bug number about that or something? I cannot
> understand how this can be possible and considered as something normal
> (that's the feeling I have reading your mail, I may be wrong).
>
>> We have a number of options:
>>
>> 1) Stop using MySQL Galera for databases of projects that contain
>> with_lockmode('update')
>>
>> 2) Put a big old warning in the docs somewhere about the problem of
>> potential deadlocks or odd behaviour with Galera in these projects
>>
>> 3) For Nova and Neutron, remove the use of with_lockmode('update') and
>> instead use a coarse-grained file lock or a distributed lock manager for
>> those areas where we need deterministic reads or quiescence.
>>
>> 4) For the Nova db quota driver, refactor the driver to either use a
>> non-locking method for reservation and quota queries or move the driver out
>> into its own projects (or use something like Climate and make sure that
>> Climate uses a non-blocking algorithm for those queries...)
>>
>> Thoughts?
>
> 5) Stop leveling down our development, and rely and leverage a powerful
> RDBMS that provides interesting feature, such as PostgreSQL.
>
> Sorry, had to say it, but it's pissing me off to see the low quality of
> the work that is done around SQL in OpenStack.
>
> --
> Julien Danjou
> /* Free Software hacker
>    http://julien.danjou.info */



-- 
Peter Boros, Consultant, Percona
Telephone: +1 888 401 3401 ext 546
Emergency: +1 888 401 3401 ext 911
Skype: percona.pboros



More information about the OpenStack-dev mailing list