[openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE

Robert Collins robertc at robertcollins.net
Thu May 29 19:12:24 UTC 2014


I just bent Jay's ear on IRC about this for a bit...

On 21 May 2014 05:07, Jay Pipes <jaypipes at gmail.com> wrote:

>> We are one of those operators that use Galera for replicating our mysql
>> databases. We used to see issues with deadlocks when having multiple
>> mysql writers in our mysql cluster. As a workaround we have our haproxy
>> configuration in an active-standby configuration for our mysql VIP.
>>
>> I seem to recall we had a lot of the deadlocks happen through Neutron.
>> When we go through our Icehouse testing, we will redo our multimaster
>> mysql setup and provide feedback on the issues we see.
>
>
> Thanks very much, Sridar, much appreciated.
>
> This issue was raised at the Neutron IRC meeting yesterday, and we've agreed
> to take a staged approach. We will first work on documentation to add to the
> operations guide that explains the issues (and the tradeoffs of going to a
> single-writer cluster configuration vs. just having the clients retry some
> request). Later stages will work on a non-locking quota-management
> algorithm, possibly in conjunction with Climate, and looking into how to use
> coarser-grained file locks or a distributed lock manager for handling
> cross-component deterministic reads in Neutron.

So - correct me if I've (still :)) got it wrong, but there are two
orthogonal issues here:

a) conflicts in SQL are a normal fact of life - even with SELECT FOR
UPDATE on a single-node MySQL deployment. There is a standard
signalling mechanism for them, and the Galera behaviour here is
in-spec. It differs from the single-node situation in only two ways:
 1) It *always* happens when one client COMMITs rather than sometimes
happening on the SELECT FOR UPDATE and sometimes on COMMIT
 2) It happens to all clients implicated in the data being replicated,
rather than just the unlucky schmuck who came along second
It is worth calling out that the DB itself remains atomic and
consistent - there is no data integrity issue at the RDBMS layer.

b) SELECT FOR UPDATE makes us see more conflicts, but see (a) -
conflicts are a normal part of using a SQL storage layer.

So while I'm keen to see us reduce the frequency with which we trigger
replication conflicts in Galera, I'd like to see the staged approach
be:

A) Documentation
B) Fix / add retry support pervasively through both Neutron and
OpenStack as a whole. It's baseline sanity for SQL usage.
C) Implement more sophisticated schemas/update logic to
reduce/eliminate SELECT FOR UPDATE.
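
As a rough illustration of what (C) could look like, here is a minimal
sketch of a compare-and-swap style UPDATE that avoids SELECT FOR UPDATE.
The table name (quota_usage), its columns, and the function name are all
hypothetical, and sqlite3 is used only so the sketch is self-contained;
it is not taken from Neutron or Nova.

```python
import sqlite3


def compare_and_swap_usage(conn, project_id, expected, new):
    """Set in_use to `new` only if it still equals `expected`.

    Returns True if the row was updated, False if another writer changed
    it first (the caller should then re-read and try again).
    Table and column names are hypothetical.
    """
    cur = conn.execute(
        "UPDATE quota_usage SET in_use = ? "
        "WHERE project_id = ? AND in_use = ?",
        (new, project_id, expected),
    )
    conn.commit()
    # rowcount is 0 when the WHERE clause no longer matches, i.e. the
    # value we read earlier is stale.
    return cur.rowcount == 1
```

No row lock is held between the read and the write: a stale read simply
matches zero rows, and the caller re-reads and tries again.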

C seems like substantially more review and design work than B, and
while B isn't 'easy', it's *still necessary to be correct*.
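
To make (B) concrete, a minimal sketch of retry support might look like
the decorator below. RetryableDBError is a hypothetical stand-in for
whatever the DB layer raises on a deadlock or replication conflict
(MySQL reports deadlocks as error 1213, SQLSTATE 40001), and the
decorator name and parameters are assumptions for illustration, not an
existing OpenStack API.

```python
import functools
import time


class RetryableDBError(Exception):
    """Hypothetical stand-in for a driver's deadlock/conflict error."""


def retry_on_conflict(max_attempts=5, base_delay=0.0):
    """Re-run the wrapped function when it raises RetryableDBError.

    The wrapped function must contain the *whole* transaction, so that
    each attempt starts again from a fresh read.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableDBError:
                    if attempt == max_attempts:
                        # Out of attempts: let the caller see the error.
                        raise
                    # Linear backoff before the next attempt.
                    time.sleep(base_delay * attempt)
        return wrapper
    return decorator
```

The important design point is that the retry wraps the entire unit of
work, not just the COMMIT, since the conflict invalidates everything the
transaction read.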

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud
