[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Matthew Booth mbooth at redhat.com
Thu Feb 5 09:56:21 UTC 2015


On 04/02/15 19:04, Jay Pipes wrote:
> On 02/04/2015 12:05 PM, Sahid Orentino Ferdjaoui wrote:
>> On Wed, Feb 04, 2015 at 04:30:32PM +0000, Matthew Booth wrote:
>>> I've spent a few hours today reading about Galera, a clustering solution
>>> for MySQL. Galera provides multi-master 'virtually synchronous'
>>> replication between multiple mysql nodes. i.e. I can create a cluster of
>>> 3 mysql dbs and read and write from any of them with certain consistency
>>> guarantees.
>>>
>>> I am no expert[1], but this is a TL;DR of a couple of things which I
>>> didn't know, but feel I should have done. The semantics are important to
>>> application design, which is why we should all be aware of them.
>>>
>>>
>>> * Commit will fail if there is a replication conflict
>>>
>>> foo is a table with a single field, which is its primary key.
>>>
>>> A: start transaction;
>>> B: start transaction;
>>> A: insert into foo values(1);
>>> B: insert into foo values(1); <-- 'regular' DB would block here, and
>>>                                    report an error on A's commit
>>> A: commit; <-- success
>>> B: commit; <-- KABOOM
>>>
>>> Confusingly, Galera will report a 'deadlock' to node B, despite this not
>>> being a deadlock by any definition I'm familiar with.
> 
> It is a failure to certify the writeset, which bubbles up as an InnoDB
> deadlock error. See my article here:
> 
> http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/
> 
> 
> Which explains this.
> 
>> Yes ! and if I can add more information and I hope I do not make
>> mistake I think it's a know issue which comes from MySQL, that is why
>> we have a decorator to do a retry and so handle this case here:
>>
>>   
>> http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177
>>
> 
> It's not an issue with MySQL. It's an issue with any database code that
> is highly contentious.
> 
> Almost all highly distributed or concurrent applications need to handle
> deadlock issues, and the most common way to handle deadlock issues on
> database records is using a retry technique. There's nothing new about
> that with Galera.
> 
> The issue with our use of the @_retry_on_deadlock decorator is *not*
> that the retry decorator is not needed, but rather it is used too
> frequently. The compare-and-swap technique I describe in the article
> above dramatically* reduces the number of deadlocks that occur (and need
> to be handled by the @_retry_on_deadlock decorator) and dramatically
> reduces the contention over critical database sections.

I'm still confused as to how this code got there, though. We shouldn't
be hitting Galera lock contention (reported as deadlocks) if we're using
a single master, which I thought we were. Does this mean either:

A. There are deployments using multi-master?
B. These are really deadlocks?

If A, is this something we need to continue to support?

Thanks,

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



More information about the OpenStack-dev mailing list