[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Sahid Orentino Ferdjaoui sahid.ferdjaoui at redhat.com
Thu Feb 5 11:11:25 UTC 2015


On Thu, Feb 05, 2015 at 09:56:21AM +0000, Matthew Booth wrote:
> On 04/02/15 19:04, Jay Pipes wrote:
> > On 02/04/2015 12:05 PM, Sahid Orentino Ferdjaoui wrote:
> >> On Wed, Feb 04, 2015 at 04:30:32PM +0000, Matthew Booth wrote:
> >>> I've spent a few hours today reading about Galera, a clustering solution
> >>> for MySQL. Galera provides multi-master 'virtually synchronous'
> >>> replication between multiple mysql nodes. i.e. I can create a cluster of
> >>> 3 mysql dbs and read and write from any of them with certain consistency
> >>> guarantees.
> >>>
> >>> I am no expert[1], but this is a TL;DR of a couple of things which I
> >>> didn't know, but feel I should have done. The semantics are important to
> >>> application design, which is why we should all be aware of them.
> >>>
> >>>
> >>> * Commit will fail if there is a replication conflict
> >>>
> >>> foo is a table with a single field, which is its primary key.
> >>>
> >>> A: start transaction;
> >>> B: start transaction;
> >>> A: insert into foo values(1);
> >>> B: insert into foo values(1); <-- 'regular' DB would block here, and
> >>>                                    report an error on A's commit
> >>> A: commit; <-- success
> >>> B: commit; <-- KABOOM
> >>>
> >>> Confusingly, Galera will report a 'deadlock' to node B, despite this not
> >>> being a deadlock by any definition I'm familiar with.
> > 
> > It is a failure to certify the writeset, which bubbles up as an InnoDB
> > deadlock error. See my article here:
> > 
> > http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/
> > 
> > 
> > Which explains this.
> > 
> >> Yes ! and if I can add more information and I hope I do not make
> >> mistake I think it's a know issue which comes from MySQL, that is why
> >> we have a decorator to do a retry and so handle this case here:
> >>
> >>   
> >> http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177
> >>
> > 
> > It's not an issue with MySQL. It's an issue with any database code that
> > is highly contentious.

I wanted to speak about the term "deadlock" (which also looks to
surprise Matthew) used, I though it comes from MySQL. In our situation
it's not really a deadlock, just a locked sessions from A and so B needs
to retry ?

I believe a deadlock would be when a session A tries to read something
on table x.foo to update y.bar when B tries to read something on y.bar
to update x.foo - so when A acquires a lock to read x.foo, B acquires
a lock to read y.bar, then when A needs to acquire lock to update
y.bar it can not, then same thing for B with x.foo.

> > Almost all highly distributed or concurrent applications need to handle
> > deadlock issues, and the most common way to handle deadlock issues on
> > database records is using a retry technique. There's nothing new about
> > that with Galera.
> > 
> > The issue with our use of the @_retry_on_deadlock decorator is *not*
> > that the retry decorator is not needed, but rather it is used too
> > frequently. The compare-and-swap technique I describe in the article
> > above dramatically* reduces the number of deadlocks that occur (and need
> > to be handled by the @_retry_on_deadlock decorator) and dramatically
> > reduces the contention over critical database sections.

Thanks for these informations.

> I'm still confused as to how this code got there, though. We shouldn't
> be hitting Galera lock contention (reported as deadlocks) if we're using
> a single master, which I thought we were. Does this mean either:

I guess we can hit a lock contention even in single master.

> A. There are deployments using multi-master?
> B. These are really deadlocks?
> 
> If A, is this something we need to continue to support?
> 
> Thanks,
> 
> Matt
> -- 
> Matthew Booth
> Red Hat Engineering, Virtualisation Team
> 
> Phone: +442070094448 (UK)
> GPG ID:  D33C3490
> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



More information about the OpenStack-dev mailing list