[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Matthew Booth mbooth at redhat.com
Thu Feb 5 09:36:55 UTC 2015


On 04/02/15 17:05, Sahid Orentino Ferdjaoui wrote:
>> * Commit will fail if there is a replication conflict
>>
>> foo is a table with a single field, which is its primary key.
>>
>> A: start transaction;
>> B: start transaction;
>> A: insert into foo values(1);
>> B: insert into foo values(1); <-- 'regular' DB would block here, and
>>                                   report an error to B once A commits
>> A: commit; <-- success
>> B: commit; <-- KABOOM
>>
>> Confusingly, Galera will report a 'deadlock' to node B, despite this not
>> being a deadlock by any definition I'm familiar with.
> 
> Yes! If I can add more information (and I hope I am not making a
> mistake), I think it's a known issue that comes from MySQL, which is why
> we have a decorator to retry and so handle this case here:
> 
>   http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177

Right, and that remains a significant source of confusion and
obfuscation in the db api. Our db code is littered with races and
potential actual deadlocks, but only some functions are decorated. Are
they decorated because of real deadlocks, or because of Galera lock
contention? The solutions to those two problems are very different! Also,
hunting deadlocks is hard enough work. Adding the possibility that they
might not even be there is just evil.
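
To make the ambiguity concrete, here is a minimal sketch of what a
retry-on-deadlock decorator looks like in general. It is not the nova
code linked above: DeadlockError, retry_on_deadlock and insert_foo are
hypothetical names invented for this illustration, and the "database"
is simulated by a function that fails twice before succeeding.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for whatever 'deadlock' error the driver raises."""


def retry_on_deadlock(max_retries=5, delay=0.0):
    """Re-run the decorated function when it raises DeadlockError.

    Assumption: the decorated function wraps a whole transaction, so
    running it again is safe.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    # Give up once the retry budget is exhausted.
                    if attempt == max_retries:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


attempts = []


@retry_on_deadlock(max_retries=3)
def insert_foo(value):
    # Simulated transaction: the first two attempts hit a conflict.
    attempts.append(value)
    if len(attempts) < 3:
        raise DeadlockError("simulated conflict on commit")
    return "committed"


result = insert_foo(1)
```

Note that nothing in the decorator records why it was applied. Re-running
the whole transaction is a reasonable response to a Galera commit
conflict, whereas a real deadlock is usually better fixed at its source,
and a reader of the decorated function cannot tell which case it covers.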

Incidentally, we're currently looking to replace this stuff with some
new code in oslo.db, which is why I'm looking at it.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


