[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Jay Pipes jaypipes at gmail.com
Mon Feb 9 20:36:45 UTC 2015


On 02/09/2015 03:10 PM, Clint Byrum wrote:
> Excerpts from Jay Pipes's message of 2015-02-09 10:15:10 -0800:
>> On 02/09/2015 01:02 PM, Attila Fazekas wrote:
>>> I do not see why not to use `FOR UPDATE` even with multi-writer or
>>> Is the retry/swap way really solves anything here.
>> <snip>
>>> Am I missed something ?
>>
>> Yes. Galera does not replicate the (internal to InnnoDB) row-level locks
>> that are needed to support SELECT FOR UPDATE statements across multiple
>> cluster nodes.
>>
>> https://groups.google.com/forum/#!msg/codership-team/Au1jVFKQv8o/QYV_Z_t5YAEJ
>
> Attila acknowledged that. What Attila was saying was that by using it
> with Galera, the box that is doing the FOR UPDATE locks will simply fail
> upon commit because a conflicting commit has already happened and arrived
> from the node that accepted the write. Further what Attila is saying is
> that this means there is not such an obvious advantage to the CAS method,
> since the rollback and the # updated rows == 0 are effectively equivalent
> at this point, seeing as the prior commit has already arrived and thus
> will not need to wait to fail certification and be rolled back.

No, that is not correct. In the case of the CAS technique, the frequency 
of rollbacks due to certification failure is demonstrably less than when 
using SELECT FOR UPDATE and relying on the certification timeout error 
to signal a deadlock.

> I am not entirely certain that is true though, as I think what will
> happen in sequential order is:
>
> writer1: UPDATE books SET genre = 'Scifi' WHERE genre = 'sciencefiction';
> writer1: --> send in-progress update to cluster
> writer2: SELECT FOR UPDATE books WHERE id=3;
> writer1: COMMIT
> writer1: --> try to certify commit in cluster
> ** Here is where I stop knowing for sure what happens **
> writer2: certifies writer1's transaction or blocks?

It will certify writer1's transaction. It will only block another thread 
hitting writer2 requesting write locks or write-intent read locks on the 
same records.

> writer2: UPDATE books SET genre = 'sciencefiction' WHERE id=3;
> writer2: COMMIT --> One of them is rolled back.
>
> So, at that point where I'm not sure (please some Galera expert tell
> me):
>
> If what happens is as I suggest, writer1's transaction is certified,
> then that just means the lock sticks around blocking stuff on writer2,
> but that the data is updated and it is certain that writer2's commit will
> be rolled back. However, if it blocks waiting on the lock to resolve,
> then I'm at a loss to determine which transaction would be rolled back,
> but I am thinking that it makes sense that the transaction from writer2
> would be rolled back, because the commit is later.

That is correct. writer2's transaction would be rolled back. The 
difference is that the CAS method would NOT trigger a ROLLBACK. It would 
instead return 0 rows affected, because the UPDATE statement would 
instead look like this:

UPDATE books SET genre = 'sciencefiction' WHERE id = 3 AND genre = 'SciFi';

And the return of 0 rows affected would trigger a simple retry of the 
read and then update attempt on writer2 instead of dealing with ROLLBACK 
semantics on the transaction.

Note that in the CAS method, the SELECT statement and the UPDATE are in 
completely different transactions. This is a very important thing to 
keep in mind.

> All this to say that usually the reason for SELECT FOR UPDATE is not
> to only do an update (the transactional semantics handle that), but
> also to prevent the old row from being seen again, which, as Jay says,
> it cannot do.  So I believe you are both correct:
>
> * Attila, yes I think you're right that CAS is not any more efficient
> at replacing SELECT FOR UPDATE from a blocking standpoint.

It is more efficient because there are far fewer ROLLBACKs of 
transactions occurring in the system.

If you look at a slow query log (with a 0 slow query time) for a MySQL 
Galera server in a multi-write cluster during a run of Tempest or Rally, 
you will notice that the number of ROLLBACK statements is extraordinary. 
AFAICR, when Peter Boros and I benchmarked a Rally launch and delete 10K 
VM run, we saw nearly 11% of *total* queries executed against the server 
were ROLLBACKs. This, in my opinion, is the main reason that the CAS 
method will show as more efficient.

> * Jay, yes I think you're right that SELECT FOR UPDATE is not the right
> thing to use to do such reads, because one is relying on locks that are
> meaningless on a Galera cluster.
>
> Where I think the CAS ends up being the preferred method for this sort
> of thing is where one consideres that it won't hold a meaningless lock
> while the transaction is completed and then rolled back.

CAS is preferred because it is measurably faster and more 
obstruction-free than SELECT FOR UPDATE. A colleague of mine is almost 
ready to publish documentation showing a benchmark of this that shows 
nearly a 100% decrease in total amount of lock/wait time using CAS 
versus waiting for the coarser-level certification timeout to retry the 
transactions. As mentioned above, I believe this is due to the dramatic 
decrease in ROLLBACKs.

Best,
-jay



More information about the OpenStack-dev mailing list