[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Matthew Booth mbooth at redhat.com
Thu Feb 5 10:47:16 UTC 2015


On 05/02/15 04:30, Mike Bayer wrote:
>> Galera doesn't change anything here. I'm really not sure what the
>> fuss is about, frankly.
> 
> because we’re trying to get Galera to actually work as a load
> balanced cluster to some degree, at least for reads.

Yeah, the use case of concern here is consecutive RPC calls from a
single remote client, which can't reasonably be in the same database
transaction. This affects semantics visible to the end-user.

In Nova, they might do:

$ nova aggregate-create ...
$ nova aggregate-details ...

Should they expect that the second command might fail if they don't
pause long enough between the two? Should they retry until it succeeds?
This example is a toy, but I would expect to find many other, more
subtle examples.
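
To make the concern concrete, here is a minimal sketch in Python. It is
a toy model, not Galera or Nova code: ToyNode, create_aggregate and
aggregate_details are hypothetical names, and "pending" just stands in
for a write that has been replicated to a node but not yet applied
there.

```python
class ToyNode:
    """Hypothetical stand-in for one database node in a cluster."""

    def __init__(self):
        self.data = {}
        self.pending = []  # replicated writes not yet applied locally

    def apply_pending(self):
        # Apply every replicated write that was waiting in the queue.
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


node_a, node_b = ToyNode(), ToyNode()


def create_aggregate(name):
    # The write lands on node A; node B only has it queued.
    node_a.data[name] = {"name": name}
    node_b.pending.append((name, {"name": name}))


def aggregate_details(node, name):
    # A read served by whichever node the load balancer picked.
    return node.data.get(name)


create_aggregate("agg1")
stale = aggregate_details(node_b, "agg1")  # read before B has applied the write
node_b.apply_pending()
fresh = aggregate_details(node_b, "agg1")  # read after B has caught up
```

In this model the first read on node B finds nothing and the second
finds the aggregate, which is the "fails unless you pause" behaviour
described above.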

> Otherwise I’m not really sure why we have to bother with Galera at
> all.  If we just want a single MySQL server that has a warm standby
> for failover, why aren’t we just using that capability straight from
> MySQL.  Then we get “SELECT FOR UPDATE” and everything else back.

Actually I think this is a misconception. If I have understood
correctly[1], Galera *does* work with SELECT ... FOR UPDATE. Used on a
single node, it works exactly as normal, with blocking behaviour. Used
across 2 nodes, it will not block; instead, a transaction will fail on
commit if there was lock contention.
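
A rough sketch of the "fail on commit rather than block" behaviour, in
Python. This is a toy first-committer-wins model under my own
assumptions, not Galera's actual replication mechanism; ToyCluster,
ToyTxn and CommitConflict are hypothetical names.

```python
class CommitConflict(Exception):
    """Hypothetical error for a commit that lost a cross-node race."""


class ToyCluster:
    def __init__(self):
        self.rows = {}  # key -> (version, value)

    def begin(self):
        return ToyTxn(self)


class ToyTxn:
    def __init__(self, cluster):
        self.cluster = cluster
        self.seen = {}    # key -> row version when first touched
        self.writes = {}  # key -> new value, applied at commit

    def _version(self, key):
        return self.cluster.rows.get(key, (0, None))[0]

    def select_for_update(self, key):
        # No cluster-wide lock is taken, so this never blocks here.
        self.seen.setdefault(key, self._version(key))
        return self.cluster.rows.get(key, (0, None))[1]

    def update(self, key, value):
        self.seen.setdefault(key, self._version(key))
        self.writes[key] = value

    def commit(self):
        # Fail if any row we wrote was changed by someone else meanwhile.
        for key in self.writes:
            if self._version(key) != self.seen[key]:
                raise CommitConflict(key)
        for key, value in self.writes.items():
            self.cluster.rows[key] = (self._version(key) + 1, value)


cluster = ToyCluster()
cluster.rows["quota"] = (1, 10)
txn_a, txn_b = cluster.begin(), cluster.begin()
txn_a.select_for_update("quota")
txn_b.select_for_update("quota")  # does not block on txn_a
txn_a.update("quota", 9)
txn_b.update("quota", 8)
txn_a.commit()
try:
    txn_b.commit()
    outcome = "committed"
except CommitConflict:
    outcome = "conflict"
```

Both transactions get past the locking read, and the loser only finds
out when it tries to commit.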

> Galera’s “multi master” capability is already in the trash for us,
> and it seems like “multi-slave” is only marginally useful either, the
> vast majority of openstack has to be 100% pointed at just one node to
> work correctly.

It's not necessarily in the trash, but given that the semantics are
different (fail on commit rather than block) we'd need to do more work
to support them. It sounds to me like we want to defer that rather than
try to fix it now, i.e. multi-master is currently unsupport(ed|able).

We could add an additional decorator to enginefacade which would
re-execute a @writer block if it detected Galera lock contention.
However, given that we'd have to audit that code for other side-effects,
for the moment it sounds like it's safer to fail.
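
For illustration, a decorator along those lines might look roughly like
the following Python sketch. It is not the enginefacade API:
GaleraDeadlock, retry_on_deadlock and the example function are
hypothetical names, and the failure is simulated rather than coming
from a real database.

```python
import functools


class GaleraDeadlock(Exception):
    """Hypothetical stand-in for a commit that failed on lock contention."""


def retry_on_deadlock(max_attempts=3):
    """Re-execute the decorated block if it raises GaleraDeadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except GaleraDeadlock:
                    if attempt == max_attempts:
                        raise  # give up and let the caller see the failure
        return wrapper
    return decorator


side_effects = []  # records work done outside the transaction


@retry_on_deadlock(max_attempts=3)
def create_aggregate(name):
    # This line runs on every attempt, not just the successful one.
    side_effects.append("notify:" + name)
    if len(side_effects) < 3:
        raise GaleraDeadlock("simulated contention")
    return {"name": name}


result = create_aggregate("agg1")
```

Note that the notification is recorded three times for a single
successful call. That is exactly the kind of side-effect an audit would
have to look for before re-executing blocks automatically.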

Matt

[1] Standard caveats apply.
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


