[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Clint Byrum clint at fewbar.com
Wed Feb 4 22:04:29 UTC 2015


Excerpts from Joshua Harlow's message of 2015-02-04 13:24:20 -0800:
> How interesting,
> 
> Why are people using galera if it behaves like this? :-/
> 

Note that any true MVCC database will roll back transactions on
conflicts. One must always have a deadlock detection algorithm of
some kind.

Galera behaves like this because it is enormously costly to be synchronous
at all times for everything. So it is synchronous when you want it to be,
and async when you don't.
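
To make "synchronous when you want it to be" concrete, here is a rough
sketch of opting a single read into a causal wait. It assumes a Galera
version that supports the wsrep_sync_wait session variable (older
releases used wsrep_causal_reads instead), a DB-API style cursor, and
that the session default is 0. The helper name read_with_sync_wait is
made up for illustration.

```python
def read_with_sync_wait(cursor, sql, params=()):
    """Run one read after asking Galera to wait for replication to catch up.

    Assumptions: `cursor` is a DB-API style cursor connected to a Galera
    node, the server supports the `wsrep_sync_wait` session variable, and
    the session default for it is 0 (no waiting).
    """
    # Ask this node to apply all pending write-sets before the next read.
    cursor.execute("SET SESSION wsrep_sync_wait = 1")
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        # Go back to cheap, possibly stale reads for everything else.
        cursor.execute("SET SESSION wsrep_sync_wait = 0")
```

Only the reads that really need to see the latest writes pay the
latency cost; everything else on the connection stays asynchronous.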

Note that it's likely NDB (aka "MySQL Cluster") would work fairly well
for OpenStack's workloads, and does not suffer from this. However, it
requires low-latency, high-bandwidth links between all nodes (InfiniBand
recommended) or it will just plain suck. So Galera is the cheaper
option, and the one that is easier to tune and reason about.

> Are the people that are using it know/aware that this happens? :-/
> 

I think the problem really is that it has become somewhat of a de facto
choice, and is used without being tested. The gate doesn't set up a
three-node Galera db and test that OpenStack works right. Also, it is
inherently a race condition, and thus will be a hard one to test.

That's where having knowledge of it and taking time to engineer a
solution that makes sense is really the best course I can think of.
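
One common shape for such a solution is to retry the whole transaction
when the database rolls it back on a conflict. The following is only a
minimal sketch of that idea, not what any OpenStack project actually
does: DeadlockError and retry_on_deadlock are hypothetical names, and a
real version would catch whatever exception the driver raises for a
deadlock or a Galera certification failure.

```python
import functools
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock/conflict error."""


def retry_on_deadlock(max_retries=5, base_delay=0.05, sleep=time.sleep):
    """Re-run the decorated function when it raises DeadlockError.

    The decorated function must contain the *entire* transaction, so a
    retry starts from a clean state after the rollback.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return fn(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_retries:
                        raise  # out of retries, let the caller see it
                    # Exponential backoff with jitter, so writers that
                    # collided do not collide again in lockstep.
                    sleep(base_delay * (2 ** attempt) * random.random())
        return wrapper
    return decorator
```

Because the conflict is a race, retrying is usually cheap, but it only
works if the retried function has no side effects outside the
transaction.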
