[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

Gregory Haynes greg at greghaynes.net
Fri Feb 6 02:55:51 UTC 2015


Excerpts from Angus Lees's message of 2015-02-06 02:36:32 +0000:
> On Fri Feb 06 2015 at 12:59:13 PM Gregory Haynes <greg at greghaynes.net>
> wrote:
> 
> > Excerpts from Joshua Harlow's message of 2015-02-06 01:26:25 +0000:
> > > Angus Lees wrote:
> > > > On Fri Feb 06 2015 at 4:25:43 AM Clint Byrum <clint at fewbar.com
> > > > <mailto:clint at fewbar.com>> wrote:
> > > >     I'd also like to see consideration given to systems that handle
> > > >     distributed consistency in a more active manner. etcd and Zookeeper are
> > > >     both such systems, and might serve as efficient guards for critical
> > > >     sections without raising latency.
> > > >
> > > >
> > > > +1 for moving to such systems.  Then we can have a repeat of the above
> > > > conversation without the added complications of SQL semantics ;)
> > > >
> > >
> > > So just an fyi:
> > >
> > > http://docs.openstack.org/developer/tooz/ exists.
> > >
> > > Specifically:
> > >
> > > http://docs.openstack.org/developer/tooz/developers.html#tooz.coordination.CoordinationDriver.get_lock
> > >
> > > It provides a locking API (that plugs into the various backends);
> > > there is also a WIP etcd driver being worked on at
> > > https://review.openstack.org/#/c/151463/.
> > >
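For reference, a minimal sketch of what using that tooz locking API looks
like; the backend URL, member id, and lock name below are made up for
illustration:

    from tooz import coordination

    def do_critical_work():
        print("doing work under the cluster-wide lock")

    # Each service process registers itself as a coordination member.
    coordinator = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'worker-1')
    coordinator.start()

    # get_lock() returns a distributed lock from the configured driver;
    # the same name acquired from any member refers to the same lock.
    lock = coordinator.get_lock(b'my-critical-section')
    with lock:
        do_critical_work()

    coordinator.stop()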
> >
> > An interesting note about the etcd implementation is that you can
> > select per-request whether a read waits for quorum or not. In theory
> > this means you could get higher throughput for the majority of
> > operations that do not need strong consistency, and only pay the
> > quorum cost for the operations that do (e.g. locks).
> >
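As a concrete illustration of that per-request choice, here is a rough
sketch against the etcd v2 HTTP API (the endpoint and key are
illustrative):

    import requests

    ETCD = 'http://127.0.0.1:2379'

    # Fast path: may be served from possibly-stale local data.
    fast = requests.get(ETCD + '/v2/keys/some/key')

    # Slow path: quorum=true forces a linearized read through the
    # leader, which is what you'd want when e.g. checking a lock.
    safe = requests.get(ETCD + '/v2/keys/some/key',
                        params={'quorum': 'true'})

    print(fast.json(), safe.json())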
> 
> Along those lines and in an effort to be a bit less doom-and-gloom, I spent
> my lunch break trying to find non-marketing documentation on the Galera
> replication protocol and how it is exposed. (It was surprisingly difficult
> to find such information *)
> 
> It's easy to get the transaction ID of the last commit
> (wsrep_last_committed), but I can't find a way to wait until at least a
> particular transaction ID has been synced.  If we can find that latter
> functionality, then we can expose that sequencer all the way through
> (HTTP header?), and any follow-on command can reference the sequencer of
> the previous write whose effects it needs to see.
> 
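To make that concrete: since no native "wait for this seqno" call seems
to exist, the client side would have to approximate one, e.g. by polling.
Everything in this sketch (the helper, the polling loop, the header idea)
is hypothetical; only the wsrep_last_committed status variable is real:

    import time

    def wait_for_seqno(cursor, min_seqno, timeout=5.0, interval=0.05):
        """Block until this node has committed up to min_seqno."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            cursor.execute("SHOW STATUS LIKE 'wsrep_last_committed'")
            _, value = cursor.fetchone()
            if int(value) >= min_seqno:
                return
            time.sleep(interval)
        raise Exception('node still behind seqno %d' % min_seqno)

    # A write handler would return the post-commit wsrep_last_committed
    # value in a response header (say, X-Sequencer), and a follow-on
    # request carrying that header would call wait_for_seqno() on its
    # local node before reading.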
> In practice, this should lead to zero additional wait time, since the
> Galera replication has almost certainly already caught up by the time the
> second command comes in - and we can just read from the local server with
> no additional delay.
> 
> See the various *Index variables in the etcd API, for how the same idea
> gets used there.
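For comparison, the etcd version of this sequencer: every write response
reports the index it committed at, and a read can long-poll from that
index, so a client is guaranteed to observe the effects of its own
earlier write (endpoint and key below are illustrative):

    import requests

    ETCD = 'http://127.0.0.1:2379'

    resp = requests.put(ETCD + '/v2/keys/some/key', data={'value': 'v1'})
    index = resp.json()['node']['modifiedIndex']

    # waitIndex=N long-polls for the first event at index >= N, so this
    # returns as soon as the local view includes that write.
    requests.get(ETCD + '/v2/keys/some/key',
                 params={'wait': 'true', 'waitIndex': str(index)})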
> 
>  - Gus
> 
> (*) In case you're also curious, the only doc I found with any details was
> http://galeracluster.com/documentation-webpages/certificationbasedreplication.html
> and its sibling pages.

My fear with something like this is that it is already a very hard
problem to get correct, and this would add a fair amount of client-side
complexity to achieve it. There is also the issue that this would be a
Galera-specific solution, which means we'll be adding another dimension
to our feature testing matrix if we really want to support it.

IMO we *really* do not want to be in the business of writing distributed
locking systems, but rather should find a way to either not require them
or to rely on existing solutions.


