[openstack-dev] [Neutron] db-level locks, non-blocking algorithms, active/active DB clusters and IPAM

Clint Byrum clint at fewbar.com
Wed Feb 25 13:40:29 UTC 2015


Excerpts from Salvatore Orlando's message of 2015-02-23 04:07:38 -0800:
> Lazy-Stacker summary:
> I am doing some work on Neutron IPAM code for IP Allocation, and I need to
> find out whether it's better to use db locking queries (SELECT ... FOR UPDATE)
> or some sort of non-blocking algorithm.
> Some measures suggest that for this specific problem db-level locking is
> more efficient even when using multi-master DB clusters, which kind of
> counters recent findings by other contributors [2]... but also backs those
> from others [7].
> 

Thanks Salvatore, the story and data you produced are quite interesting.

> 
> With the test on the Galera cluster I was expecting a terrible slowdown in
> A-1 because of deadlocks caused by certification failures. I was extremely
> disappointed that the slowdown I measured however does not make any of the
> other algorithms a viable alternative.
> On the Galera cluster I did not run extensive collections for A-2. Indeed
> primary key violations seem to trigger db deadlocks because of failed write
> set certification too (but I have not yet tested this).
> I ran tests with 10 threads on each node, for a total of 30 workers. Some
> results are available at [15]. There was indeed a slowdown in A-1 (about
> 20%), whereas A-3 performance stayed pretty much constant. Regardless, A-1
> was still at least 3 times faster than A-3.
> As A-3's queries are mostly SELECTs (about 75% of them), use of caches might
> make it a lot faster; also the algorithm is probably inefficient and can be
> optimised in several areas. Still, I suspect it can be made faster than
> A-1. At this stage I am leaning towards adopting db-level locks with
> retries for Neutron's IPAM. However, since I never trust myself, I wonder
> if there is something important that I'm neglecting and that will hit me
> down the road.
> 

The thing is, nobody should actually be running blindly with writes
being sprayed out to all nodes in a Galera cluster. So A-1 won't slow
down _at all_ if you just use Galera as an ACTIVE/PASSIVE write master.
It won't scale any worse for writes, since all writes go to all nodes
anyway. For reads we can very easily start to identify hot-spot reads
that can be sent to all nodes and are tolerant of a few seconds of latency.
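
To make the routing idea concrete, here is a rough sketch of it in Python.
Everything in it is an assumption for illustration: the class name, the
`stale_ok` flag and the node names are hypothetical, and in a real
deployment this decision would more likely live in a load balancer than
in application code.

```python
import itertools


class ActivePassiveRouter:
    """Hypothetical helper that picks a Galera node for each query."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)
        # Assumption: the first node is the single designated write master.
        self.writer = self.nodes[0]
        # Round-robin iterator used only for reads that tolerate staleness.
        self._readers = itertools.cycle(self.nodes)

    def node_for(self, is_write, stale_ok=False):
        """Return the node that should serve this query."""
        # Writes, and reads that must see the latest data, go to the writer.
        if is_write or not stale_ok:
            return self.writer
        # Hot-spot reads that tolerate a little latency are spread out.
        return next(self._readers)
```

With this shape, a locking query such as SELECT ... FOR UPDATE would always
be sent via `node_for(is_write=True)`, so it never competes with writers on
other nodes.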

> In the medium term, there are a few things we might consider for Neutron's
> "built-in IPAM".
> 1) Move the allocation logic out of the driver, thus making IPAM an
> independent service. The API workers will then communicate with the IPAM
> service through a message bus, where IP allocation requests will be
> "naturally serialized".

This would rely on said message bus guaranteeing ordered delivery. That
is going to scale far worse, and be more complicated to maintain, than
Galera with a few retries on failover.
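
For comparison, "a few retries" can be as small as the sketch below. It
is a minimal illustration, not Neutron code: `RetriableDBError` and
`run_with_retries` are hypothetical names, and a real caller would map
the driver's actual deadlock or lost-connection errors onto something
like them.

```python
import time


class RetriableDBError(Exception):
    """Hypothetical stand-in for a deadlock or lost-connection error."""


def run_with_retries(operation, max_attempts=3, delay=0.0):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RetriableDBError as exc:
            last_error = exc
            if attempt < max_attempts:
                # Linear backoff; with the default delay of 0.0 this
                # retries immediately.
                time.sleep(delay * attempt)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

The operation passed in would be the whole transaction (lock, allocate,
commit), so that a retry starts again from a clean state.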

> 2) Use third-party software such as dogpile, zookeeper or even memcached to
> implement distributed coordination. I have nothing against it, and I reckon
> Neutron can only benefit from it (in case you're considering arguing that
> "it does not scale", please also provide solid arguments to support your
> claim!). Nevertheless, I do believe API request processing should proceed
> undisturbed as much as possible. If processing an API request requires
> distributed coordination among several components then it probably means
> that an asynchronous paradigm is more suitable for that API request.
> 

If we all decide that having a load balancer sending all writes and
reads to one Galera node is not acceptable for some reason, then we
should consider a distributed locking method that might scale better,
like ZK/etcd or the like. But I think just figuring out why we want to
send all writes and reads to all nodes is a better short/medium term
goal.


