[openstack-dev] [Cinder] A possible solution for HA Active-Active

Joshua Harlow harlowja at outlook.com
Fri Jul 31 23:12:33 UTC 2015

Clint Byrum wrote:
> Excerpts from Mike Perez's message of 2015-07-31 10:40:04 -0700:
>> On Fri, Jul 31, 2015 at 8:56 AM, Joshua Harlow<harlowja at outlook.com>  wrote:
>>> ...random thought here, skip as needed... in all honesty orchestration
>>> solutions like mesos
>>> (http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
>>> map-reduce solutions like hadoop, stream processing systems like apache
>>> storm (...), are already using zookeeper and I'm not saying we should just
>>> use it cause they are, but the likelihood that they just picked it for no
>>> reason is imho slim.
>> I'd really like to see focus cross project. I don't want Ceilometer to
>> depend on ZooKeeper, Cinder to depend on etcd, etc. This is not ideal
>> for an operator to have to deploy, learn and maintain each of these
>> solutions.
>> I think this is difficult when you consider everyone wants options of
>> their preferred DLM. If we went this route, we should pick one.
>> Regardless, I want to know if we really need a DLM. Does Ceilometer
>> really need a DLM? Does Cinder really need a DLM? Can we just use a
>> hash ring solution where operators don't even have to know or care
>> about deploying a DLM and running multiple instances of Cinder manager
>> just works?
> So in the Ironic case, if two conductors decide they both own one IPMI
> controller, _chaos_ can ensue. They may, at different times, read that
> the power is up, or down, and issue power control commands that may take
> many seconds, so the other conductor's next status check may cause it to
> react by reversing the change, and they'll just fight over the node in a
> tug-o-war fashion.
> Oh wait, except, that's not true. Instead, they use the database as a
> locking mechanism, and AFAIK, no nodes have been torn limb from limb by
> two conductors thus far.
> But, a DLM would be more efficient, and actually simplify failure
> recovery for Ironic's operators. The database locks suffer from being a
> little too conservative, and sometimes you just have to go into the DB
> and delete a lock after something explodes (this was true 6 months ago,
> it may have better automation now, I don't know).

A data point, using kazoo and zk-shell (a Python library and a Python
ZooKeeper shell-like interface, respectively), just to show how much
introspection can be done with ZooKeeper when a kazoo lock is created
(tooz locks, when used with ZooKeeper, use this same/similar code).

(session #1)

 >>> from kazoo import client
 >>> c = client.KazooClient()
 >>> c.start()
 >>> lk = c.Lock('/resourceX')
 >>> lk.acquire()

(session #2)

$ zk-shell
Welcome to zk-shell (1.1.0)
(DISCONNECTED) /> connect localhost:2181
(CONNECTED) /> ls /resourceX
(CONNECTED) /> stat /resourceX/

### back in session #1, release the lock with lk.release(); then, in session #2:

(CONNECTED) /> ls /resourceX/


The above shows creation times, who is waiting on the lock, modification 
times, the owner.... Anyways I digress, if anyone really wants to know 
more about zookeeper let me know or drop into the #zookeeper channel on 
freenode (I'm one of the core maintainers of kazoo).
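
The same kind of introspection can also be done programmatically. What
follows is only a minimal sketch, not kazoo's or tooz's own tooling: the
helper name describe_lock is made up for illustration, and it assumes the
plain exclusive kazoo lock recipe, where each contender is an ephemeral
sequential child znode under the lock path (name ending in a 10-digit
sequence number), the znode data is the contender's identifier, and the
lowest sequence number holds the lock.

```python
from datetime import datetime, timezone


def _iso(ms):
    """Convert ZooKeeper's milliseconds-since-epoch to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).isoformat()


def describe_lock(client, path):
    """Summarize the contender znodes under a lock path.

    `client` is expected to behave like a started kazoo KazooClient
    (only get_children() and get() are used). Hypothetical helper,
    assuming the exclusive-lock layout described above.
    """
    # Order by the trailing 10-digit sequence number, not by the name prefix.
    children = sorted(client.get_children(path), key=lambda name: name[-10:])
    rows = []
    for index, child in enumerate(children):
        data, stat = client.get(f"{path}/{child}")
        rows.append({
            "node": child,
            "identifier": data.decode("utf-8"),
            "created": _iso(stat.ctime),
            "modified": _iso(stat.mtime),
            # Session id of the client that owns this ephemeral znode.
            "owner_session": stat.ephemeralOwner,
            # Assumption: lowest sequence number is the current holder.
            "holds_lock": index == 0,
        })
    return rows
```

With the lock from session #1 still held, calling describe_lock(c,
'/resourceX') on a started KazooClient should give one row per contender,
holder first. Passing the client in, rather than creating it inside the
helper, keeps the sketch usable with whatever connection a service
already has.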


> Anyway, I'm all for the simplest possible solution. But, don't make it
> _too_ simple.
