[openstack-dev] [Heat] Locking and ZooKeeper - a space odyssey

Joshua Harlow harlowja at yahoo-inc.com
Thu Oct 31 17:10:08 UTC 2013


I'm pretty sure the cat's out of the bag.

https://github.com/openstack/requirements/blob/master/global-requirements.txt#L29

https://kazoo.readthedocs.org/en/latest/
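
For anyone who hasn't looked at kazoo, here is a rough sketch of what a
per-stack lock built on its Lock recipe might look like. This is not the
code from the review under discussion. The names connect and stack_lock,
the /heat/stack-locks/ path, and the host address are all made up for
illustration; KazooClient and client.Lock(path, identifier) are the actual
kazoo calls.

```python
from contextlib import contextmanager


def connect(hosts="127.0.0.1:2181"):
    """Open a ZooKeeper session (needs kazoo installed and a reachable server)."""
    # Imported lazily so the helper below can be exercised without kazoo.
    from kazoo.client import KazooClient

    client = KazooClient(hosts=hosts)  # host address is an assumption
    client.start()
    return client


@contextmanager
def stack_lock(client, stack_id, engine_id):
    """Hold an exclusive lock on one stack for the duration of the block."""
    # client.Lock is kazoo's lock recipe. The lock path is hypothetical.
    # The recipe uses ephemeral znodes, so ZooKeeper removes the lock when
    # the holder's session expires.
    lock = client.Lock("/heat/stack-locks/%s" % stack_id, engine_id)
    with lock:  # acquires on entry, releases on exit
        yield lock


# Usage (requires a running ZooKeeper):
#
#     client = connect()
#     with stack_lock(client, "stack-1234", "engine-a"):
#         ...  # run the stack action
#     client.stop()
```

Because the lock node is ephemeral, a crashed engine's lock goes away once
its ZooKeeper session times out, which is the dead-holder detection that
the SQL driver has trouble with.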

-Josh

On 10/31/13 7:43 AM, "Monty Taylor" <mordred at inaugust.com> wrote:

>
>
>On 10/30/2013 10:42 AM, Clint Byrum wrote:
>> So, recently we've had quite a long thread in gerrit regarding locking
>> in Heat:
>> 
>> https://review.openstack.org/#/c/49440/
>> 
>> In the patch, there are two distributed lock drivers. One uses SQL,
>> and suffers from all the problems you might imagine a SQL-based locking
>> system would. It is extremely hard to detect lock holders that have
>> died, so we end up with really long timeouts. The other is ZooKeeper.
>> 
>> I'm on record as saying we're not using ZooKeeper. It is a little
>> embarrassing to have taken such a position without really thinking
>> things through. The main reason I feel this way though, is not because
>> ZooKeeper wouldn't work for locking, but because I think locking is a
>> mistake.
>> 
>> The current multi-engine paradigm has a race condition. If you have a
>> stack action going on, the state is held in the engine itself, and not
>> in the database, so if another engine starts working on another action,
>> they will conflict.
>> 
>> The locking paradigm is meant to prevent this. But I think this is a
>> huge mistake.
>> 
>> The engine should store _all_ of its state in a distributed data store
>> of some kind. Any engine should be aware of what is already happening
>> with the stack from this state and act accordingly. That includes the
>> engine currently working on actions. When viewed through this lens,
>> to me, locking is a poor excuse for serializing the state of the engine
>> scheduler.
>> 
>> It feels like TaskFlow is the answer, with an eye for making sure
>> TaskFlow can be made to work with distributed state. I am not well
>> versed on TaskFlow's details though, so I may be wrong. It worries me
>> that TaskFlow has existed a while and doesn't seem to be solving real
>> problems, but maybe I'm wrong and it is actually in use already.
>> 
>> Anyway, as a band-aid, we may _have_ to do locking. For that, ZooKeeper
>> has some real advantages over using the database. But there is hesitance
>> because it is not widely supported in OpenStack. What say you, OpenStack
>> community? Should we keep ZooKeeper out of our... zoo?
>
>Yes. I'm strongly opposed to ZooKeeper finding its way into the already
>complex pile of things we use.



