[openstack-dev] [Heat] Locking and ZooKeeper - a space oddysey

Monty Taylor mordred at inaugust.com
Thu Oct 31 14:43:52 UTC 2013



On 10/30/2013 10:42 AM, Clint Byrum wrote:
> So, recently we've had quite a long thread in gerrit regarding locking
> in Heat:
> 
> https://review.openstack.org/#/c/49440/
> 
> In the patch, there are two distributed lock drivers. One uses SQL,
> and suffers from all the problems you might imagine a SQL-based locking
> system would. It is extremely hard to detect dead lock holders, so we
> end up with really long timeouts. The other is ZooKeeper.
> 
> I'm on record as saying we're not using ZooKeeper. It is a little
> embarrassing to have taken such a position without really thinking things
> through. The main reason I feel this way, though, is not because ZooKeeper
> wouldn't work for locking, but because I think locking is a mistake.
> 
> The current multi-engine paradigm has a race condition. If you have a
> stack action going on, the state is held in the engine itself, and not
> in the database, so if another engine starts working on another action,
> they will conflict.
> 
> The locking paradigm is meant to prevent this. But I think this is a
> huge mistake.
> 
> The engine should store _all_ of its state in a distributed data store
> of some kind. Any engine should be aware of what is already happening
> with the stack from this state and act accordingly. That includes the
> engine currently working on actions. When viewed through this lens,
> to me, locking is a poor excuse for serializing the state of the engine
> scheduler.
> 
> It feels like TaskFlow is the answer, with an eye for making sure
> TaskFlow can be made to work with distributed state. I am not well
> versed on TaskFlow's details though, so I may be wrong. It worries me
> that TaskFlow has existed a while and doesn't seem to be solving real
> problems, but maybe I'm wrong and it is actually in use already.
> 
> Anyway, as a band-aid, we may _have_ to do locking. For that, ZooKeeper
> has some real advantages over using the database. But there is hesitance
> because it is not widely supported in OpenStack. What say you, OpenStack
> community? Should we keep ZooKeeper out of our... zoo?

Yes. I'm strongly opposed to ZooKeeper finding its way into the already
complex pile of things we use.
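
For anyone who hasn't followed the review, here is a rough sketch of why
the SQL lock driver in the quoted message ends up with long timeouts.
This is not the code from the patch: the table name, column names,
function names and the 600 second timeout are all made up for
illustration, and sqlite3 stands in for the real database. The point is
that a row in a table can't tell you whether its owner is still alive,
so the only way to recover from a crashed engine is to wait until the
row is "old enough" to steal.

```python
import sqlite3

LOCK_TIMEOUT = 600  # seconds; hypothetical value, not taken from the patch


def make_db():
    """Create an in-memory database with a hypothetical lock table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE stack_lock ("
        " stack_id TEXT PRIMARY KEY,"
        " engine_id TEXT NOT NULL,"
        " acquired_at REAL NOT NULL)"
    )
    return db


def acquire(db, stack_id, engine_id, now, timeout=LOCK_TIMEOUT):
    """Try to take the lock on stack_id; return True on success."""
    with db:  # commits on normal exit, rolls back on an uncaught error
        # A holder is only presumed dead once its row is older than the
        # timeout. There is no other signal available in the table.
        db.execute(
            "DELETE FROM stack_lock WHERE stack_id = ? AND acquired_at <= ?",
            (stack_id, now - timeout),
        )
        try:
            db.execute(
                "INSERT INTO stack_lock VALUES (?, ?, ?)",
                (stack_id, engine_id, now),
            )
        except sqlite3.IntegrityError:
            # The primary key already exists: someone else holds the lock.
            return False
    return True


def release(db, stack_id, engine_id):
    """Drop the lock, but only if this engine is the holder."""
    with db:
        db.execute(
            "DELETE FROM stack_lock WHERE stack_id = ? AND engine_id = ?",
            (stack_id, engine_id),
        )


db = make_db()
first = acquire(db, "stack-1", "engine-a", now=0.0)
# engine-a crashes here and never calls release().
too_early = acquire(db, "stack-1", "engine-b", now=599.0)
after_timeout = acquire(db, "stack-1", "engine-b", now=600.0)
print(first, too_early, after_timeout)
```

In this sketch engine-b is locked out for the full timeout even though
engine-a died right away. Shorten the timeout and you risk stealing the
lock from an engine that is just slow. ZooKeeper avoids that particular
problem because an ephemeral node goes away when the holder's session
ends, which is the advantage the quoted message refers to.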
