[openstack-dev] [Heat] Locking and ZooKeeper - a space odyssey

Joshua Harlow harlowja at yahoo-inc.com
Wed Oct 30 18:23:16 UTC 2013


For taskflow usage/details: https://wiki.openstack.org/wiki/TaskFlow

Let me know if the documentation there does not sufficiently explain
how it is useful, so that I can make it clearer (to resolve the 'versed on
TaskFlow's details' part). I have tried to define the use-cases that it
can help solve. In fact, the engine design you just described as a desired
thing for heat is what taskflow is doing: saving _all_ of its state in a
distributed store (a database right now). So there is a lot in common
between what you _want_ and what taskflow actually has in its 0.1 release
(https://pypi.python.org/pypi/taskflow).
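
To make the "save _all_ state in a store" idea concrete, here is a minimal
sketch using only the Python standard library. It is not taskflow's API or
schema: the task_state table, open_store(), run_flow() and the fail_after
parameter are all hypothetical names made up for this illustration. It only
shows that once every task result is committed to a store, a second engine
can pick up an interrupted flow from where the first one stopped.

```python
import sqlite3


def open_store(conn):
    # Hypothetical schema: one row per (flow, task) with its last known state.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_state ("
        " flow_id TEXT, task TEXT, state TEXT, result TEXT,"
        " PRIMARY KEY (flow_id, task))"
    )
    conn.commit()


def run_flow(conn, flow_id, tasks, fail_after=None):
    """Run (name, callable) tasks in order, skipping those already stored
    as SUCCESS. fail_after simulates an engine dying after N tasks."""
    executed = []
    for name, func in tasks:
        row = conn.execute(
            "SELECT state FROM task_state WHERE flow_id = ? AND task = ?",
            (flow_id, name),
        ).fetchone()
        if row is not None and row[0] == "SUCCESS":
            continue  # another engine (or an earlier run) already did this
        if fail_after is not None and len(executed) >= fail_after:
            raise RuntimeError("simulated engine crash")
        result = func()
        # Commit the result before moving on, so no state lives only in memory.
        conn.execute(
            "INSERT OR REPLACE INTO task_state VALUES (?, ?, ?, ?)",
            (flow_id, name, "SUCCESS", str(result)),
        )
        conn.commit()
        executed.append(name)
    return executed
```

Running the same flow id again after a simulated crash executes only the
tasks that were never recorded, which is the property being asked for: the
engine's progress is readable from the store rather than from one process.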

More documentation related to this:

- https://wiki.openstack.org/wiki/TaskFlow/Persistence (*)
- https://wiki.openstack.org/wiki/TaskFlow/Engines
- https://wiki.openstack.org/wiki/TaskFlow/Inputs_and_Outputs (*)

So let me know if the docs there do not go into the detail you
want, and I can help there.

As for actual usage: taskflow is ~5.2 months old (about the same age
as the heat engine code), so your concern about usage is valid, and I am
working at the summit to spread awareness and gain more usage. Work is
ongoing in nova as we speak, and cinder has a version in havana that
is being used for its complete create_volume workflow, one of the key
workflows there. So that's tremendous progress imho: a library like
taskflow has a 'stable' 0.1 version and also got usage in havana
(while the library itself was being created).

That's amazing to me, and I and others are pretty proud of that :-)

Feel free to join in some of the HK sessions that I will have.

Design sessions:

- http://icehousedesignsummit.sched.org/event/1ec7d73aba03ad0b95bd8de631c623cb#.Um83SxBWlgc
- http://icehousedesignsummit.sched.org/event/ced7d22ac4c037f102b3cf3ade553104#.Um83YxBWlgc
- http://icehousedesignsummit.sched.org/event/c31c81de71c25333b876b0da2f430f50#.Um83bRBWlgc
- http://icehousedesignsummit.sched.org/event/5fc501fadd4faed52556ed700c39e5f2#.Um83dhBWlgc

Speaker sessions:

- http://openstacksummitnovember2013.sched.org/event/29f1f996b36aaf0febc5d43b6f53f2a4#.Um83phBWlgc


On 10/30/13 10:42 AM, "Clint Byrum" <clint at fewbar.com> wrote:

>So, recently we've had quite a long thread in gerrit regarding locking
>in Heat:
>
>https://review.openstack.org/#/c/49440/
>
>In the patch, there are two distributed lock drivers. One uses SQL,
>and suffers from all the problems you might imagine a SQL based locking
>system would. It is extremely hard to detect dead lock holders, so we
>end up with really long timeouts. The other is ZooKeeper.
>
>I'm on record as saying we're not using ZooKeeper. It is a little
>embarrassing to have taken such a position without really thinking things
>through. The main reason I feel this way though, is not because ZooKeeper
>wouldn't work for locking, but because I think locking is a mistake.
>
>The current multi-engine paradigm has a race condition. If you have a
>stack action going on, the state is held in the engine itself, and not
>in the database, so if another engine starts working on another action,
>they will conflict.
>
>The locking paradigm is meant to prevent this. But I think this is a
>huge mistake.
>
>The engine should store _all_ of its state in a distributed data store
>of some kind. Any engine should be aware of what is already happening
>with the stack from this state and act accordingly. That includes the
>engine currently working on actions. When viewed through this lens,
>to me, locking is a poor excuse for serializing the state of the engine
>scheduler.
>
>It feels like TaskFlow is the answer, with an eye for making sure
>TaskFlow can be made to work with distributed state. I am not well
>versed on TaskFlow's details though, so I may be wrong. It worries me
>that TaskFlow has existed a while and doesn't seem to be solving real
>problems, but maybe I'm wrong and it is actually in use already.
>
>Anyway, as a band-aid, we may _have_ to do locking. For that, ZooKeeper
>has some real advantages over using the database. But there is hesitance
>because it is not widely supported in OpenStack. What say you, OpenStack
>community? Should we keep ZooKeeper out of our.. zoo?
