[openstack-dev] [Mistral] How Mistral handling long running delegate tasks

Ivan Melnikov imelnikov at griddynamics.com
Thu Apr 3 19:04:17 UTC 2014


I'm trying to catch up on this rather long and interesting discussion,
sorry for the somewhat late reply.

I can see three aspects of 'lazy model' support in TaskFlow:
- how tasks are executed and reverted
- how flows are run
- how the engine works internally

Let me address those aspects separately.

== Executing and reverting tasks ==

I think that should be done via a different interface than running a flow
(or scheduling it to run), as it is a completely different thing. In
current TaskFlow this interface is called the task executor:
https://github.com/openstack/taskflow/blob/master/taskflow/engines/action_engine/executor.py#L57

That is actually how our WorkerBasedEngine was implemented: it's the
same engine with a special task executor that schedules tasks on workers
instead of running task code locally.

Task executors are not aware of flows by design; all they do is
execute and revert tasks. That means that task executors can be
easily shared between engines if that's wanted.

The current TaskExecutorBase interface uses futures (PEP 3148-like). When I
proposed it, futures looked like a good tool for the task at hand (see
e.g. the async task etherpad:
https://etherpad.openstack.org/p/async-taskflow-tasks).
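
To make the shape of such a futures-based interface concrete, here is a
minimal sketch using only the Python standard library. It is not the
actual TaskFlow code: the class name SketchTaskExecutor, its method
names and the (task, event, result) tuple are assumptions made for
illustration, and a task is assumed to be any object with
execute(**kwargs) and revert(**kwargs) methods.

```python
from concurrent import futures

EXECUTED = 'executed'
REVERTED = 'reverted'


class SketchTaskExecutor(object):
    """Hypothetical flow-agnostic executor: it only runs and reverts tasks."""

    def __init__(self, max_workers=2):
        self._pool = futures.ThreadPoolExecutor(max_workers=max_workers)

    def execute_task(self, task, arguments):
        # Returns a future; the engine decides what to do with the result.
        return self._pool.submit(
            self._call, task, task.execute, EXECUTED, arguments)

    def revert_task(self, task, arguments):
        return self._pool.submit(
            self._call, task, task.revert, REVERTED, arguments)

    @staticmethod
    def _call(task, method, event, arguments):
        # Tag the result so the caller can tell which task and action it is.
        return task, event, method(**arguments)

    def wait_for_any(self, fs, timeout=None):
        # Block until at least one of the given futures has completed.
        return futures.wait(fs, timeout=timeout,
                            return_when=futures.FIRST_COMPLETED)

    def stop(self):
        self._pool.shutdown(wait=True)


class AddTask(object):
    """Toy task used only to exercise the sketch."""

    def execute(self, x, y):
        return x + y

    def revert(self, x, y):
        return None
```

Note that nothing in the executor refers to a flow, so one instance
could serve several engines at once.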

Now it may be time to reconsider that: having one future object per
running task may become a scalability issue. It may be worth using
callbacks instead. It should not be too hard to refactor the current engine
for that. Also, as TaskExecutorBase is an internal API, there should not
be any compatibility issues.
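
As a rough illustration of what a callback-based variant could look
like, here is a standard-library-only sketch. All names in it
(CallbackTaskExecutor, the on_done(task, result, failure) signature)
are hypothetical and not part of TaskFlow; a single background thread
stands in for whatever would really run the tasks.

```python
import queue
import threading


class CallbackTaskExecutor(object):
    """Hypothetical executor that reports results via callbacks."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._loop)
        self._worker.daemon = True
        self._worker.start()

    def execute_task(self, task, arguments, on_done):
        # Nothing is returned: the caller is notified through on_done,
        # so no per-task future object has to be kept around.
        self._requests.put((task, arguments, on_done))

    def _loop(self):
        while True:
            request = self._requests.get()
            if request is None:  # sentinel put by stop()
                break
            task, arguments, on_done = request
            try:
                result, failure = task.execute(**arguments), None
            except Exception as exc:
                result, failure = None, exc
            on_done(task, result, failure)

    def stop(self):
        # Requests queued before stop() are still processed first.
        self._requests.put(None)
        self._worker.join()
```

The callback runs on the executor's thread in this sketch, so an engine
using it would need to hand the result back to its own loop in a
thread-safe way.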

Then, we can make the task executor interface public and allow clients to
provide their own task executors. It will then be possible for Mistral
to implement its own task executor, or several, and share the
executors between all the engine instances.

You can call it a plan;)

== Running flows ==

To run a flow, a TaskFlow client uses the engine interface; there are
also a few helper functions provided for convenience:

http://docs.openstack.org/developer/taskflow/engines.html#module-taskflow.engines.base
http://docs.openstack.org/developer/taskflow/engines.html#creating-engines

That is part of our public API; it is stable and good enough. Basically,
I don't think this API needs any major change.

Maybe it is worth adding a function or method to schedule a flow to run
without actually waiting for its completion (at least, it has been on my
top secret TODO list for quite a long time).
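
One possible shape for such a helper, sketched with the standard
library only. The schedule() function is hypothetical and not part of
TaskFlow; the only thing assumed about the engine is that it has a
blocking run() method, and FakeEngine is just a stand-in so the sketch
is self-contained.

```python
from concurrent import futures

# Shared pool for background runs; the size here is an arbitrary choice.
_pool = futures.ThreadPoolExecutor(max_workers=4)


def schedule(engine):
    """Hypothetical helper: start engine.run() without waiting for it.

    Returns a future that completes when the flow has finished.
    """
    return _pool.submit(engine.run)


class FakeEngine(object):
    """Stand-in for a real engine; only the blocking run() matters."""

    def __init__(self):
        self.finished = False

    def run(self):
        self.finished = True
```

The caller gets control back immediately and can check the returned
future later, or never, if it only cares that the flow was started.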

== Engine internals ==

Each engine eats resources, like the thread it runs on; using these
resources to run only one flow is somewhat wasteful. Some work is
already planned to address this situation (see e.g.
https://blueprints.launchpad.net/taskflow/+spec/share-engine-thread).
Also, it might be a good idea to implement a different 'type' of engine
to support the 'lazy' model, as Joshua suggests.

But whatever should and will be done about it, I daresay all that work
can be done without affecting the API more than I described above.

-- 
WBR,
Ivan A. Melnikov

... tasks must flow ...


On 02.04.2014 01:51, Dmitri Zimine wrote:
> Even more responses inline :)
[...]


