[openstack-dev] [Mistral] How Mistral handling long running delegate tasks

Renat Akhmerov rakhmerov at mirantis.com
Wed Apr 2 04:30:37 UTC 2014


On 02 Apr 2014, at 05:45, Joshua Harlow <harlowja at yahoo-inc.com> wrote:

> Possibly, although to me this is still exposing the internal of engines to users who shouldn't care (or only care if they are specifying an engine type that gives them access to these details). Allowing public access to these APIs worries me in that they are now the API (which goes back to having an engine type that exposes these, if that’s desired, and if we are willing to accept the consequences of exposing them).

Given everything we have discussed so far, calling it “internal of engines” is no longer correct. If in some cases we know that only workers will call this API method, and we need to protect the workflow execution from accidental calls by third parties, I believe there are a million ways to solve this. The simplest one that comes to mind is to pass a generated token that confirms the caller's authority to perform the operation.
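
To make the token idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the class TaskTokenRegistry and its methods are hypothetical names, not part of Mistral, and the in-memory dict stands in for whatever persistent storage an engine would actually use.

```python
import hmac
import secrets


class TaskTokenRegistry:
    """Hypothetical helper: issues a one-time token per task and checks it."""

    def __init__(self):
        # task_id -> token; an in-memory stand-in for persistent storage.
        self._tokens = {}

    def issue(self, task_id):
        # Generate an unguessable token and hand it to the worker with the task.
        token = secrets.token_urlsafe(32)
        self._tokens[task_id] = token
        return token

    def confirm(self, task_id, presented):
        # Accept the call only if the presented token matches the issued one.
        expected = self._tokens.get(task_id)
        if expected is None:
            return False
        # Constant-time comparison on bytes avoids leaking timing information.
        if not hmac.compare_digest(expected.encode(), presented.encode()):
            return False
        # Tokens are single use: a second report for the same task is rejected.
        del self._tokens[task_id]
        return True
```

The engine would call issue() when it hands a task to a worker, and the API method that reports the task result would call confirm() before touching the workflow execution.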

>>> Who responds to the timeout though? Isn't that process the watchdog then? Likely the triggering of a timeout causes something to react (in both cases).
>> The workflow engine IS this watchdog (and in Mistral, the engine is a single manager for all flow executions; in the prototype we call it engine_manager and I hate this name :)). The engine lives in process or as a separate process. And it is passive: it is executed in the thread of the caller. E.g., in process it is called either on a messaging event handler thread or by a web method thread.
> 
> 
> Right, which in Mistral is a web server, right (aka, wherever Mistral is set up), since the tasks finish by calling a REST endpoint (or something sitting on MQ?)?

Not exactly right. Currently it’s a web server, but we’re about to decouple the engine from the API server. Most of the work is done. The engine is supposed to listen to a queue, and there may be any number of engines since they are stateless, so whatever is behind the web server can be scaled as needed. And actually the web server tier can be scaled easily too (assuming we have a load balancer in place).
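
As a rough illustration of why stateless engines scale, here is a small Python sketch. Everything in it is an assumption made for the example: ExecutionStore, engine_loop and run_engines are hypothetical names, an in-process queue.Queue stands in for the real message queue, and threads stand in for separate engine processes.

```python
import queue
import threading


class ExecutionStore:
    """Hypothetical shared store (stand-in for a database); engines keep no state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def set_state(self, task_id, state):
        with self._lock:
            self._states[task_id] = state

    def snapshot(self):
        with self._lock:
            return dict(self._states)


def engine_loop(events, store):
    # Each engine only reacts to events; all state lives in the shared store.
    while True:
        event = events.get()
        if event is None:  # sentinel: shut this engine down
            break
        task_id, result = event
        store.set_state(task_id, "SUCCESS" if result == "ok" else "ERROR")


def run_engines(task_events, engine_count):
    events = queue.Queue()
    store = ExecutionStore()
    engines = [
        threading.Thread(target=engine_loop, args=(events, store))
        for _ in range(engine_count)
    ]
    for engine in engines:
        engine.start()
    for event in task_events:
        events.put(event)
    for _ in engines:
        events.put(None)  # one sentinel per engine
    for engine in engines:
        engine.join()
    return store.snapshot()
```

Because no engine holds state of its own, the final states in the store are the same whether one engine or several consume the queue.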

>> That may be good enough: when DSL is translated to flow, and the task demands repetition with timeout, it's ok to do this trick under the hood when compiling a flow. 
>> flow.add(LinearFlow("subblahblah", retry=XYZ).add(OneTask().add(Timeout(timeout))))
> 
> 
> Yup, in a way most language compilers do all these types of tricks under the hood (in much, much more complicated manners); as long as we retain 'user intention' (aka don't mess with how the code executes) we should be able to do any tricks we want (in fact most compilers do many, many tricks). To me the same kind of tricks start to become possible after we get the basics right (can't do optimizations, aka -O2, if you don't have the basics in the first place).

I’d be careful about the assumption that we can convert the DSL to a flow. Right now it’s impossible, since we would first need to add more control flow primitives to TaskFlow. But that’s what Kirill described in the prototype description.

Renat Akhmerov
@ Mirantis Inc.