[openstack-dev] [Mistral][TaskFlow] Long running actions

Stan Lagun slagun at mirantis.com
Fri Mar 21 09:23:19 UTC 2014


Don't forget HA issues. Mistral can be restarted at any moment and needs to
be able to resume on another instance from the point where it was interrupted.
In theory TaskFlow could address this, but I'm not sure it can be done
without a complete redesign of it.
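
To make the requirement concrete, here is a minimal sketch of the idea, not
TaskFlow or Mistral code. It assumes the waiting state lives in a shared
database (SQLite stands in for it here) so that no engine instance has to keep
the task in memory. The class name, table name, and state names are all
hypothetical.

```python
import json
import sqlite3
import time


class WaitingTaskStore:
    """Hypothetical helper: persists tasks waiting on an external event."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS waiting_tasks ("
            " task_id TEXT PRIMARY KEY,"
            " state TEXT NOT NULL,"
            " deadline REAL NOT NULL,"
            " result TEXT)"
        )
        self.conn.commit()

    def park(self, task_id, timeout_s, now=None):
        """Record that a task is waiting; nothing stays in engine memory."""
        now = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO waiting_tasks"
            " VALUES (?, 'WAITING', ?, NULL)",
            (task_id, now + timeout_s),
        )
        self.conn.commit()

    def deliver(self, task_id, result):
        """Store an external result; True only if the task was still waiting."""
        cur = self.conn.execute(
            "UPDATE waiting_tasks SET state = 'DONE', result = ?"
            " WHERE task_id = ? AND state = 'WAITING'",
            (json.dumps(result), task_id),
        )
        self.conn.commit()
        return cur.rowcount == 1

    def expire(self, now=None):
        """Mark overdue waiting tasks as timed out; return how many expired."""
        now = time.time() if now is None else now
        cur = self.conn.execute(
            "UPDATE waiting_tasks SET state = 'TIMED_OUT'"
            " WHERE state = 'WAITING' AND deadline <= ?",
            (now,),
        )
        self.conn.commit()
        return cur.rowcount

    def get(self, task_id):
        """Return (state, result) for a task, or None if it is unknown."""
        row = self.conn.execute(
            "SELECT state, result FROM waiting_tasks WHERE task_id = ?",
            (task_id,),
        ).fetchone()
        if row is None:
            return None
        state, result = row
        return state, (json.loads(result) if result is not None else None)
```

One instance calls park("take_human_input", timeout_s=3600) and may then die.
Any other instance pointed at the same database can later call deliver() when
the approval arrives, or expire() from a periodic job. A real deployment would
still need a way for the engine to pick the flow back up once the state
changes, which is the part I'm unsure TaskFlow supports today.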


On Fri, Mar 21, 2014 at 8:33 AM, W Chan <m4d.coder at gmail.com> wrote:

> Can the long-running task be handled by putting the target task in the
> workflow in a persisted state until either an event triggers it or a timeout
> occurs?  An event (human approval or a trigger from an external system) sent
> to the transport will resume the task.  The timeout is configurable by
> the end user up to a certain time limit set by the Mistral admin.
>
> Based on the TaskFlow examples, it seems like the engine instance managing
> the workflow will be in memory until the flow is completed.  Unless there
> are other options for scheduling tasks in TaskFlow, having too many of these
> workflows with long-running tasks seems likely to become a memory issue
> for Mistral...
>
>
> On Thu, Mar 20, 2014 at 3:07 PM, Dmitri Zimine <dz at stackstorm.com> wrote:
>
>>
>> For the 'asynchronous manner' discussion see http://tinyurl.com/n3v9lt8;
>> I'm still not sure why you would want to make is_sync/is_async a primitive
>> concept in a workflow system. Shouldn't this be only up to the entity
>> running the workflow to decide? Why is a task allowed to be sync/async?
>> That has major side-effects for state-persistence, resumption (and to me is
>> an incorrect abstraction to provide) and general workflow execution control.
>> I'd be very careful with this (which is why I am hesitant to add it without
>> much, much more discussion).
>>
>>
>> Let's remove the confusion caused by "async". All tasks [may] run async
>> from the engine standpoint, agreed.
>>
>> "Long running tasks" - that's it.
>>
>> Examples: wait_5_days, run_hadoop_job, take_human_input.
>> The Task doesn't do the job: it delegates to an external system. The flow
>> execution needs to wait (5 days passed, Hadoop job finished with data x,
>> user inputs y), and then continue with the received results.
>>
>> The requirement is to survive a restart of any WF component without
>> losing the state of the long-running operation.
>>
>> Does TaskFlow already have a way to do it? Or ongoing ideas,
>> considerations? If yes let's review. Else let's brainstorm together.
>>
>> I agree,
>>
>> that has major side-effects for state-persistence, resumption (and to me
>> is an incorrect abstraction to provide) and general workflow execution
>> control, I'd be very careful with this
>>
>> But these requirements come from customers' use cases: wait_5_days -
>> lifecycle management workflow, long-running external system - Murano
>> requirements, user input - workflow for operations automation with control
>> gate checks, provisioning that requires 'approval' steps, etc.
>>
>> DZ>
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>


-- 
Sincerely yours
Stanislav (Stan) Lagun
Senior Developer
Mirantis
35b/3, Vorontsovskaya St.
Moscow, Russia
Skype: stanlagun
www.mirantis.com
slagun at mirantis.com