[openstack-dev] [Mistral] How Mistral handling long running delegate tasks

Renat Akhmerov rakhmerov at mirantis.com
Wed Apr 2 07:43:27 UTC 2014


On 02 Apr 2014, at 13:38, Kirill Izotov <enykeev at stackstorm.com> wrote:

> I agree that we probably need a new engine for that kind of change and, as Renat already said in another thread, the lazy model seems to be more basic and it would be easier to build a sync engine on top of it rather than the other way around. Yes, it will entail a lot of changes to the engines that are currently here, but it seems like the only way to get something that would fit us both.
> 
> Since it seems like we are getting some kind of agreement here, we should probably start shifting to discussing the design of the new engine rather than the need for one. The idea has been spread over too many places, so I'll try to gather it back here.
> 
> The lazy engine should be async and atomic; it should not have its own state, instead relying on some kind of global state (DB or in-memory, depending on the type of application). It should have at least two methods: run and task_complete. The run method should calculate the first batch of tasks and schedule them for execution (either put them in a queue or spawn threads... or send a pigeon, I insist =). task_complete should mark a certain task as completed and then schedule the next batch of tasks that became available due to the resolution of this one.
> 
> Then, on top of it, you can build a sync engine by introducing Futures. You use async.run() to schedule the tasks by transforming them into Futures and then start a loop, checking the Futures for completion and sending their results to async.task_complete(), which produces even more Futures to check. Just the same way TaskFlow does it right now.
> 
> On the Mistral side, we use the lazy engine by patching async.run to the API (or the engine queue) and async.task_complete to the worker queue result channel (and to the API for long-running tasks). We still share the same graph_analyzer, but instead of relying on a loop and Futures, we handle execution in a scalable and robust way.
> 
> The reason I'm proposing to extract Futures from the async engine is that they won't work if we have multiple engines that need to handle task results concurrently (and without that there will be no scalability).

I agree with all the major points here. It sounds like a plan.
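
To make the interface concrete, here is a rough Python sketch of what Kirill describes. All the names (LazyEngine, SyncEngine, state_store, the analyzer methods and so on) are illustrative only, not actual Mistral or TaskFlow APIs:

import concurrent.futures


class LazyEngine(object):
    """Stateless async engine: all state lives in external (DB or in-memory) storage."""

    def __init__(self, state_store, analyzer, schedule):
        self.state = state_store    # global persistent state
        self.analyzer = analyzer    # the shared graph_analyzer
        self.schedule = schedule    # callback: put a task on a queue, spawn a thread, ...

    def run(self, execution_id):
        # Calculate the first batch of runnable tasks and schedule them.
        for task in self.analyzer.find_runnable(self.state, execution_id):
            self.schedule(task)

    def task_complete(self, execution_id, task_id, result):
        # Mark the task as completed, then schedule the batch that just became runnable.
        self.state.set_result(execution_id, task_id, result)
        for task in self.analyzer.find_runnable(self.state, execution_id):
            self.schedule(task)


class SyncEngine(object):
    """Blocking engine built on top of LazyEngine by introducing Futures."""

    def __init__(self, state_store, analyzer):
        self.pool = concurrent.futures.ThreadPoolExecutor(max_workers=10)
        self.futures = {}           # future -> task
        self.lazy = LazyEngine(state_store, analyzer, self._submit)

    def _submit(self, task):
        # Turn a scheduled task into a Future running in the thread pool.
        self.futures[self.pool.submit(task.run)] = task

    def run(self, execution_id):
        self.lazy.run(execution_id)
        while self.futures:
            done, _ = concurrent.futures.wait(
                self.futures, return_when=concurrent.futures.FIRST_COMPLETED)
            for fut in done:
                task = self.futures.pop(fut)
                # Completing a task may schedule more Futures via _submit().
                self.lazy.task_complete(execution_id, task.id, fut.result())

In Mistral the schedule callback would instead publish to the engine/worker queues, and task_complete would be driven by the worker result channel (or the API for long-running tasks), which is exactly the wiring described above.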

> What I see bothering Joshua is that a worker handling a long-running task may just die in the process, and there is no way for us to know for sure whether it is still working or not. It is more important for the sync engine because without that it would just hang forever (though it is not clear whether the sync engine needs long-running tasks at all?). In the case of the sync engine, the role of the watch-dog may be performed by the loop, while in the case of Mistral it might be a separate process (though I bet there should be something for resending the message in RabbitMQ itself). A common solution is not necessary here.


As for not losing messages (resending them to the queue), that’s why I mentioned message acknowledgement ([0]). Rabbit re-queues a message automatically if the message pulled out of the queue requires acknowledgement and the corresponding Rabbit client connection dies before the client acknowledges it (acknowledgement being the point at which there’s a guarantee it has been processed). It works this way now in Mistral. A slight problem with the current Mistral implementation is that the engine works as follows when starting a workflow:

1. Start a DB transaction, do all the smart logic and DB operations (updating the persistent state of tasks and executions), commit the DB transaction.
2. Submit task messages to MQ.
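
In code, that sequence is roughly the following (a hypothetical sketch assuming SQLAlchemy and kombu; plan_first_batch, the session/producer objects and the routing key are made up for illustration):

def start_workflow(session, producer, execution):
    # Step 1: all the smart logic and DB writes happen inside one transaction.
    with session.begin():
        tasks = plan_first_batch(execution)      # hypothetical helper
        for task in tasks:
            task.state = 'IDLE'
            session.add(task)

    # <-- if the process crashes right here, the tasks are committed to the DB
    #     but their messages never reach the MQ.

    # Step 2: publish task messages outside the DB transaction.
    for task in tasks:
        producer.publish({'task_id': task.id}, routing_key='mistral.tasks')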

So there’s a window between steps 1 and 2 where, if the system crashes, the DB state won’t be consistent with the state of the MQ. From the worker perspective this problem doesn’t seem to exist (at least we thought it through and didn’t find any synchronisation windows). Initially step 2 was part of step 1 (part of the DB TX), which would solve the problem, but it’s generally a bad pattern to have interprocess communication inside a DB TX, so we got rid of it.
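
For completeness, here is roughly why the worker side has no such window: it acknowledges a message only after processing it. A sketch in the spirit of the pika tutorial in [0]; run_action and report_result_to_engine are hypothetical stand-ins for the real worker logic:

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='mistral.tasks', durable=True)


def on_message(ch, method, properties, body):
    result = run_action(json.loads(body))        # hypothetical worker logic
    report_result_to_engine(result)              # hypothetical result channel
    # The message is acknowledged only after processing; if the worker dies
    # before this point, Rabbit re-queues the message for another worker.
    ch.basic_ack(delivery_tag=method.delivery_tag)


channel.basic_consume(queue='mistral.tasks', on_message_callback=on_message)
channel.start_consuming()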

So this seems to me the only problem with the current architecture; we left it as it was and were going to get back to it after the POC. I assume this might be the reason why Josh keeps talking about the ‘jobboard’ thing, but I just don’t have enough details at this point to say whether that’s what we need or not.

One solution I see here is to consider this situation exceptional from the workflow execution model perspective and apply policies to tasks that are persisted in the DB but not submitted to MQ (what you guys discussed about retries and all that stuff). So our watch-dog process could just resubmit tasks that have been sitting in the DB in the IDLE state longer than a configured period. Something like that.
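
Such a watch-dog could boil down to a periodic query along these lines (a sketch with a hypothetical Task model and an SQLAlchemy-style session; the threshold would come from configuration):

import datetime

RESUBMIT_AFTER = datetime.timedelta(minutes=5)   # configured period


def resubmit_stuck_tasks(session, producer):
    cutoff = datetime.datetime.utcnow() - RESUBMIT_AFTER
    stuck = (session.query(Task)
             .filter(Task.state == 'IDLE', Task.updated_at < cutoff)
             .all())
    for task in stuck:
        # The task is persisted in the DB but apparently never made it to MQ
        # (or the message was lost), so just submit it again.
        producer.publish({'task_id': task.id}, routing_key='mistral.tasks')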

[0] http://www.rabbitmq.com/tutorials/tutorial-two-python.html

Renat Akhmerov
@ Mirantis Inc.
