[openstack-dev] [Heat] Using Job Queues for timeout ops

Joshua Harlow harlowja at outlook.com
Thu Nov 13 08:45:07 UTC 2014


A question:

How is using something like celery in heat vs. taskflow in heat (or at least the concept in [1]) 'too many code changes'?

Both seem like changes of a similar level ;-)

What was your metric for determining how much code change either would require (out of curiosity)?

Perhaps you should look at [2], although I'm unclear on what the desired functionality is here.

Do you want a single engine to transfer its work to another engine when it 'goes down'? If so, then the jobboard model + zookeeper inherently does this.
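
To make the jobboard idea concrete, here is a rough, stdlib-only toy of the claim/release behaviour I mean. It is not the taskflow API: ToyJobBoard and all of its method names are made up for illustration, and the engine_died() call only stands in for what (as far as I understand it) a zookeeper session expiry would do to a lock held by that engine.

```python
# Toy, in-memory model of the jobboard idea. This is NOT the taskflow API;
# ToyJobBoard and every method name here are hypothetical.


class ToyJobBoard:
    def __init__(self):
        self._jobs = {}      # job name -> details
        self._owners = {}    # job name -> engine currently holding the claim

    def post(self, name, details):
        self._jobs[name] = details

    def claim(self, name, engine):
        # Only one engine may own a job at a time (assumption: with
        # zookeeper this would be a lock tied to the engine's session).
        if name not in self._jobs or name in self._owners:
            return False
        self._owners[name] = engine
        return True

    def consume(self, name, engine):
        # The owner finished the work, so the job leaves the board.
        if self._owners.get(name) != engine:
            raise RuntimeError("%s does not own %s" % (engine, name))
        del self._owners[name]
        del self._jobs[name]

    def engine_died(self, engine):
        # Stand-in for a session expiring: every claim held by the dead
        # engine vanishes and its jobs become claimable again.
        for name in [n for n, o in self._owners.items() if o == engine]:
            del self._owners[name]

    def unclaimed(self):
        return sorted(n for n in self._jobs if n not in self._owners)


board = ToyJobBoard()
board.post("stack-1-create", {"timeout": 3600})
assert board.claim("stack-1-create", "engine-a")
assert not board.claim("stack-1-create", "engine-b")  # already owned
board.engine_died("engine-a")                          # claim released
assert board.claim("stack-1-create", "engine-b")      # work transferred
board.consume("stack-1-create", "engine-b")
```

The point of the toy is that nobody has to poll for the failure and re-dispatch by hand: the unfinished job simply shows up as unclaimed again and any surviving engine can pick it up.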

Or maybe you want something else? I'm probably confused because you seem to be asking for resource timeouts + recovery from engine failure. The latter seems like a liveness issue rather than a resource timeout one, so those two things seem separable.
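
A small sketch of why I think the two are separable, under some loud assumptions: the dicts below stand in for a shared database, HEARTBEAT_TTL is an invented value, and every function name is hypothetical (none of this is existing heat code). The deadline is stored as an absolute time, so 'has it timed out?' can be answered by any engine, while 'is the owner alive?' is a separate heartbeat check.

```python
# Toy sketch, stdlib only. The dicts stand in for a shared database; all
# names are hypothetical and none of this is existing heat code.

HEARTBEAT_TTL = 10.0  # assumed: seconds of silence before an engine is dead


def start_operation(store, stack_id, engine, now, timeout):
    # Persist the absolute deadline, not a timer living in one process.
    store[stack_id] = {"owner": engine, "deadline": now + timeout}


def timed_out(store, stack_id, now):
    # Timeout question: answerable by any engine from shared state.
    return now >= store[stack_id]["deadline"]


def is_alive(heartbeats, engine, now):
    # Liveness question: has the engine checked in recently?
    return now - heartbeats.get(engine, float("-inf")) <= HEARTBEAT_TTL


def adopt_if_orphaned(store, heartbeats, stack_id, engine, now):
    # Take over ownership only; the deadline is left untouched.
    op = store[stack_id]
    if is_alive(heartbeats, op["owner"], now):
        return False
    op["owner"] = engine
    return True


store, heartbeats = {}, {"engine-a": 0.0, "engine-b": 0.0}
start_operation(store, "stack-1", "engine-a", now=0.0, timeout=60.0)
heartbeats["engine-b"] = 30.0            # engine-a has gone silent
assert adopt_if_orphaned(store, heartbeats, "stack-1", "engine-b", now=30.0)
assert not timed_out(store, "stack-1", now=30.0)   # original deadline kept
assert timed_out(store, "stack-1", now=60.0)
```

Losing an engine then costs you the owner, not the timeout, which is why I'd treat recovery as its own problem.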

[1] http://docs.openstack.org/developer/taskflow/jobs.html

[2] http://docs.openstack.org/developer/taskflow/examples.html#jobboard-producer-consumer-simple

On Nov 13, 2014, at 12:29 AM, Murugan, Visnusaran <visnusaran.murugan at hp.com> wrote:

> Hi all,
>  
> Convergence-POC distributes stack operations by sending resource actions over RPC for any heat-engine to execute. The entire stack lifecycle will be controlled by worker/observer notifications. This distributed model has its own advantages and disadvantages.
>  
> Any stack operation has a timeout, and a single engine is responsible for it. If that engine goes down, the timeout is lost along with it. The traditional fix is for the other engines to recreate the timeout from scratch. Also, a missed resource action notification will be detected only when the stack operation timeout fires.
>  
> To overcome this, we will need the following capabilities:
> 1. Resource timeout (can be used for retry)
> 2. Recovery from engine failure (loss of stack timeout, resource action notification)
>  
>  
> Suggestions:
> 1. Use a task queue like celery to host timeouts for both stack and resource.
> 2. Poll the database for engine failures and restart timers / retrigger resource retry (IMHO: this would be the traditional approach, and a heavyweight one)
> 3. Migrate heat to use TaskFlow. (Too many code changes)
>  
> I am not suggesting we use TaskFlow. Using celery would require minimal code changes (decorating the appropriate functions).
>  
>  
> Your thoughts?
>  
> -Vishnu
> IRC: ckmvishnu
