[openstack-dev] [Heat] Using Job Queues for timeout ops
visnusaran.murugan at hp.com
Thu Nov 13 12:58:44 UTC 2014
A parallel worker was what I initially thought of. But what do we do if the engine hosting that worker goes down?
From: Angus Salkeld [mailto:asalkeld at mirantis.com]
Sent: Thursday, November 13, 2014 5:22 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops
On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran <visnusaran.murugan at hp.com> wrote:
Convergence-POC distributes stack operations by sending resource actions over RPC for any heat-engine to execute. The entire stack lifecycle will be controlled by worker/observer notifications. This distributed model has its own advantages and disadvantages.
Every stack operation has a timeout, and a single engine is responsible for it. If that engine goes down, the timeout is lost along with it. The traditional fix is for other engines to recreate the timeout from scratch. Also, a missed resource action notification will be detected only when the stack operation times out.
To overcome this, we will need the following capabilities:
1. Resource timeout (can be used for retry)
We will shortly have a worker job. Can't we have a job that just sleeps and is started in parallel with the job that is doing the work?
When it gets to the end of the sleep, it runs a check.
2. Recover from engine failure (loss of stack timeout, resource action notification)
My suggestion above could catch failures as long as it was run in a different process.
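The sleeping-job idea above can be sketched roughly as follows. This is only an illustration using threads from the Python standard library, not Heat code: the names `do_work`, `watchdog`, `run`, and the status strings are hypothetical, and the dictionary stands in for state that Heat would keep in its database. In a real deployment the watchdog would have to run in a different process (or on a different engine) from the worker for it to catch failures, as noted above.

```python
import threading
import time


def do_work(state, lock, duration):
    """Pretend to perform the resource action, then record completion."""
    time.sleep(duration)
    with lock:
        # Only mark completion if the watchdog has not already timed us out.
        if state["status"] == "IN_PROGRESS":
            state["status"] = "COMPLETE"


def watchdog(state, lock, timeout):
    """Sleep for the timeout, then check whether the work finished."""
    time.sleep(timeout)
    with lock:
        if state["status"] == "IN_PROGRESS":
            # Hypothetical hook: a retry could be triggered at this point.
            state["status"] = "TIMED_OUT"


def run(work_duration, timeout):
    """Start the worker and the sleeping watchdog in parallel."""
    state = {"status": "IN_PROGRESS"}  # stand-in for a database row
    lock = threading.Lock()
    jobs = [
        threading.Thread(target=do_work, args=(state, lock, work_duration)),
        threading.Thread(target=watchdog, args=(state, lock, timeout)),
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    return state["status"]
```

Calling `run(0.05, 0.3)` models work that finishes before the watchdog wakes up, while `run(0.5, 0.05)` models work that overruns its timeout.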
1. Use a task queue like celery to host timeouts for both stack and resource.
2. Poll the database for engine failures and restart timers / retrigger resource retries (IMHO: this would be the traditional approach, and it is heavyweight).
3. Migrate heat to use TaskFlow. (Too many code changes)
I am not suggesting we use TaskFlow. Using celery would require only minimal code changes (decorating the appropriate functions).
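For comparison, the database-polling option could look something like the sketch below. It is a minimal illustration under stated assumptions, not how Heat works: timers are assumed to be stored as rows with an absolute `deadline` and an owning `engine_id`, engines are assumed to record heartbeat timestamps, and `HEARTBEAT_GRACE`, `find_orphaned_timers`, and `adopt_timers` are hypothetical names. Plain lists and dictionaries stand in for database tables.

```python
# Assumed: seconds without a heartbeat before an engine is considered dead.
HEARTBEAT_GRACE = 30.0


def find_orphaned_timers(timers, heartbeats, now):
    """Return timers whose owning engine has stopped sending heartbeats."""
    return [
        t for t in timers
        if now - heartbeats.get(t["engine_id"], 0.0) > HEARTBEAT_GRACE
    ]


def adopt_timers(timers, heartbeats, my_engine_id, now):
    """Take over orphaned timers; return stacks already past their deadline."""
    expired = []
    for timer in find_orphaned_timers(timers, heartbeats, now):
        # Reassign ownership so only one live engine tracks this timer.
        timer["engine_id"] = my_engine_id
        # The deadline is absolute, so it does not need to be recreated.
        if now >= timer["deadline"]:
            expired.append(timer["stack_id"])
    return expired
```

Because the stored deadline is absolute rather than relative, an engine that adopts a timer can tell immediately whether the stack operation has already timed out. The cost is the periodic scan itself, which is the "heavy" part of this option.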
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org