[openstack-dev] [ceilometer] The periodic task on openstack
Eoghan Glynn
eglynn at redhat.com
Fri Nov 16 14:32:29 UTC 2012
> >>> As a follow-up to your discussion on IRC last night, I had a
> >>> quick look at the loopingcall implementation, and it seems there
> >>> is really no timing guarantee. Even if we adjust the interval for
> >>> greenthread.sleep() dynamically, we would also have to ensure the
> >>> metering periodic task always runs at the head of the task list.
> >>>
> >>> I think we could enhance the periodic task machinery to partially
> >>> meet our requirement, e.g. create a separate high-priority task
> >>> type that is guaranteed not to suffer long delays.
> >>>
> >>> The only remaining concern is whether we can make sure the
> >>> LoopingCall itself is rescheduled on time after the
> >>> greenthread.sleep(interval), given the nature of greenthreads
> >>> (or even of Python threads).
> >>>
> >>> openstack/common/loopingcall.py -> LoopingCall(object):
> >>>
> >>>     while self._running:
> >>>         self.f(*self.args, **self.kw)
> >>>         if not self._running:
> >>>             break
> >>>         greenthread.sleep(interval)
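> >>>
> >>> (For illustration, a standalone sketch of the same loop shape,
> >>> with plain time.sleep standing in for greenthread.sleep and a
> >>> hypothetical task callable; the printed period is always the
> >>> task duration plus the interval, never just the interval:)
> >>>
> >>>     import time
> >>>
> >>>     def drifting_loop(task, interval):
> >>>         # Same shape as LoopingCall._inner: sleep a fixed
> >>>         # interval after each run, so the effective period is
> >>>         # duration(task) + interval rather than interval.
> >>>         while True:
> >>>             start = time.time()
> >>>             task()
> >>>             time.sleep(interval)
> >>>             print('period: %.2fs' % (time.time() - start))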
> >
> > Thanks for following this up!
> >
> > So can I confirm that I've understood correctly, and that the
> > basic issues here are:
> >
> > (a) The time spent executing tasks is not accounted for when
> > determining how much time to sleep between task runs. So
> > for example if periodic_interval is set to N seconds, the
> > actual time between tasks is of the order of:
> >
> >         N + \Sigma_i duration(task_i) / (1 + ticks_i)
> >
> > The more tasks with ticks=0, and the longer the task
> > duration, the more we skew away from tasks executing on
> > wall-clock boundaries.
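> >
> >       (For example, with N = 60 and two tasks with ticks = 0,
> >       each taking 15s per run, the effective period stretches
> >       to roughly 60 + 15 + 15 = 90s.)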
> >
> > (b) There is no guarantee (beyond convention) that a task won't
> > take longer than periodic_interval/|tasks| to execute.
> >
> > (c) There is an indeterminate lag after the expiry of the sleep
> > interval before the LoopingCall thread is re-scheduled.
> >
> > So could we at least address issue (a) by simply subtracting
> > the duration of the last run of the tasks from the next sleep
> > interval?
> >
> > e.g. change LoopingCall.start()._inner() as follows:
> >
> >     while self._running:
> > +       start = datetime.datetime.now()
> >         self.f(*self.args, **self.kw)
> > +       end = datetime.datetime.now()
> > +       delta = end - start
> > +       elapsed = delta.seconds + delta.microseconds / 1e6
> > +       delay = interval - elapsed
> >         if not self._running:
> >             break
> > -       greenthread.sleep(interval)
> > +       greenthread.sleep(delay if delay > 0 else 0)
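> >
> > As a standalone sketch (plain Python, with time.sleep in place
> > of greenthread.sleep and a hypothetical task callable):
> >
> >     import time
> >
> >     def fixed_interval_loop(task, interval):
> >         # Subtract the task's own duration from the sleep so
> >         # the period tracks the wall-clock interval.
> >         while True:
> >             start = time.time()
> >             task()
> >             elapsed = time.time() - start
> >             delay = interval - elapsed
> >             # If the task overran the interval, delay clamps to
> >             # zero and we hit issue (b): back-to-back runs.
> >             time.sleep(delay if delay > 0 else 0)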
> >
> > I guess that's what you meant by adjusting the interval
> > dynamically?
>
> > But I'm not sure that we can always address (b) or (c) even with a
> > special thread for high-priority/time-sensitive tasks.
>
> I've noticed that too. The default interval is 60 seconds, yet on a
> system in a lab environment I saw the tasks taking 15-20 seconds.
> On more heavily loaded systems with a lot of instances, it seems
> likely that (b) could occur.
>
> We could do what you're suggesting, but also parallelize the tasks
> using a threadpool (of real threads), and only kick off a task if
> its previous scheduled run has finished, along the lines of the
> sketch below. Does that seem reasonable?
Yeah, that could certainly improve matters (similarly with jd_'s
earlier suggestion).
However, here's a concern (that may derive from my ignorance of
green versus real threads) ... if we use a pool of real threads,
would this be problematic when a task touches codepaths previously
monkey-patched to do the cooperative yielding required of potentially
blocking code in order to be eventlet-friendly?
It's almost as if we'd want the standard libs to be un-monkey-patched
when executing on real threads, so as to avoid needlessly yielding and
hence chewing up more elapsed time to task completion (if that makes
any sense ...).
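For what it's worth, eventlet's tpool module already offloads a
blocking call to a pool of real OS threads while the calling
greenthread yields until the result is ready; a minimal sketch, with
a hypothetical slow_query standing in for the blocking task body:

    from eventlet import tpool

    def slow_query(n):
        # Hypothetical blocking body (e.g. a synchronous DB call);
        # tpool runs it in a native thread so it doesn't block the
        # eventlet hub.
        return n * n

    result = tpool.execute(slow_query, 7)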
Cheers,
Eoghan