[openstack-dev] [AODH] event-alarm timeout discussion
Zhai, Edwin
edwin.zhai at intel.com
Fri Sep 23 06:18:49 UTC 2016
Thanks for your clarification. See my comments below.
On Thu, 22 Sep 2016, gordon chung wrote:
>
>
> On 22/09/2016 2:40 AM, Zhai, Edwin wrote:
>>
>> See
>> https://github.com/openstack/aodh/blob/master/aodh/evaluator/event.py#L158
>>
>> evaluate_events is the handler of the endpoint for 'alarm.all', it
>> iterates the event list and evaluate them one by one with project
>> alarms. If both 'timeout.end' and 'X' are in the event list, I assume
>> they are handled in sequence at different iterations of for loop. Am I
>> right?
>
> not exactly. the code above is actually an endpoint for event listener.
> the event listener itself is threaded so in theory, we have 64 of these
> endpoints/loops. you can override the threads to have just one but
> that's where things slow down a lot. we handle this in ceilometer by
> having many single thread listeners each handling it's own queue[1]. i
> still need to publish diagram on how that works.
>
> [1]
> https://github.com/openstack/ceilometer/blob/master/ceilometer/notification.py#L308
There are many targets (topics) and endpoints in the ceilometer code above. But in AODH
we have just one topic, 'alarm.all', and one endpoint. If it is still
multi-threaded, there is already a potential race condition here, and the event-alarm
timeout makes it worse.
https://github.com/openstack/aodh/blob/master/aodh/event.py#L61-L63
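
To make the race concrete, here is a minimal sketch (not AODH code) of several
listener threads evaluating events against the same alarm. The Alarm class,
evaluate_event() and run() are hypothetical names, and mapping 'timeout.end'
to the 'alarm' state and 'X' to 'ok' is only an assumption for illustration.
The per-alarm lock makes the check-then-update atomic, so only the first event
to arrive decides the state, whichever thread handles it:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Alarm:
    """Hypothetical stand-in for an event alarm; not the AODH model."""

    def __init__(self, alarm_id):
        self.alarm_id = alarm_id
        self.state = "insufficient data"
        self.lock = threading.Lock()
        self.transitions = []  # event types that actually changed the state


def evaluate_event(alarm, event_type):
    """Apply one event to the alarm; return True if it changed the state."""
    # Without the lock, two threads could both see "insufficient data"
    # and both transition the alarm (once for 'X', once for 'timeout.end').
    with alarm.lock:
        if alarm.state != "insufficient data":
            return False  # another thread already decided this alarm
        # Assumed mapping, for illustration only.
        alarm.state = "alarm" if event_type == "timeout.end" else "ok"
        alarm.transitions.append(event_type)
        return True


def run(events, workers=64):
    """Evaluate all events against one alarm using a pool of worker threads."""
    alarm = Alarm("a1")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda e: evaluate_event(alarm, e), events))
    return alarm, results
```

Which of 'X' or 'timeout.end' wins still depends on thread scheduling; the lock
only guarantees that exactly one of them does.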
>
> deleted your sequence diagram since it's malformed in my response but
> that is pretty cool.
>
> a few questions:
> - when alarm creation event arrives at evaluator it creates a thread to
> process alarm. this thread will timeout and raise a new event if it
> doesn't receive event in time? i don't understand why we need a
> timeout.end event? can the evaluator not just update_alarm and notify if
> we timeout? or update_alarm and skip notify if we receive event on time?
The event evaluator is triggered only by events; that is, it is not called at all
until the next event arrives. If no event arrives, the evaluator just sleeps, so it cannot
check the timeout and call update_alarm. In other words, 'timeout.end' exists only to wake
up the evaluator.
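
A rough sketch of that wake-up idea, assuming an in-process queue stands in for
the notification bus. start_timeout() and evaluate_next() are hypothetical
helpers, not AODH functions, and the event dict layout is made up for the
example. The evaluator blocks until some event arrives; the timer guarantees
that at least 'timeout.end' eventually does:

```python
import queue
import threading


def start_timeout(events, alarm_id, seconds):
    """Schedule a 'timeout.end' event for alarm_id after the given delay."""
    event = {"event_type": "timeout.end", "alarm_id": alarm_id}
    timer = threading.Timer(seconds, events.put, args=(event,))
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer


def evaluate_next(events):
    """Block until the next event arrives, then report what woke us up."""
    event = events.get()  # the evaluator sleeps here if nothing arrives
    if event["event_type"] == "timeout.end":
        return "timed out"
    return "received " + event["event_type"]
```

If the expected event shows up first, the caller can cancel() the returned
timer so that no stale 'timeout.end' is delivered later.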
>
> cheers,
>
> --
> gord
>
Best Rgds,
Edwin