[openstack-dev] [AODH] event-alarm timeout discussion

Zhai, Edwin edwin.zhai at intel.com
Fri Sep 23 06:18:49 UTC 2016


Thanks for your clarification, see my comments below.

On Thu, 22 Sep 2016, gordon chung wrote:

>
>
> On 22/09/2016 2:40 AM, Zhai, Edwin wrote:
>>
>> See
>> https://github.com/openstack/aodh/blob/master/aodh/evaluator/event.py#L158
>>
>> evaluate_events is the handler of the endpoint for 'alarm.all', it
>> iterates the event list and evaluate them one by one with project
>> alarms. If both 'timeout.end' and 'X' are in the event list, I assume
>> they are handled in sequence at different iterations of for loop. Am I
>> right?
>
> not exactly. the code above is actually an endpoint for event listener.
> the event listener itself is threaded so in theory, we have 64 of these
> endpoints/loops. you can override the threads to have just one but
> that's where things slow down a lot. we handle this in ceilometer by
> having many single thread listeners each handling its own queue[1]. i
> still need to publish diagram on how that works.
>
> [1]
> https://github.com/openstack/ceilometer/blob/master/ceilometer/notification.py#L308

The ceilometer code above has many targets (topics) and endpoints. But in AODH
we have just one topic, 'alarm.all', and one endpoint. If it is still
multi-threaded, there is already a potential race condition here, and the
event-alarm timeout makes it worse.

https://github.com/openstack/aodh/blob/master/aodh/event.py#L61-L63
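
To make the race concrete, here is a minimal sketch. It is not AODH code: the
Alarm class, its state names, and the dispatch helper are hypothetical, and the
thread pool only stands in for a multi-threaded listener. It assumes that
'timeout.end' and 'X' for the same alarm may be handled by different threads at
the same time, and uses a per-alarm lock so that only the first one to arrive
changes the state.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Alarm:
    """Hypothetical stand-in for one event alarm (not the AODH model)."""

    def __init__(self):
        self.state = "insufficient data"
        self.transitions = []
        self._lock = threading.Lock()

    def handle(self, event_type):
        # Check-then-update must be atomic; without the lock, two threads
        # could both see the initial state and both update the alarm.
        with self._lock:
            if self.state != "insufficient data":
                return False  # another event already decided this alarm
            # Assumed mapping: a timeout fires the alarm, 'X' clears it.
            self.state = "alarm" if event_type == "timeout.end" else "ok"
            self.transitions.append(event_type)
            return True


def dispatch(alarm, event_types, workers=64):
    """Handle each event on a pool thread, like a threaded listener would."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(alarm.handle, event_types))
```

Whichever of the two events wins, the alarm is updated exactly once. Without
the lock, the outcome would depend on thread scheduling.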

>
> deleted your sequence diagram since it's malformed in my response but
> that is pretty cool.
>
> a few questions:
> - when alarm creation event arrives at evaluator it creates a thread to
> process alarm. this thread will timeout and raise a new event if it
> doesn't receive event in time? i don't understand why we need a
> timeout.end event? can the evaluator not just update_alarm and notify if
> we timeout? or update_alarm and skip notify if we receive event on time?

The event evaluator is triggered only by events; that is, it is not called at
all until the next event arrives. If no event arrives, the evaluator just
sleeps, so it can't check the timeout and call update_alarm. In other words,
'timeout.end' exists only to wake up the evaluator.
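
A rough sketch of the wake-up idea, under my own assumptions: the function
name, the in-process queue, and the event dict layout are hypothetical, and a
real deployment would go through the notification bus. The evaluator blocks on
the queue, and a timer posts 'timeout.end' so that it wakes up even when the
expected event never arrives.

```python
import queue
import threading


def run_evaluator(events, timeout_s, expected_type):
    """Block on the event queue; return 'ok' or 'timed out'."""
    # Hypothetical timer: posts a wake-up event if nothing arrives in time.
    timer = threading.Timer(
        timeout_s, events.put, args=({"event_type": "timeout.end"},)
    )
    timer.start()
    try:
        while True:
            event = events.get()  # the evaluator sleeps here until an event
            if event["event_type"] == expected_type:
                return "ok"
            if event["event_type"] == "timeout.end":
                return "timed out"
            # Any other event type is ignored in this sketch.
    finally:
        timer.cancel()  # no-op if the timer has already fired
```

If 'X' is already queued, the evaluator returns on it and the timer is
cancelled. If nothing is queued, only the 'timeout.end' event unblocks it.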

>
> cheers,
>
> -- 
> gord
>

Best Rgds,
Edwin


