[openstack-dev] [AODH] event-alarm timeout discussion

gordon chung gord at live.ca
Thu Sep 22 22:38:31 UTC 2016



On 22/09/2016 2:40 AM, Zhai, Edwin wrote:
>
> See
> https://github.com/openstack/aodh/blob/master/aodh/evaluator/event.py#L158
>
> evaluate_events is the handler of the endpoint for 'alarm.all'. It
> iterates over the event list and evaluates the events one by one against
> the project alarms. If both 'timeout.end' and 'X' are in the event list, I
> assume they are handled in sequence, in different iterations of the for
> loop. Am I right?

not exactly. the code above is actually an endpoint for the event listener.
the event listener itself is threaded, so in theory we have 64 of these
endpoints/loops running at once. you can override the thread count to have
just one, but that's where things slow down a lot. we handle this in
ceilometer by having many single-threaded listeners, each handling its own
queue[1]. i still need to publish a diagram of how that works.

[1] 
https://github.com/openstack/ceilometer/blob/master/ceilometer/notification.py#L308
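
to make the "many single-threaded listeners, each with its own queue" idea concrete, here is a minimal sketch using only the python standard library. it is not the ceilometer or oslo.messaging code: the class name PartitionedListeners, the publish/close methods and the crc32 routing are all assumptions made up for illustration. the point it shows is that if every event for a given key lands on the same queue, a single thread sees those events in arrival order, while different keys still run in parallel.

```python
import queue
import threading
import zlib


class PartitionedListeners:
    """Hypothetical sketch: N single-threaded workers, one queue each."""

    def __init__(self, num_workers, handler):
        self._handler = handler
        self._queues = [queue.Queue() for _ in range(num_workers)]
        self._threads = [
            threading.Thread(target=self._run, args=(q,), daemon=True)
            for q in self._queues
        ]
        for thread in self._threads:
            thread.start()

    def _run(self, q):
        # Exactly one thread drains each queue, so events on the same
        # queue are handled strictly one after another.
        while True:
            event = q.get()
            if event is None:  # sentinel: stop this worker
                break
            self._handler(event)

    def publish(self, key, event):
        # A stable hash (assumed routing rule) keeps every event for one
        # key on the same queue, which preserves per-key ordering.
        index = zlib.crc32(key.encode("utf-8")) % len(self._queues)
        self._queues[index].put(event)

    def close(self):
        # Queue a sentinel behind any pending events, then wait.
        for q in self._queues:
            q.put(None)
        for thread in self._threads:
            thread.join()
```

with a single 64-thread listener, two events for the same alarm can be picked up by two threads at once; in the sketch above they would be serialized as long as they are published with the same key.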

>
> If we have evaluate_timeout_events as the handler of another endpoint for
> 'alarm.timeout', then the 2 handlers can run concurrently and lead to a race
> condition. I'm not familiar with the underlying oslo notifications, and I
> think a separate queue is a different story. Please correct me if I'm wrong.
>

i deleted your sequence diagram since it came out malformed in my reply,
but it is pretty cool.

a few questions:
- when an alarm creation event arrives at the evaluator, it creates a thread
to process the alarm. will this thread time out and raise a new event if it
doesn't receive an event in time? i don't understand why we need a
timeout.end event. can the evaluator not just update_alarm and notify if
we time out, or update_alarm and skip notify if we receive the event on time?
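
roughly what i have in mind, as a sketch only: the waiting thread decides the outcome itself instead of emitting a timeout.end event. AlarmWaiter, event_received and the "on_time"/"timed_out" labels are hypothetical names for illustration, not aodh code or aodh alarm states, and update_alarm/notify are passed in as plain callables.

```python
import threading


class AlarmWaiter:
    """Hypothetical sketch: wait for one event, or time out, in place."""

    def __init__(self, timeout, update_alarm, notify):
        self._timeout = timeout            # seconds to wait for the event
        self._update_alarm = update_alarm  # callable taking an outcome label
        self._notify = notify              # callable taking an outcome label
        self._arrived = threading.Event()

    def event_received(self):
        # Called by the listener when the expected event shows up.
        self._arrived.set()

    def wait(self):
        # Event.wait returns True if the flag was set before the timeout.
        if self._arrived.wait(self._timeout):
            self._update_alarm("on_time")  # update only, skip notify
            return "on_time"
        self._update_alarm("timed_out")    # update and notify on timeout
        self._notify("timed_out")
        return "timed_out"
```

since the same thread both waits and updates the alarm, there is no second handler racing with it on a separate timeout event.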

cheers,

-- 
gord


