[openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

Afek, Ifat (Nokia - IL) ifat.afek at nokia.com
Sun Jan 15 13:06:21 UTC 2017


From: Yujun Zhang <zhangyujun+zte at gmail.com>
Date: Thursday, 12 January 2017 at 17:37

On Thu, Jan 12, 2017 at 5:12 PM Afek, Ifat (Nokia - IL) <ifat.afek at nokia.com> wrote:

'Deduced' vs. 'monitored' would be good enough for most cases. Unless we identify a real use case, I also think there is no need to bring in a quantitative indicator such as a counter or a probability.

[Ifat] I agree.

Personally, I don’t think this is needed. I think that if Nagios reports an error, we can be confident enough in it without confirmation from another monitor.

You are right. We would consider a reported alarm a reliable indicator of a fault. What I was thinking about is: when the alarm is not seen, can we be sure there is no fault?

Another situation is a slow upstream alarm combined with a fast downstream alarm. I don't have an actual example at the moment, so please allow me to imagine an extreme condition.

Suppose a host fault causes instance faults. Due to some restriction, the host can be scanned for faults only once an hour, while instances can be scanned every second. Now we get alarms from 10 instances on the same host. Can we deduce that the host is likely in a fault state? If so, we could raise a "deduced" alarm on the host and trigger an immediate scan, which may result in a "monitored" alarm. In this way, we reduce the time needed to detect the root cause, i.e. the host fault.

[Ifat] I understand the use case.
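
To make the use case above concrete, here is a minimal Python sketch of the idea. It is not Vitrage code and does not use any Vitrage API: the names Alarm, deduce_host_alarms, confirm, and the threshold of 10 are all hypothetical, chosen only to mirror the example in the discussion.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical threshold: number of alarmed instances on one host
# that we treat as a hint of a host fault (10 in the example above).
INSTANCE_ALARM_THRESHOLD = 10


@dataclass(frozen=True)
class Alarm:
    resource: str  # e.g. "instance-3" or "host-1"
    kind: str      # "monitored" (reported by a datasource) or "deduced"


def deduce_host_alarms(instance_alarms, instance_to_host,
                       threshold=INSTANCE_ALARM_THRESHOLD):
    """Return a 'deduced' alarm for each host with >= threshold alarmed instances."""
    alarmed = defaultdict(set)
    for alarm in instance_alarms:
        host = instance_to_host.get(alarm.resource)
        if host is not None:
            alarmed[host].add(alarm.resource)
    return [Alarm(host, "deduced")
            for host, instances in sorted(alarmed.items())
            if len(instances) >= threshold]


def confirm(deduced, scan_host):
    """Trigger an immediate scan; return a 'monitored' alarm if it confirms the fault."""
    if scan_host(deduced.resource):
        return Alarm(deduced.resource, "monitored")
    return None


# Ten alarmed instances on host-1, a single alarmed instance on host-2.
instance_to_host = {f"instance-{i}": "host-1" for i in range(10)}
instance_to_host["instance-99"] = "host-2"
instance_alarms = [Alarm(name, "monitored") for name in instance_to_host]

deduced = deduce_host_alarms(instance_alarms, instance_to_host)
# scan_host is a stand-in for the slow hourly host check, run on demand.
confirmed = [confirm(a, scan_host=lambda host: True) for a in deduced]
```

In this sketch only host-1 reaches the threshold, so only host-1 gets a "deduced" alarm and an early scan. A fixed count is just one possible rule; the thread does not settle on any specific criterion.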


An alternative solution is to distinguish faults from alarms. An alarm is actually a reflection of fault status. Besides the directly linked alarm, fault status can also be deduced from downstream alarms. I haven't thought this model through yet; it just crossed my mind. Any comments are welcome.

[Ifat] Isn’t ‘fault vs. alarm’ just a different terminology for ‘deduced vs. monitored’?

