[openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

AFEK, Ifat (Ifat) ifat.afek at alcatel-lucent.com
Mon Dec 7 07:44:16 UTC 2015


Hi Ryota,

> -----Original Message-----
> From: Ryota Mibu [mailto:r-mibu at cq.jp.nec.com]
> Sent: Friday, December 04, 2015 9:42 AM
> 
> > The next step can happen if and when Aodh supports alarm templates.
> > If Vitrage can handle about 30 alarm types, and there are 100
> > instances, we don't want to pre-configure 3000 alarms, which most
> > likely will never be triggered.
> 
> I understand your concern. Aodh is user facing service, so having lots
> of alarms doesn't make sense.
> 
> Can we clarify use case again in terms of service role definition?

Our use cases focus on giving value to the cloud admin, who will be 
able to:

- view the topology of his environment, the relations between the
physical, virtual and application layers, and the statuses of all resources
- view the alarms history
- view alarms about problems that Vitrage deduced could happen, even
if no other OpenStack component reported these problems (yet)
- view RCA information about the alarms

> 
> Aodh provides alarming mechanism to *notify* events and situations
> calculated from various data sources. But, original/master information
> of resource including latest resource state is owned by other services
> such as nova.
> 
> So, user who wants to know current resource state to find out dead
> resources (instances), can simply query instances via nova api. And,
> user who wants to know when/what failure occurred can query events via
> ceilometer api. Aodh has alarm state and history though.

I'm not sure I fully understand the difference between Aodh events and 
alarms. If the user wants to know what failure occurred, is it part of 
Aodh events, alarms, or both?

> > > OK. The 'combination' type alarm enables you to aggregate multiple
> > > alarm to one alarm. This can be used when you want to receive alarm
> > > when the both of physical NIC ports are downed to recognize logical
> > > connection unavailability if the ports are teamed for redundancy.
> > > Now, the combination alarms are evaluated periodically that means
> > > you can receive combination alarm not on-the-fly while you are using
> > > event alarms as source of combination alarm though.
> >
> > I think I understand your point. It means that certain alarms will
> > arrive to Vitrage in delay, due to your evaluation policy. I think we
> > will have to address this issue at some point, but it won't change our
> > overall design.
> 
> Yes, I'm just curious if there is any user can get benefit from this
> improvement to set priority.

I don't see a need for that improvement in our current use cases. I'm
less sure about future use cases, so I will keep this limitation in mind.
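
To make sure we mean the same thing by the delay, here is a minimal
sketch of the teamed-NIC case as I understand it. It is plain Python
and does not use the Aodh API; the names PortAlarm, combine_and and
PeriodicCombination are hypothetical, and I am assuming an "and" style
combination that is only re-evaluated when a periodic evaluator runs.

```python
from dataclasses import dataclass

ALARM = "alarm"
OK = "ok"


@dataclass
class PortAlarm:
    """Hypothetical per-port event alarm (not an Aodh class)."""
    name: str
    state: str = OK


def combine_and(alarms):
    """Return ALARM only if every source alarm is in the ALARM state."""
    return ALARM if alarms and all(a.state == ALARM for a in alarms) else OK


class PeriodicCombination:
    """Combination alarm whose state changes only when evaluate() runs."""

    def __init__(self, sources):
        self.sources = sources
        self.state = OK

    def evaluate(self):
        # Stands in for one tick of a periodic evaluator.
        self.state = combine_and(self.sources)
        return self.state


eth0 = PortAlarm("eth0-down")
eth1 = PortAlarm("eth1-down")
bond = PeriodicCombination([eth0, eth1])

eth0.state = ALARM
one_port_down = bond.evaluate()   # one port is still up

eth1.state = ALARM                # second port fails between evaluations
stale = bond.state                # not yet re-evaluated
fresh = bond.evaluate()           # next periodic tick picks it up
```

Between the second port failure and the next evaluation, a consumer
such as Vitrage would still see the old combination state, which is
the delay discussed above.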

Best Regards,
Ifat.
