[openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
AFEK, Ifat (Ifat)
ifat.afek at alcatel-lucent.com
Tue Dec 1 13:12:27 UTC 2015
Hi,
After some further discussions with the Vitrage team, let me take one step back and ask a more basic question:
In Vitrage, we would like to evaluate and correlate different kinds of alarms: AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms, Zabbix alarms, etc. This includes alarms on physical resources that are not part of OpenStack, like switches or ports, in order to understand their effect on OpenStack resources.
Our question is: do you envision AODH as a "general OpenStack alarm engine", which serves as a database for alarms of all kinds? Or does AODH focus on metric-related alarms?
Thanks,
Ifat.
> -----Original Message-----
> From: AFEK, Ifat (Ifat) [mailto:ifat.afek at alcatel-lucent.com]
> Sent: Monday, November 30, 2015 2:47 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
>
> Hi,
>
> A few days ago I sent you this email (see below). Resending in case you
> didn't see it.
> If you could get back to me soon it would be most appreciated, as we
> are quite blocked with our AODH integration right now.
>
> Thanks,
> Ifat.
>
>
> -----Original Message-----
> From: AFEK, Ifat (Ifat) [mailto:ifat.afek at alcatel-lucent.com]
> Sent: Tuesday, November 24, 2015 7:37 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
>
> Hi Gord, Hi Ryota,
>
> (I sent the same mail again in a more readable format)
>
> Thanks for your detailed responses.
> Hope you don't mind that I'm sending one reply to both of your emails.
> I think it would be easier to have one thread for this discussion.
>
>
> Let me explain our use case in more detail.
> Here is an example of how we would like to integrate with AODH. Let me
> know what you think about it.
>
> 1. Vitrage gets an alarm from Nagios about high cpu load on one of the
> hosts
>
> 2. Vitrage evaluator decides (based on its templates) that an "instance
> might be suffering due to high cpu load on the host" alarm should be
> triggered for every instance on this host
>
> 3. Vitrage notifier creates corresponding alarm definitions in AODH
>
> 4. AODH stores these alarms in its database
>
> 5. Vitrage triggers the alarms
>
> 6. AODH updates the alarm states and sends notifications about the change
>
> 7. Horizon user queries AODH for a list of all alarms (we are currently
> checking the status of a blueprint that should implement it[2]). AODH
> returns a list that includes the alarms that were triggered by Vitrage.
>
> 8. Horizon user selects one of the alarms that Vitrage generated, and
> asks to see its root cause (we will create a new blueprint for that).
> Vitrage API returns the RCA information for this alarm.
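
Step 2 of the flow above (one deduced alarm per instance on the affected host) can be sketched as follows. This is only an illustration: the `Alarm` class, the `deduce_instance_alarms` helper, and the host-to-instance mapping are hypothetical names, not Vitrage or AODH code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical minimal alarm record (not an AODH or Vitrage type)."""
    resource_id: str
    name: str
    state: str = "alarm"


def deduce_instance_alarms(host_alarm, instances_by_host):
    """Return one deduced alarm per instance running on the alarmed host."""
    instances = instances_by_host.get(host_alarm.resource_id, [])
    return [
        Alarm(
            resource_id=instance_id,
            name=f"instance might be suffering due to {host_alarm.name} on the host",
        )
        for instance_id in instances
    ]


# Assumed topology: which instances run on which host.
topology = {"host-1": ["vm-a", "vm-b"], "host-2": ["vm-c"]}

# An alarm as it might arrive from Nagios (step 1).
nagios_alarm = Alarm(resource_id="host-1", name="high cpu load")

deduced = deduce_instance_alarms(nagios_alarm, topology)
for alarm in deduced:
    print(alarm.resource_id, "->", alarm.name)
```

Each entry in `deduced` would then be the input for steps 3-5: one alarm definition created and triggered per instance.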
>
>
> Our current discussion is on steps 3-6 (as far as we understand, and
> please correct me if I'm wrong, nothing blocks the implementation of
> the blueprint for step 7).
>
>
>
> Looking at AODH API again, here is what I think we need to do:
>
> 1. Define an alarm with an external_trigger_rule or something like
> that. This alarm has no metric data. We just want to be able to trigger
> it and query its state.
>
> 2. Use AODH API for triggering this alarm. Will "PUT
> /v2/alarms/(alarm_id)/state" do the job?
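
A minimal sketch of what the call in item 2 might look like from the client side. It only builds the HTTP request and does not send it. The endpoint URL, alarm ID, and token are placeholders, and the helper name is hypothetical. Whether setting the state this way also fires the alarm actions is exactly the open question in this thread, so the sketch does not assume an answer.

```python
import json
import urllib.request

# Alarm states as named in the AODH API documentation.
VALID_STATES = ("ok", "alarm", "insufficient data")


def build_set_state_request(endpoint, alarm_id, state, token):
    """Build (but do not send) a PUT /v2/alarms/(alarm_id)/state request."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown alarm state: {state!r}")
    url = f"{endpoint.rstrip('/')}/v2/alarms/{alarm_id}/state"
    return urllib.request.Request(
        url,
        data=json.dumps(state).encode("utf-8"),  # body is a JSON string, e.g. "alarm"
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="PUT",
    )


# Placeholder values; a real deployment supplies its own endpoint and token.
req = build_set_state_request(
    "http://aodh.example.com:8042", "ALARM_ID", "alarm", "TOKEN"
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call against a real endpoint.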
>
>
> Please see also my comments below.
>
> Thanks,
> Ifat.
>
>
> [2] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
>
>
>
>
> > -----Original Message-----
> > From: gord chung [mailto:gord at live.ca]
> > Sent: Monday, November 23, 2015 9:45 PM
> > To: openstack-dev at lists.openstack.org
> > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising
> > custom alarms in AODH
> >
> >
> >
> > On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:
> > > I guess I would like to do both: create a new alarm definition, then
> > > trigger it (call alarm_actions), and possibly later on set its state
> > > back to OK (call ok_action).
> > > I understood that currently all alarm triggering is internal in
> > > AODH, according to threshold/events/combination alarm rules. Would
> > > it be possible to add a new kind of rule, that will allow triggering
> > > the alarm externally?
> > what type of rule?
> >
> > i have https://review.openstack.org/#/c/247211 which would
> > theoretically allow you to push an action into queue which would then
> > trigger appropriate REST call. not sure if it helps you plug into Aodh
> > easier or not?
>
> We need to add an alarm definition with an "external_rule", and then
> trigger it. It is important for us that the alarm definition will be
> stored in AODH database for future queries. As far as I understand, the
> queue should help only with the triggering?
>
> >
> > --
> > gord
>
>
> > -----Original Message-----
> > From: Ryota Mibu [mailto:r-mibu at cq.jp.nec.com]
> > Sent: Tuesday, November 24, 2015 10:00 AM
> > To: OpenStack Development Mailing List (not for usage questions)
> > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising
> > custom alarms in AODH
> >
> > Hi Ifat,
> >
> >
> > Thank you for starting the discussion on how AODH can be integrated
> > with Vitrage; that would be a good example of AODH integration with
> > other OpenStack components.
> >
> > The key role of creating an alarm definition is to set the endpoint
> > (alarm_actions) which can receive alarm notifications from AODH. How
> > can the endpoints be set in your use case? Are those endpoints
> > configured via the Vitrage API and stored in its DB?
>
> We have a graph database that will include resources and alarms
> imported from a few sources of information (including Ceilometer), as
> well as alarms generated by Vitrage. However, we would like our alarms
> to be stored in AODH as well. If I understood you correctly, we will
> need the endpoints in order to be notified about Ceilometer alarms.
>
> >
> > I agree with Gordon, you can use an event alarm by generating an
> > "event" containing the alarming message, which can be captured in aodh
> > if vitrage relays the alarm definition to aodh. That is a more
> > feasible way than creating an alarm definition right before triggering
> > the alarm notification. The reason is that the aodh evaluator may not
> > be aware of new alarm definitions and won't send a notification until
> > its alarm definition cache is refreshed, which can take up to 60 sec
> > (default value).
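
The timing concern above can be illustrated with a toy model of a periodically refreshed definition cache. This is not the aodh evaluator's code; the `DefinitionCache` class and its names are hypothetical, and only the 60-second default comes from the message above.

```python
class DefinitionCache:
    """Toy cache that reloads alarm definitions at most once per interval."""

    def __init__(self, load, refresh_interval=60.0):
        self._load = load                  # callable returning current definitions
        self._interval = refresh_interval  # seconds between reloads
        self._loaded_at = None
        self._definitions = set()

    def known(self, now):
        """Return the definitions visible to the evaluator at time `now`."""
        if self._loaded_at is None or now - self._loaded_at >= self._interval:
            self._definitions = set(self._load())  # snapshot, not a live view
            self._loaded_at = now
        return self._definitions


store = {"alarm-1"}                        # stands in for the alarm database
cache = DefinitionCache(lambda: store)

seen_at_0 = "alarm-2" in cache.known(now=0.0)    # cache loaded before creation
store.add("alarm-2")                             # definition created right after
seen_at_10 = "alarm-2" in cache.known(now=10.0)  # cache is still stale
seen_at_60 = "alarm-2" in cache.known(now=60.0)  # refreshed, definition visible

print(seen_at_0, seen_at_10, seen_at_60)
```

In this model, an alarm created and triggered within the same refresh window is invisible to the evaluator until the next reload.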
>
> Logically speaking, we would like to create alarms and not events. Our
> goal is to alert when something is wrong. Creating events might work as
> a workaround, but this is not our preferred solution.
>
> >
> > Having a special rule and an external evaluator would be an
> > alternative, but it would be difficult to keep up with the latest
> > aodh, since it will change faster with a small code base as a result
> > of the split from ceilometer.
>
> We are not asking for an external evaluator. We are asking for a null
> evaluator, and for an API that will allow us to trigger an alarm
> externally. The evaluation itself will be in Vitrage internal code,
> with no dependency on the AODH evaluator.
>
> >
> >
> > BR,
> > Ryota
> >
> > > -----Original Message-----
> > > From: AFEK, Ifat (Ifat) [mailto:ifat.afek at alcatel-lucent.com]
> > > Sent: Tuesday, November 24, 2015 1:15 AM
> > > To: OpenStack Development Mailing List (not for usage questions)
> > > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising
> > > custom alarms in AODH
> > >
> > > Hi Gord,
> > >
> > > Please see my answers below.
> > >
> > > Ifat.
> > >
> > >
> > > > -----Original Message-----
> > > > From: gord chung [mailto:gord at live.ca]
> > > > Sent: Monday, November 23, 2015 4:57 PM
> > > > To: openstack-dev at lists.openstack.org
> > > > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising
> > > > custom alarms in AODH
> > > >
> > > > hi Ifat,
> > > >
> > > > i added some questions below.
> > > >
> > > > On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote:
> > > > > Hi,
> > > > >
> > > > > We have a couple of questions regarding AODH alarms.
> > > > >
> > > > > In Vitrage[1] project we have two use cases that involve
> > > > > Ceilometer:
> > > > >
> > > > > 1. Import Ceilometer alarms, as well as alarms and resources
> > > > > from other sources (Nagios, Zabbix, Nova, Heat, etc.), and
> > > > > produce RCA insights about the connection between different
> > > > > alarms.
> > > > to clarify, Ceilometer alarms is deprecated in favour of Aodh and
> > > > will be removed very, very soon.
> > >
> > > Right, I meant Aodh alarms.
> > >
> > > >
> > > > > 2. Raise "deduced alarms". For example, in case we detect a high
> > > > > memory consumption on a host, we would like to raise deduced
> > > > > alarms saying "instance might be suffering due to high memory
> > > > > consumption on the host" on all related instances. Then, we can
> > > > > further deduce that applications running on these instances
> > > > > might also be affected, and raise alarms on them as well.
> > > > >
> > > > > Initially we planned to raise these deduced alarms in AODH, so
> > > > > other OpenStack components may consume them as well. Then, when
> > > > > we looked at AODH alarms documentation, we noticed that there is
> > > > > currently no way of raising custom alarms. We saw only three
> > > > > types of alarms: threshold alarms, combination alarms and event
> > > > > alarms.
> > > > >
> > > > > So, our questions are:
> > > > >
> > > > > * Is there an alternative way of raising alarms in AODH?
> > > > what do we mean by raising alarms? do you want to create a new
> > > > alarm definition for Aodh or do you want to trigger an action? do
> > > > you want to have a new non-REST action?
> > >
> > > I guess I would like to do both: create a new alarm definition, then
> > > trigger it (call alarm_actions), and possibly later on set its state
> > > back to OK (call ok_action).
> > > I understood that currently all alarm triggering is internal in
> > > AODH, according to threshold/events/combination alarm rules. Would
> > > it be possible to add a new kind of rule, that will allow triggering
> > > the alarm externally?
> > >
> > > >
> > > > > * Do you think custom alarms belong in AODH? Are you interested
> > > > > in adding this capability to AODH?
> > > > >
> > > > > We would be happy to hear your vision and thoughts about it.
> > > > >
> > > > >
> > > > > Thanks,
> > > > > Ifat and Alexey.
> > > > >
> > > > >
> > > > > [1] https://wiki.openstack.org/wiki/Vitrage
> > > > >
> > > > >
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev