[openstack-dev] [aodh][vitrage] Aodh generic alarms

Julien Danjou julien at danjou.info
Tue Jan 24 10:58:09 UTC 2017


On Tue, Jan 24 2017, Afek, Ifat (Nokia - IL) wrote:

Hi Ifat,

> We understood that Aodh aims to be the OpenStack alarming service, which is much
> more than an ‘engine of alarm evaluation’ (as you wrote in your comment in
> gerrit).

Well, currently it's not really more than that. We've been down the path
of "more more more and more" in Ceilometer and I don't think anybody can
say it had great results – so you can understand how unadventurous and
cautious we are about adding more things to Aodh.

> If I may describe another use case for generic alarms - of OPNFV
> Doctor: A monitor notifies about an alarm, e.g. a NIC failure. The inspector
> (Vitrage in this case) receives the alarm, understands that the host is
> affected, and raises an alarm on the host.
> This is currently implemented by Vitrage calling nova force-down, and
> Nova sending a notification that is converted to an event and then
> consumed by an Aodh event-alarm.

I don't see why Vitrage must be involved in this scenario. If a
"monitor" sees something, e.g. a NIC failure, it should send an event
stating that, and Aodh could trigger an alarm.
This alarm could call nova force-down, etc…
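
To make that flow concrete, here is a minimal sketch, not actual Aodh
code. The dict follows the general shape of an Aodh event-alarm
definition (type "event", an event_rule with an event type and a query,
and a list of alarm actions). The event type "monitor.nic.failure", the
"host" trait and the action URL are hypothetical names picked for
illustration, and matches() is only a toy stand-in for the real
evaluator.

```python
def build_event_alarm(name, event_type, host, action_url):
    """Return a dict shaped like an event-alarm definition."""
    return {
        "name": name,
        "type": "event",
        "event_rule": {
            "event_type": event_type,
            "query": [
                {"field": "traits.host", "op": "eq",
                 "type": "string", "value": host},
            ],
        },
        # Hypothetical webhook that would end up calling nova force-down.
        "alarm_actions": [action_url],
    }


def matches(alarm, event):
    """Toy evaluation: the event type and every 'eq' query must match."""
    rule = alarm["event_rule"]
    if event["event_type"] != rule["event_type"]:
        return False
    for query in rule["query"]:
        trait = query["field"].split(".", 1)[1]  # "traits.host" -> "host"
        if event["traits"].get(trait) != query["value"]:
            return False
    return True


alarm = build_event_alarm(
    name="nic-failure-on-host-1",
    event_type="monitor.nic.failure",           # hypothetical event type
    host="host-1",
    action_url="http://example.org/force-down",  # hypothetical endpoint
)
event = {"event_type": "monitor.nic.failure", "traits": {"host": "host-1"}}

if matches(alarm, event):
    print("alarm fired, would call:", alarm["alarm_actions"][0])
```

The point of the sketch is that the monitor only emits an event and the
alarm definition carries both the matching rule and the action, so no
separate inspector is needed for this simple case.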

> As the next phase in Doctor use case, for performance reasons, they might want
> Vitrage to raise alarms also on the instances and applications [3]. We know how
> to raise these alarms, and we can send them directly to a VNFM or another
> consumer. But we thought the right thing to do was to raise these alarms in
> Aodh, and let the VNFM connect to Aodh. This is what I mean by ‘Aodh as the
> alarming service of OpenStack’. 

Part of the problem is that Vitrage is a different evaluation engine,
external to Aodh, that wants to use Aodh as a data store (to store
alarms and metadata, and then trigger actions). Since the evaluation
engine (Vitrage) is external to Aodh, it tends to bend Aodh into something
it's not (a data store for alarms).

You mention performance reasons, but if you really want performance,
the real way to achieve it is one of:
Option 1: provide Vitrage functionality embedded into Aodh as an
          evaluation engine
Option 2: manage Vitrage alarms inside Vitrage directly

It seems Vitrage decided not to pick option 2 because Aodh exists, which
I think is a really good thing. Option 1 has not been picked, not because
of technical issues, but because of the social challenge that it
represents. It means implementing (part of) Vitrage's features in Aodh
directly, which can be complicated as it means joining an existing
project. :)

> What do you think about this use case? Do you want Aodh to take this role, as
> the place where all OpenStack alarms are gathered and managed?

I think that particular use case is valid, but the way I understand it,
it barely needs Vitrage. It could/should be just Aodh doing this.
(Or maybe I just misunderstood your use case, feel free to explain
further :)

> Now, about the details.
>
> In his first commit, alexey_weyl suggested adding metadata, and you asked him
> to call it ‘userdata’. Personally, I think that metadata is more accurate. It
> is legitimate for an alarm to have additional data; in our example we need to
> hold the resource id and an external alarm id. When you call it userdata, it
> indeed sounds like ‘a user datastore’ (in your words), which is not the purpose
> at all.

The Aodh API is used by _users_. The data set in this
field are set by _users_. Vitrage is a _user_ of the Aodh API. That's
why I think it should be called userdata: Aodh has no use for this
data. It's just an arbitrary payload that Aodh never interprets.
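
A small sketch of what "no use for this data" means in practice. This is
an illustration only, not Aodh's data model: the Alarm class, the
threshold rule and the notify() helper are all hypothetical. The
evaluator reads only the rule, while userdata is carried through to the
notification untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    name: str
    threshold: float                               # read by the evaluator
    userdata: dict = field(default_factory=dict)   # opaque to the evaluator


def evaluate(alarm, value):
    """Decide the state from the rule only; userdata is never read."""
    return "alarm" if value > alarm.threshold else "ok"


def notify(alarm, state):
    """Build a notification payload, passing userdata through unchanged."""
    return {"alarm": alarm.name, "state": state, "userdata": alarm.userdata}


alarm = Alarm(
    name="host-down",
    threshold=0.5,
    # The two values from the use case: a resource id and an external alarm id.
    userdata={"resource_id": "host-1", "external_alarm_id": "vitrage-42"},
)
payload = notify(alarm, evaluate(alarm, 0.9))
print(payload)
```

If evaluate() ever had to look inside that dict to decide the state, the
field would be metadata in the sense discussed here; as long as it is
only stored and echoed back, it is a user payload.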

Though it's interesting that you mention it, because I think it
highlights how we might differ on how Aodh/Vitrage should interact.
You're on the Vitrage side, so you basically see Aodh as being
completely encompassed and "absorbed" by Vitrage and its
use-cases. I guess that's normal, but it would lead to terrible design
decisions and a generally bad UX for Aodh.

I would agree to call this field metadata if Aodh used it to
_evaluate_ the alarm. But that's not the case, unless
you move the Vitrage evaluation engine inside Aodh, which could be
interesting, but is a different way of building things.

I hope I made things clearer. :) I have no intention of blocking our
cooperation whatsoever; I'm actually just trying to bring the two
projects closer, as I am not even sure there should be two entirely
distinct projects. But I don't think we should do technical bending to
bypass social or political issues – we've done that before, and it blew
up in our face later.

Cheers,
-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info