[openstack-dev] [Horizon] Ceilometer Alarm management page
Ladislav Smola
lsmola at redhat.com
Tue Sep 24 06:48:28 UTC 2013
Hello Julien,
thank you very much for your response. I have commented on it inline. I
have already started working on point 1 and will plan the others
depending on the feedback.
Ladislav
On 09/23/2013 03:33 PM, Julien Danjou wrote:
> On Thu, Sep 19 2013, Ladislav Smola wrote:
>
> Hi Ladislav,
>
> Sorry for the late reply,
>
>> 1. The points 1-4 from are some sort simple version of the page, that uses
>> all basic alarm-api features. Do you think we need them all? Any feedback
>> for them? Enhancements?
> That looks like a really good start if we can have all of this!
>
>> 2. There is a thought, that we should maybe divide Alarms into (System,
>> User-defined). The only system alarms now, are set up with Heat and used for
>> auto-scaling.
> I don't think there is any formal way to distinguish alarms. Though it's
> likely you can retrieve the alarm list Heat created for the user to
> distinguish them.
> On the other hand, I am not sure the user can see the alarms created by
> Heat since they might not directly belong to the user, but to Heat.
I have already talked about this with eglynn; recognizing the alarms by
their user could work. Here, though, I am talking about the Admin role,
which, I think, has rights to observe and manage the alarms of all users.
>> 3. There is a thought about watching correlation of multiple alarm histories
>> in one Chart (either Alarm Histories, or the real statistics the Alarm is
>> defined by). Do you think it will be needed? Any real life examples you have
>> in mind?
> I think the first use case is to debug combined alarms.
> There's also a lot of potential to debug an entire platform activity by
> superimposing several alarm graphs.
Yes, debugging combined alarms and superimposing them makes sense. Also,
as I discussed with eglynn, we can show the alarm history as a chart of
the real statistics values instead of just the (alarm, ok, insufficient
data) states. That could present the alarms pretty well, and you could
visually check the thresholds (in the future we could probably show the
thresholds in the chart and somehow mark the values that exceed them).
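To make the "marking the values that exceed the threshold" part concrete,
here is a minimal sketch. It assumes the statistics arrive as a plain list
of numbers and that the alarm uses one of the usual comparison operators;
the function name and the data shape are made up for illustration and are
not part of any existing Horizon or Ceilometer code.

```python
import operator

# Comparison operators keyed by short names (assumed naming for this sketch).
COMPARATORS = {
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
    "ne": operator.ne,
    "ge": operator.ge,
    "gt": operator.gt,
}


def mark_exceeding(values, threshold, comparison="gt"):
    """Pair each statistic value with a flag saying whether it trips the threshold."""
    compare = COMPARATORS[comparison]
    return [(value, compare(value, threshold)) for value in values]


# Hypothetical cpu_util averages checked against a 90% threshold.
points = mark_exceeding([50.0, 95.0, 90.0], 90.0)
```

The chart could then draw the flagged points in a different colour next to
the threshold line.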
>> 4. There is a thought about tagging the alarms by user defined tag, so user
>> can easily group alarms together and then watch them together based on their
>> tag.
> The alarm API don't provide that directly, but you can imagine some sort
> of filter based on description matching some texts.
Yes, this could definitely work as a first version, if we decide this
feature is needed. The implementation of tags could be changed later if
needed (probably to something more optimized for querying).
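A rough sketch of how the description-based first version could look. It
assumes a convention of writing tags as "[tag:name]" inside the alarm
description and treats alarms as plain dicts with "name" and "description"
keys; both the convention and the helper names are hypothetical, not
something the alarm API defines.

```python
import re

# Assumed convention: tags are embedded in the description as "[tag:web]".
TAG_PATTERN = re.compile(r"\[tag:([\w-]+)\]")


def tags_of(alarm):
    """Return the set of tags found in an alarm's description."""
    return set(TAG_PATTERN.findall(alarm.get("description") or ""))


def group_by_tag(alarms):
    """Map each tag to the names of the alarms carrying it."""
    groups = {}
    for alarm in alarms:
        for tag in tags_of(alarm):
            groups.setdefault(tag, []).append(alarm["name"])
    return groups


# Hypothetical alarms, only to show the grouping.
alarms = [
    {"name": "cpu_high", "description": "CPU over 90% [tag:web]"},
    {"name": "disk_full", "description": "[tag:web] [tag:db] disk usage"},
    {"name": "untagged", "description": None},
]
groups = group_by_tag(alarms)
```

This filtering happens on the client side over the full alarm list, which
is why something more query-friendly might be wanted later.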
>> 5. There is a thought about generating a default alarms, that could observe
>> the most important things (verifying good behaviour, showing bad behaviour).
>> Does anybody have an idea which alarms could be the most important and
>> usable for everybody?
> I'm not sure you want to create alarm by default; alarm are resources, I
> don't think we should create resources without the user asking for it.
>
> Maybe you were talking about generating alarm template? You could start
> with things like CPU usage staying at >90% for more than 1 hour, and
> having an action that alerts the user via mail.
> Same for disk usage.
Well, for example, if we find metrics that can be used for measuring
health (this is probably more about the undercloud, or hardware metrics
in general), we could do something like "I want this alarm on all
resources of this type". If there are, e.g., hundreds of resources of the
same type, it would be pretty tedious to attach an alarm to each of them,
or to change them all later.
Btw., it doesn't have to be a list of resource ids; once the sample-api
is finished, it can be any query that produces a list of resources,
tenants, etc. (anything that can be grouped by).
So it could serve as a kind of alarm group management (let's say the
group is tagged somehow so you can recognize it ^^): it would add an
alarm whenever a new resource is added, and you could manage all the
alarms through one form.
Then, once we have some alarm groups that are likely to be used by 80% of
clouds, we could e.g. switch them on by default for Admins. The Admin
could then change the alarm group, or delete it if needed.
And yes, preparing general templates is also a good idea, probably
categorized by use case. Users will have something pre-prepared, and they
can set up the most commonly used alarms without needing to read the
whole docs.
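As an illustration of a template applied to a whole group of resources,
here is a minimal sketch based on the "CPU usage staying at >90% for more
than 1 hour" example. The field names are loosely modeled on alarm
attributes and should be read as assumptions; the resource ids are made up
and nothing here calls the real API.

```python
import copy

# Template: average cpu_util above 90% over six 10-minute periods (1 hour).
CPU_TEMPLATE = {
    "name": "cpu_high",
    "meter_name": "cpu_util",
    "statistic": "avg",
    "comparison_operator": "gt",
    "threshold": 90.0,
    "period": 600,            # seconds per evaluation window
    "evaluation_periods": 6,  # 6 x 600 s = 1 hour
}


def expand_template(template, resource_ids):
    """Build one alarm definition per resource from a shared template."""
    alarms = []
    for resource_id in resource_ids:
        alarm = copy.deepcopy(template)  # keep the template itself untouched
        alarm["name"] = "%s_%s" % (template["name"], resource_id)
        alarm["query"] = [
            {"field": "resource_id", "op": "eq", "value": resource_id},
        ]
        alarms.append(alarm)
    return alarms


# Hypothetical resource ids; in practice they would come from a query.
group = expand_template(CPU_TEMPLATE, ["node-1", "node-2"])
```

Changing the template and expanding it again would be the "manage all
alarms by one form" part.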
>> 6. There is a thought about making overview pages customizable by the users,
>> so they can really observe, what they need. (includes Ceilometer statistics
>> and alarms)
> I think that could be as easy as picking the alarms you want in
> overviews with a very small and narrowed graph.
>
True that. The more complex version would be to save any complex query
of some general chart, e.g. cpu_util grouped by tenant over the last
week, and show it on some grid system, though this is the more distant
future.
Let's start simple as you say, just picking the alarms. There would also
be one select box that changes the time period of all displayed charts.
This should be a piece of cake.