[openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

Liz Blanchard lsurette at redhat.com
Mon Jun 16 18:30:20 UTC 2014


On Jun 16, 2014, at 10:56 AM, Eoghan Glynn <eglynn at redhat.com> wrote:

> 
> Apologies for the top-posting, but just wanted to call out some
> potential confusion that arose on the #os-ceilometer channel earlier
> today.
> 
> TL;DR: the UI shouldn't assume a 1:1 mapping between alarms and
>       resources, since this mapping does not exist in general

Thanks for the clarification on this, Eoghan. After reading the IRC chat and e-mail thread, I now understand that alarms can be created for things like “Alarm me when a new instance is created” that have nothing to do with monitoring instances. Am I correct? Are there other cases we should consider here? I’ve updated the latest version of the wireframes to include an example of an alarm like this (see Alarm 4 in the tables). I also removed the required mark on Resource in the Add Alarm modal. I will be sending a link to these updated wireframes, along with feedback on Christian’s latest comments, in the next few minutes...
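
To make the "no 1:1 mapping" point concrete, here is a minimal sketch of a UI-side model in which an alarm carries a set of filters rather than a single resource. The Alarm class, the resource_filters field, and the sample data are all hypothetical and are not ceilometer's actual schema. Following the dimensioning description quoted further down, an alarm with no filters is assumed to cover every resource visible to the user.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alarm:
    """Hypothetical UI-side alarm model; not ceilometer's actual schema."""
    name: str
    meter: str
    # Fields a resource must match. Empty is assumed to mean
    # "all resources visible to the user".
    resource_filters: Dict[str, str] = field(default_factory=dict)


def matching_resources(alarm: Alarm, resources: List[Dict[str, str]]) -> List[str]:
    """Return the IDs of resources that satisfy every filter of the alarm."""
    return [
        r["id"]
        for r in resources
        if all(r.get(k) == v for k, v in alarm.resource_filters.items())
    ]


# Made-up sample data for illustration only.
resources = [
    {"id": "vm-1", "image": "fedora-20"},
    {"id": "vm-2", "image": "fedora-20"},
    {"id": "vm-3", "image": "cirros"},
]
narrow = Alarm("one instance", "cpu_util", {"id": "vm-3"})
wide = Alarm("all instances from one image", "cpu_util", {"image": "fedora-20"})
unfiltered = Alarm("everything visible", "cpu_util")

for alarm in (narrow, wide, unfiltered):
    print(alarm.name, "->", matching_resources(alarm, resources))
```

With a model like this, the Resource column in the alarms table would need to show zero, one, or many resources rather than assuming exactly one.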

Best,
Liz

> 
> Background: See ML post[1]
> 
> Discussion: See IRC log [2]
>            Ctrl+F: "Let's see what the UI guys think about it"
> 
> Cheers,
> Eoghan
> 
> [1] http://lists.openstack.org/pipermail/openstack-dev/2014-June/037788.html
> [2] http://eavesdrop.openstack.org/irclogs/%23openstack-ceilometer/%23openstack-ceilometer.2014-06-16.log
> 
> 
> ----- Original Message -----
>> Hi all,
>> 
>> Thanks again for the great comments on the initial cut of wireframes. I’ve
>> updated them a fair amount based on feedback in this e-mail thread along
>> with the feedback written up here:
>> https://etherpad.openstack.org/p/alarm-management-page-design-discussion
>> 
>> Here is a link to the new version:
>> http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-06-05.pdf
>> 
>> And a quick explanation of the updates that I made from the last version:
>> 
>> 1) Removed severity.
>> 
>> 2) Added Status column. I also added details around the fact that users can
>> enable/disable alerts.
>> 
>> 3) Updated Alarm creation workflow to include choosing the project and user
>> (optionally, for filtering the resource list), choosing the resource, and
>> choosing the amount of time to monitor for alarming.
>>     -Perhaps we could be even more sophisticated for how we let users filter
>>     down to find the right resources that they want to monitor for alarms?
>> 
>> 4) As for notifying users…I’ve updated the “Alarms” section to be “Alarms
>> History”. The point here is to show any Alarms that have occurred to notify
>> the user. Other notification ideas could be to allow users to get notified
>> of alerts via e-mail (perhaps a user setting?). I’ve added a wireframe for
>> this update in User Settings. Then the Alarms Management section would just
>> be where the user creates, deletes, enables, and disables alarms. Do you
>> still think we don’t need the “alarms” tab? Perhaps this just becomes
>> iteration 2 and is left out for now as you mention in your etherpad.
>> 
>> 5) Question about combined alarms…currently I’ve designed it so that a user
>> could create multiple levels in the “Alarm When…” section. They could
>> combine these with AND/ORs. Is this going far enough? Or do we actually need
>> to allow users to combine Alarms that might watch different resources?
>> 
>> 6) I updated the Actions column to have the “More” drop down which is
>> consistent with other tables in Horizon.
>> 
>> 7) Added in a section in the “Add Alarm” workflow for “Actions after Alarm”.
>> I’m thinking we could have some sort of If State is X, do X type selections,
>> but I’m looking to understand more details about how the backend works for
>> this feature. Eoghan gave examples of logging and potentially scaling out
>> via Heat. Would simple drop downs support these events?
>> 
>> 8) I can definitely add in a “scheduling” feature with respect to Alarms. I
>> haven’t added it in yet, but I could see this being very useful in future
>> revisions of this feature.
>> 
>> 9) Another thought is that we could add in some padding for outlier data as
>> Eoghan mentioned. Perhaps a setting for “This has happened 3 times over the
>> last minute, so now send an alarm.”?
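
One way to read the "padding" idea in item 9 is to require the threshold to be breached in several consecutive evaluation periods before the alarm fires, in the spirit of the "3 x 600s" form mentioned elsewhere in this thread. The sketch below assumes that reading; the function name and parameters are hypothetical and this is not how ceilometer itself is implemented.

```python
from typing import Sequence


def should_alarm(values: Sequence[float], threshold: float, periods: int) -> bool:
    """Return True when the last `periods` values all exceed `threshold`.

    `values` holds one aggregated statistic per evaluation period,
    oldest first. A single spike is ignored unless it persists.
    """
    if periods <= 0 or len(values) < periods:
        return False  # not enough data to decide
    return all(v > threshold for v in values[-periods:])


# One isolated spike (95) does not trigger; three breaches in a row do.
print(should_alarm([50, 95, 60], threshold=80, periods=3))
print(should_alarm([50, 95, 60, 85, 90, 88], threshold=80, periods=3))
```

In the "Add Alarm" modal this would translate to two inputs next to the threshold: the length of one period and the number of periods.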
>> 
>> A new round of feedback is of course welcome :)
>> 
>> Best,
>> Liz
>> 
>> On Jun 4, 2014, at 1:27 PM, Liz Blanchard <lsurette at redhat.com> wrote:
>> 
>>> Thanks for the excellent feedback on these, guys! I’ll be working on making
>>> updates over the next week and will send a fresh link out when done.
>>> Anyone else with feedback, please feel free to fire away.
>>> 
>>> Best,
>>> Liz
>>> On Jun 4, 2014, at 12:33 PM, Eoghan Glynn <eglynn at redhat.com> wrote:
>>> 
>>>> 
>>>> Hi Liz,
>>>> 
>>>> Two further thoughts occurred to me after hitting send on
>>>> my previous mail.
>>>> 
>>>> First, there is the concept of alarm dimensioning; see my RDO Ceilometer
>>>> getting started guide[1] for an explanation of that notion.
>>>> 
>>>> "A key associated concept is the notion of dimensioning which defines the
>>>> set of matching meters that feed into an alarm evaluation. Recall that
>>>> meters are per-resource-instance, so in the simplest case an alarm might
>>>> be defined over a particular meter applied to all resources visible to a
>>>> particular user. More useful however would be the option to explicitly
>>>> select which specific resources we're interested in alarming on. On one
>>>> extreme we would have narrowly dimensioned alarms where this selection
>>>> would have only a single target (identified by resource ID). On the other
>>>> extreme, we'd have widely dimensioned alarms where this selection
>>>> identifies many resources over which the statistic is aggregated, for
>>>> example all instances booted from a particular image or all instances
>>>> with matching user metadata (the latter is how Heat identifies
>>>> autoscaling groups)."
>>>> 
>>>> We'd have to think about how that concept is captured in the
>>>> UX for alarm creation/update.
>>>> 
>>>> Second, there are a couple of more advanced alarming features
>>>> that were added in Icehouse:
>>>> 
>>>> 1. The ability to constrain alarms on time ranges, such that they
>>>> would only fire say during 9-to-5 on a weekday. This would
>>>> allow for example different autoscaling policies to be applied
>>>> out-of-hours, when resource usage is likely to be cheaper and
>>>> manual remediation less straight-forward.
>>>> 
>>>> 2. The ability to exclude low-quality datapoints with anomalously
>>>> low sample counts. This allows the leading edge of the trend of
>>>> widely dimensioned alarms not to be skewed by eagerly-reporting
>>>> outliers.
>>>> 
>>>> Perhaps not in a first iteration, but at some point it may make sense
>>>> to expose these more advanced features in the UI.
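
For anyone picturing what these two Icehouse features would mean behind a UI control, here is a simplified sketch. It assumes a plain weekday 9-to-5 window and a "more than two standard deviations below the mean sample count" rule; both are illustrative assumptions, the function names are hypothetical, and neither reflects ceilometer's actual implementation or API.

```python
from datetime import datetime
from statistics import mean, pstdev
from typing import List, Tuple


def in_business_hours(now: datetime, start_hour: int = 9, end_hour: int = 17) -> bool:
    """True Monday to Friday from start_hour (inclusive) to end_hour (exclusive)."""
    return now.weekday() < 5 and start_hour <= now.hour < end_hour


def drop_low_count_points(
    points: List[Tuple[float, int]], max_devs: float = 2.0
) -> List[Tuple[float, int]]:
    """Drop (value, sample_count) datapoints with an unusually low sample count.

    A point is dropped when its count is more than `max_devs` population
    standard deviations below the mean count (an assumed rule).
    """
    counts = [count for _, count in points]
    if len(counts) < 2:
        return list(points)
    avg, dev = mean(counts), pstdev(counts)
    if dev == 0:
        return list(points)  # all counts equal, nothing to exclude
    return [(v, c) for v, c in points if c >= avg - max_devs * dev]


# Five well-sampled periods and one thin, eagerly reported one.
points = [(70.0, 10), (72.0, 10), (71.0, 10), (69.0, 10), (73.0, 10), (20.0, 1)]
print(drop_low_count_points(points))
print(in_business_hours(datetime(2014, 6, 16, 10, 0)))  # a Monday morning
```

In UI terms the first function corresponds to a schedule picker on the alarm and the second to a single "exclude outliers" checkbox.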
>>>> 
>>>> Cheers,
>>>> Eoghan
>>>> 
>>>> [1] http://openstack.redhat.com/CeilometerQuickStart
>>>> 
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>>> 
>>>>> Hi Liz,
>>>>> 
>>>>> Looks great!
>>>>> 
>>>>> Some thoughts on the wireframe doc:
>>>>> 
>>>>> * The description of form:
>>>>> 
>>>>>  "If CPU Utilization exceeds 80%, send alarm."
>>>>> 
>>>>> misses the time-window aspect of the alarm definition.
>>>>> 
>>>>> Whereas the boilerplate default descriptions generated by
>>>>> ceilometer itself:
>>>>> 
>>>>>  "cpu_util > 70.0 during 3 x 600s"
>>>>> 
>>>>> captures this important info.
>>>>> 
>>>>> * The metric names, e.g. "CPU Utilization", are not an exact
>>>>> match for the meter names used by ceilometer, e.g. "cpu_util".
>>>>> 
>>>>> * Non-admin users can create alarms in ceilometer:
>>>>> 
>>>>> "This is where admins can come in and
>>>>> define and edit any alarms they want
>>>>> the environment to use."
>>>>> 
>>>>> (though these alarms will only have visibility onto the stats
>>>>> that would be accessible to the user on behalf of whom the
>>>>> alarm is being evaluated)
>>>>> 
>>>>> * There's no concept currently of alarm severity.
>>>>> 
>>>>> * "Should users be able to enable/dis-able alarms."
>>>>> 
>>>>> Yes, the API allows for disabled (i.e. non-evaluated) alarms.
>>>>> 
>>>>> * "Should users be able to own/assign alarms?"
>>>>> 
>>>>> Only admin users can create an alarm on behalf of another
>>>>> user/tenant.
>>>>> 
>>>>> * "Should users be able to acknowledge, close alarms?"
>>>>> 
>>>>> No, we have no concept of ACKing an alarm.
>>>>> 
>>>>> * "Admins can also see a full list of all Alarms that have
>>>>> taken place in the past."
>>>>> 
>>>>> In ceilometer terminology, we refer to this as alarm history
>>>>> or alarm change events.
>>>>> 
>>>>> * "CPU Utilization exceeded 80%."
>>>>> 
>>>>> Again good to capture the duration in that description of the
>>>>> event.
>>>>> 
>>>>> * "Within the Overview section, there should be a new tab that allows the
>>>>> user to click and view all Alarms that have occurred in their
>>>>> environment."
>>>>> 
>>>>> Not sure really what "environment" means here. Non-admin tenants only
>>>>> have visibility to their own alarms, whereas admins have visibility to
>>>>> all alarms.
>>>>> 
>>>>> * "This list would keep the latest  alarms."
>>>>> 
>>>>> Presumably this would be based on querying the alarm-history API,
>>>>> as opposed to an assumption that Horizon is consuming the actual
>>>>> alarm notifications?
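
The first two points above (capturing the time window, and the mismatch between display names and meter names) could be handled by generating both descriptions from the same alarm fields. This is only a sketch: the function names and the DISPLAY_NAMES mapping are hypothetical, and the compact format simply mimics the "cpu_util > 70.0 during 3 x 600s" example quoted above rather than calling any ceilometer code.

```python
# Hypothetical mapping from ceilometer meter names to UI labels.
DISPLAY_NAMES = {"cpu_util": "CPU Utilization"}


def compact_description(meter: str, operator: str, threshold: float,
                        periods: int, period_seconds: int) -> str:
    """Build a description in the compact style quoted in this thread."""
    return f"{meter} {operator} {threshold} during {periods} x {period_seconds}s"


def friendly_description(meter: str, threshold: float,
                         periods: int, period_seconds: int) -> str:
    """Build a wireframe-style sentence that keeps the time window."""
    label = DISPLAY_NAMES.get(meter, meter)  # fall back to the raw meter name
    minutes = period_seconds // 60
    return (f"If {label} exceeds {threshold:g}% over "
            f"{periods} x {minutes} minutes, send alarm.")


print(compact_description("cpu_util", ">", 70.0, 3, 600))
print(friendly_description("cpu_util", 80.0, 3, 600))
```

Keeping the meter name as the key and the friendly label as a lookup means the table can show either form without the two drifting apart.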
>>>>> 
>>>>> Cheers,
>>>>> Eoghan
>>>>> 
>>>>> ----- Original Message -----
>>>>>> Hi All,
>>>>>> 
>>>>>> I’ve recently put together a set of wireframes[1] around Alarm Management
>>>>>> that would support the following blueprint:
>>>>>> https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
>>>>>> 
>>>>>> If you have a chance it would be great to hear any feedback that folks have
>>>>>> on this direction moving forward with Alarms.
>>>>>> 
>>>>>> Best,
>>>>>> Liz
>>>>>> 
>>>>>> [1]
>>>>>> http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf
>>>>>> 
>>>>>> _______________________________________________
>>>>>> OpenStack-dev mailing list
>>>>>> OpenStack-dev at lists.openstack.org
>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>>> 
>>>>> 
>>>>> 
>>> 
>> 
>> 



