Open Stack

Wed Jun 4 16:33:04 UTC 2014

Hi Liz,

Two further thoughts occurred to me after hitting send on
my previous mail.

First is the concept of alarm dimensioning; see my RDO Ceilometer
getting started guide[1] for an explanation of that notion:

"A key associated concept is the notion of dimensioning, which defines the set of matching meters that feed into an alarm evaluation. Recall that meters are per-resource-instance, so in the simplest case an alarm might be defined over a particular meter applied to all resources visible to a particular user. More useful, however, would be the option to explicitly select which specific resources we're interested in alarming on. On one extreme we would have narrowly dimensioned alarms where this selection would have only a single target (identified by resource ID). On the other extreme, we'd have widely dimensioned alarms where this selection identifies many resources over which the statistic is aggregated, for example all instances booted from a particular image or all instances with matching user metadata (the latter is how Heat identifies autoscaling groups)."

We'd have to think about how that concept is captured in the
UX for alarm creation/update.
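
To make the narrow/wide distinction concrete, here is a rough sketch
in Python. It is not ceilometer code: the Sample class, the avg_over()
helper, and the metadata keys are all made up for illustration. It only
shows how the same statistic changes depending on which resources the
alarm is dimensioned over.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Sample:
    """One cpu_util datapoint for one resource (hypothetical model)."""
    resource_id: str
    volume: float
    metadata: dict = field(default_factory=dict)


def avg_over(samples, resource_id=None, **metadata):
    """Average the volume of the samples selected by the dimensioning.

    resource_id narrows the selection to a single target; any other
    keyword must match the sample's metadata (e.g. image or group).
    """
    selected = [
        s for s in samples
        if (resource_id is None or s.resource_id == resource_id)
        and all(s.metadata.get(k) == v for k, v in metadata.items())
    ]
    return mean(s.volume for s in selected)


samples = [
    Sample("inst-1", 90.0, {"image": "fedora-20", "group": "web"}),
    Sample("inst-2", 50.0, {"image": "fedora-20", "group": "web"}),
    Sample("inst-3", 10.0, {"image": "cirros", "group": "db"}),
]

# Narrowly dimensioned: a single target identified by resource ID.
narrow = avg_over(samples, resource_id="inst-1")

# Widely dimensioned: aggregate over every instance in the "web" group.
wide = avg_over(samples, group="web")
```

With a threshold of 80, the narrow selection (average 90.0) would trip
the alarm while the wide one (average 70.0) would not, which is why the
UX needs to make the chosen dimensioning visible.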

Second, there are a couple of more advanced alarming features 
that were added in Icehouse:

1. The ability to constrain alarms to time ranges, such that they
   would only fire, say, between 9 and 5 on a weekday. This would
   allow, for example, different autoscaling policies to be applied
   out-of-hours, when resource usage is likely to be cheaper and
   manual remediation less straightforward.

2. The ability to exclude low-quality datapoints with anomalously
   low sample counts. This allows the leading edge of the trend of
   widely dimensioned alarms not to be skewed by eagerly-reporting
   outliers.
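
As a rough illustration of what these two features do (not how
ceilometer implements or configures them), the sketch below uses
made-up names throughout: in_window(), Datapoint and exclude_outliers()
are hypothetical, the fixed 9-to-5 weekday window is a simplification,
and the two-standard-deviation cutoff is an assumption chosen for the
example.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, pstdev


def in_window(when, start_hour=9, end_hour=17):
    """True if `when` falls on a weekday between start_hour and end_hour."""
    is_weekday = when.weekday() < 5  # Monday is 0, Saturday is 5
    return is_weekday and start_hour <= when.hour < end_hour


@dataclass
class Datapoint:
    """One aggregated statistic for one period (hypothetical model)."""
    avg: float
    count: int  # number of samples that fed into this datapoint


def exclude_outliers(points, num_stddevs=2.0):
    """Drop datapoints whose sample count is anomalously low.

    Assumption: "anomalously low" means more than num_stddevs standard
    deviations below the mean count; the cutoff is illustrative only.
    """
    counts = [p.count for p in points]
    lower = mean(counts) - num_stddevs * pstdev(counts)
    return [p for p in points if p.count >= lower]


# Seven well-populated periods, then a leading-edge period in which
# only one eager resource has reported so far.
points = [Datapoint(avg=60.0, count=10) for _ in range(7)]
points.append(Datapoint(avg=5.0, count=1))

kept = exclude_outliers(points)
fire_allowed = in_window(datetime(2014, 6, 4, 16, 33))
```

Here the single-sample datapoint is dropped before the threshold
comparison, so it cannot drag the most recent period below the
threshold on its own, and an alarm constrained to the window would be
eligible to fire on a Wednesday afternoon but not on a Saturday.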

Perhaps not in a first iteration, but at some point it may make sense
to expose these more advanced features in the UI.

Cheers,
Eoghan

[1] http://openstack.redhat.com/CeilometerQuickStart

----- Original Message -----
> 
> Hi Liz,
> 
> Looks great!
> 
> Some thoughts on the wireframe doc:
> 
> * The description of form:
> 
>     "If CPU Utilization exceeds 80%, send alarm."
>   
>   misses the time-window aspect of the alarm definition.
> 
>   Whereas the boilerplate default descriptions generated by
>   ceilometer itself:
> 
>     "cpu_util > 70.0 during 3 x 600s"
> 
>   capture this important info.
> 
> * The metric names, e.g. "CPU Utilization", are not an exact
>   match for the meter names used by ceilometer, e.g. "cpu_util".
> 
> * Non-admin users can create alarms in ceilometer:
> 
>   "This is where admins can come in and
>    define and edit any alarms they want
>    the environment to use."
> 
>   (though these alarms will only have visibility onto the stats
>    that would be accessible to the user on behalf of whom the
>    alarm is being evaluated)
> 
> * There's no concept currently of alarm severity.
> 
> * "Should users be able to enable/dis-able alarms."
> 
>   Yes, the API allows for disabled (i.e. non-evaluated) alarms.
> 
> * "Should users be able to own/assign alarms?"
> 
>   Only admin users can create an alarm on behalf of another
>   user/tenant.
> 
> * "Should users be able to acknowledge, close alarms?"
> 
>   No, we have no concept of ACKing an alarm.
> 
> * "Admins can also see a full list of all Alarms that have
>    taken place in the past."
> 
>   In ceilometer terminology, we refer to this as alarm history
>   or alarm change events.
> 
> * "CPU Utilization exceeded 80%."
> 
>   Again good to capture the duration in that description of the
>   event.
> 
> * "Within the Overview section, there should be a new tab that allows the
>    user to click and view all Alarms that have occurred in their
>    environment."
> 
>   Not really sure what "environment" means here. Non-admin tenants only
>   have visibility onto their own alarms, whereas admins have visibility
>   onto all alarms.
> 
> * "This list would keep the latest  alarms."
> 
>   Presumably this would be based on querying the alarm-history API,
>   as opposed to an assumption that Horizon is consuming the actual
>   alarm notifications?
> 
> Cheers,
> Eoghan
> 
> ----- Original Message -----
> > Hi All,
> > 
> > I’ve recently put together a set of wireframes[1] around Alarm Management
> > that would support the following blueprint:
> > https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
> > 
> > If you have a chance it would be great to hear any feedback that folks have
> > on this direction moving forward with Alarms.
> > 
> > Best,
> > Liz
> > 
> > [1]
> > http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf
> > 
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> 

[openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management