[Openstack] [Ceilometer] looking for alarm best practice - please help
Eoghan Glynn
eglynn at redhat.com
Wed Dec 3 12:34:01 UTC 2014
> Hi folks,
>
>
>
> I wonder if anyone could share some best practice regarding the usage of
> Ceilometer alarms. We are using the alarm evaluation/notification of
> ceilometer and we are not entirely happy with the way we use it. Below is
> our problem:
>
>
>
> ============================
>
> Scenario:
>
> When cpu usage or memory usage is above a certain threshold, alerts should
> be displayed on the admin's web page. There should be 3 alert levels
> according to the meter value, namely notice, warning, and fatal. Notice
> means the meter value is between 50% and 70%, warning means between 70% and
> 85%, and fatal means above 85%.
>
> For example:
>
> * when one vm’s cpu usage is 72%, an alert message should be displayed saying
> “Warning: vm[d9b7018b-06c4-4fba-8221-37f67f6c6b8c] cpu usage is above 70%”.
>
> * when one vm’s memory usage is 90%, another alert message should be created
> saying “Fatal: vm[d9b7018b-06c4-4fba-8221-37f67f6c6b8c] memory usage is
> above 85%”
>
>
>
> Our current Solution:
>
> We used ceilometer alarm evaluation/notification to implement this. To
> distinguish which VM and which meter is above what value, we've created one
> alarm per VM per condition. So, to monitor 1 VM, 6 alarms are created
> because there are 2 meters and 3 levels for each meter. That means, if
> there are 100 VMs to be monitored, 600 alarms will be created.
>
>
>
> Problems:
>
> * The first problem is, when the number of meters increases, the number of
> alarms multiplies. For example, customers may want alerts on disk and
> network I/O rates, and if we add those, there will be 4*3=12 alarms for
> each VM.
>
> * The second problem is, when one VM is created, multiple alarms must be
> created, meaning multiple HTTP requests will be fired. In the case above, 6
> HTTP requests are needed whenever a VM is created, and this number also
> increases as the number of meters goes up.
One way of reducing both the number of alarms and the volume of notifications
would be to group related VMs, if such a concept exists in your use-case.
This is effectively how Heat autoscaling uses ceilometer, alarming on the
average of some statistic over a set of instances (as opposed to triggering
on individual instances).
The VMs could be grouped by setting user-metadata of the form:
nova boot ... --meta metering.my_server_group=foobar
Any user-metadata prefixed with 'metering.' will be preserved by ceilometer
in the resource_metadata.user_metadata stored for each sample, so that it
can be used to select the statistics on which the alarm is based, e.g.
ceilometer alarm-threshold-create --name cpu_high_foobar \
  --description 'warning: foobar instance group running hot' \
  --meter-name cpu_util --threshold 70.0 \
  --comparison-operator gt --statistic avg \
  ...
  --query metadata.user_metadata.my_server_group=foobar
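To cover the three severity levels in your scenario, a rough sketch (assuming
bash, and with the webhook URL purely a placeholder for wherever your admin
app accepts alerts) would be to create three such alarms per meter on the
*group* rather than on each VM:

# one alarm per "name:threshold" pair on the foobar group
for level in notice:50 warning:70 fatal:85; do
  name=${level%%:*}; threshold=${level##*:}
  # the --alarm-action URL below is only a placeholder for your admin app
  ceilometer alarm-threshold-create --name cpu_${name}_foobar \
    --meter-name cpu_util --threshold ${threshold} \
    --comparison-operator gt --statistic avg \
    --period 600 --evaluation-periods 1 \
    --alarm-action "http://your-admin-app/alerts?group=foobar&level=${name}" \
    --query metadata.user_metadata.my_server_group=foobar
done

That keeps the alarm count at (meters x levels) per group, independent of how
many instances the group contains, with the level encoded in the alarm name
and webhook URL rather than in a per-VM alarm.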
This approach is of course predicated on there being some natural grouping
relationship between instances in your environment.
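One further sanity check, if useful: before relying on the tag in alarm
queries, you can confirm it is actually reaching the stored samples (again
assuming the standard ceilometer CLI, with 'foobar' as the placeholder group
name):

ceilometer statistics -m cpu_util -p 600 \
  -q metadata.user_metadata.my_server_group=foobar

Non-empty output there means the grouping key is in place and the alarm
queries above should match.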
Cheers,
Eoghan
> =============================
>
>
>
> Does anyone have any suggestions?
>
>
>
>
>
>
>
> Best Regards!
>
> Kurt Rao
>
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack at lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>