[aodh] [heat] Stein: How to create alarms based on rate metrics like CPU utilization?

4 Aug 2019

Prior to Stein, Ceilometer issued a metric named /cpu_util/, which I 
could use to trigger alarms and autoscaling when CPU utilization was too 
high.

cpu_util no longer exists. Instead, we are asked to use Gnocchi's 
/rate/ feature. However, when using rates, alarms on a group of 
resources require more than a single metric name: both an aggregation 
method and a reaggregation method are needed.

For example, for a group of instances that implement "myapp":

gnocchi measures aggregation -m cpu --reaggregation mean --aggregation rate:mean --query server_group=myapp --resource-type instance

Actually, this command uses a deprecated API (though as far as I can 
see, Aodh still uses it). The current equivalent is:

gnocchi aggregates --resource-type instance '(aggregate rate:mean (metric cpu mean))' server_group=myapp

If rate:mean is in the archive policy, it also works the other way around:

gnocchi aggregates --resource-type instance '(aggregate mean (metric cpu rate:mean))' server_group=myapp
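As a rough illustration of the result I expect from the two steps, here is a small Python sketch. It is not Gnocchi's implementation. It assumes that the cpu metric is a cumulative CPU time counter in nanoseconds, that all instances are sampled on the same 60-second grid, and that each instance has one vCPU. The instance names and sample values are made up.

```python
from statistics import mean

NS_PER_SECOND = 1_000_000_000


def rate(samples):
    """Step 1 (per instance): difference between consecutive cumulative samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def group_cpu_util(cpu_ns_by_instance, granularity_s, vcpus=1):
    """Step 2 (across instances): mean of the per-instance rates, as a percentage.

    Assumes every instance has samples on the same time grid (hypothetical data).
    """
    per_instance_rates = [rate(s) for s in cpu_ns_by_instance.values()]
    # Reaggregation: one mean per time slot across all instances.
    mean_rates = [mean(slot) for slot in zip(*per_instance_rates)]
    # Convert nanoseconds of CPU time per interval into percent utilization.
    return [100.0 * r / (granularity_s * NS_PER_SECOND * vcpus) for r in mean_rates]


# Hypothetical cumulative CPU time (ns) for two instances, sampled every 60 s.
samples = {
    "vm-a": [0, 30 * NS_PER_SECOND, 90 * NS_PER_SECOND],
    "vm-b": [0, 60 * NS_PER_SECOND, 90 * NS_PER_SECOND],
}

print(group_cpu_util(samples, granularity_s=60))
```

The order matters: the rate is taken per instance first, and only then are the instances combined. An alarm threshold would be compared against the values of the final list.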

Without reaggregation, I get quite unexpected numbers, including 
negative CPU rates. If you want to understand why, see this discussion 
with one of the Gnocchi maintainers [1].

*My problem*: Aodh allows me to set an aggregation method, but not a 
reaggregation method. How can I create alarms based on rates? The 
problem extends to Heat and autoscaling.

Thanks much,

Bernd.

[1] https://github.com/gnocchixyz/gnocchi/issues/1044

Bernd Bausch
