[Openstack] [Ceilometer/Heat in Havana]: Should autoscaling groups work already?

Eoghan Glynn eglynn at redhat.com
Thu Sep 12 13:32:11 UTC 2013


> Hi,
> 
> Many thanks, that helped. I no longer see those errors in the log.

Cool, further responses below.
 
> However, even though cpu_util metric data gets inserted to the ceilometer
> db with values bigger than the threshold defined for scaling-up in the
> template (template used can be seen here:
> https://bugs.launchpad.net/heat/+bug/1223710), no actions get triggered.
> All I see in the ceilometer-alarm-singleton log are "initiating evaluation
> cycle on 2 alarms" entries. No entries in ceilometer-alarm-notifier log.
> 
> Any ideas / pointers what could be the reason for this...?
> 
> ---clip---
> 2013-09-12 14:50:30.800 17868 INFO ceilometer.alarm.threshold_evaluation
> [-] initiating evaluation cycle on 2 alarms
> 2013-09-12 14:50:30.811 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] evaluating alarm 7676439e-7ed5-4748-80ca-e161083ae8ff evaluate
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:215
> 2013-09-12 14:50:30.816 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] query stats from 2013-09-12 11:49:30.815225 to 2013-09-12
> 11:50:30.815225 _bound_duration
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:99
> 2013-09-12 14:50:30.841 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] stats query [{'field': u'metadata.user_metadata.server_group', 'value':
> u'Group_A', 'op': 'eq'}, {'field': 'timestamp', 'value':
> '2013-09-12T11:50:30.815225', 'op': 'le'}, {'field': 'timestamp', 'value':
> '2013-09-12T11:49:30.815225', 'op': 'ge'}] _statistics
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:120
> 2013-09-12 14:50:30.847 17868 DEBUG ceilometerclient.common.http [-] curl
> -i -X GET -H 'X-Auth-Token:

So this ^^^ is the statistics query used to select cpu_util metrics for
all instances that are part of the autoscaling group - this is done on the
basis of the instance user metadata including an attribute called server_group
that's set to Group_A.
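
To make the selection concrete, here is a minimal Python sketch of how such a
query narrows down samples. It is not ceilometer's actual implementation: the
sample layout (nested dicts with 'timestamp' and 'metadata' keys) and the
helper names lookup and matches are assumptions made purely for illustration.
Only the query itself is taken from the log above.

```python
import operator

# Query terms copied from the log above (unicode prefixes dropped).
QUERY = [
    {'field': 'metadata.user_metadata.server_group', 'value': 'Group_A', 'op': 'eq'},
    {'field': 'timestamp', 'value': '2013-09-12T11:50:30.815225', 'op': 'le'},
    {'field': 'timestamp', 'value': '2013-09-12T11:49:30.815225', 'op': 'ge'},
]

OPS = {'eq': operator.eq, 'le': operator.le, 'ge': operator.ge}


def lookup(sample, dotted):
    """Walk a dotted field name through nested dicts; None if any part is absent."""
    node = sample
    for part in dotted.split('.'):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def matches(sample, query):
    """True only if the sample satisfies every term of the query."""
    for term in query:
        actual = lookup(sample, term['field'])
        if actual is None:
            return False
        # ISO 8601 timestamps in the same format compare correctly as strings.
        if not OPS[term['op']](actual, term['value']):
            return False
    return True


# Hypothetical samples, for illustration only.
IN_GROUP = {'timestamp': '2013-09-12T11:50:00.000000',
            'metadata': {'user_metadata': {'server_group': 'Group_A'}}}
NO_METADATA = {'timestamp': '2013-09-12T11:50:00.000000', 'metadata': {}}
TOO_OLD = {'timestamp': '2013-09-12T11:40:00.000000',
           'metadata': {'user_metadata': {'server_group': 'Group_A'}}}
```

Under these assumptions, only IN_GROUP passes matches(sample, QUERY): a sample
that lacks the server_group user metadata, or that falls outside the
one-minute window, is excluded, which would leave the statistics empty.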

> 2013-09-12 14:50:31.206 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] sanitize stats [] _sanitize
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:111
> 2013-09-12 14:50:31.220 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] pruned statistics to 0 _sanitize

Since the statistics returned are effectively empty, the first thing to check
is whether the existing instance has the correct user metadata set by Heat.

On the basis of the query above, I would expect the instance metadata to
be set as follows:

  {u'metering.server_group': u'Group_A'}

Can you check that Heat has set the appropriate metadata on the instance, with
a simple query like:

  for s in $(nova list | awk -F\| '/ACTIVE/ {print $2}') ; do nova show $s ; done | grep metadata
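
The relationship between the metadata key on the instance and the field in
the alarm query can be sketched as below. This assumes, going only by the two
names above, that instance metadata keys carrying the "metering." prefix
surface in the sample as user_metadata with the prefix removed. The function
names are hypothetical and this is not the actual ceilometer code.

```python
PREFIX = 'metering.'


def user_metadata(instance_metadata):
    """Keep only keys with the assumed 'metering.' prefix, with the prefix stripped."""
    return {
        key[len(PREFIX):]: value
        for key, value in instance_metadata.items()
        if key.startswith(PREFIX)
    }


def in_group(instance_metadata, group):
    """Would a query on metadata.user_metadata.server_group == group match?"""
    return user_metadata(instance_metadata).get('server_group') == group
```

So an instance showing {'metering.server_group': 'Group_A'} would be selected
by the alarm query, whereas one with a bare 'server_group' key, or with no
metadata at all, would not.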


Secondly, looking at the timestamp constraint on the statistics query,
effectively equivalent to:

  timestamp >= 2013-09-12T11:49:30.815225 ; timestamp <= 2013-09-12T11:50:30.815225

we see that the threshold evaluator is looking back over a single minute's
duration. Whether any cpu_util samples are available within that window
depends on the configured cadence for the collection of this meter. This
defaults out-of-the-box to 600s, whereas the alarm needs a sample at least
every 60s. Check its current value via:

  grep -A 1 cpu_pipeline /etc/ceilometer/pipeline.yaml 

The signature of this issue would be the alarm spending 9 of every
10 minutes in the insufficient_data state. This, combined with the Heat
autoscaling cooldown, is likely to cause scale-up to be suppressed.
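
The 9-in-10 figure follows from simple arithmetic, which the rough simulation
below illustrates. It assumes the evaluator runs once a minute with a 60s
look-back window and that samples arrive at a fixed cadence; the function
name, the one-hour horizon and the 30s phase offset are arbitrary choices made
for the sketch.

```python
def evaluation_states(sample_interval, window=60, eval_period=60,
                      horizon=3600, phase=30):
    """Return one state per evaluation cycle over `horizon` seconds."""
    sample_times = range(phase, horizon, sample_interval)
    states = []
    for now in range(eval_period, horizon + 1, eval_period):
        # The evaluator only sees samples with now - window <= t <= now.
        has_data = any(now - window <= t <= now for t in sample_times)
        states.append('data' if has_data else 'insufficient_data')
    return states


default_cadence = evaluation_states(sample_interval=600)
fast_cadence = evaluation_states(sample_interval=60)
```

With a 600s cadence, 54 of the 60 one-minute cycles in the hour find no
sample; with a 60s cadence every cycle has data to evaluate.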

To resolve, simply edit as follows:

  sed -i '/cpu_pipeline/ {
    N
    s/interval: 600$/interval: 60/
  }' /etc/ceilometer/pipeline.yaml

and restart the ceilometer compute agent.
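
If you'd rather try the edit on a copy first, roughly the same substitution
can be sketched in Python. The YAML fragment below is a hypothetical stand-in
for the real pipeline.yaml, and the function name is made up. Like the sed
command, it only touches an "interval: 600" that sits on the line directly
after the one mentioning cpu_pipeline.

```python
import re

# Hypothetical fragment; the real file may be laid out differently.
FRAGMENT = (
    "-\n"
    "    name: meter_pipeline\n"
    "    interval: 600\n"
    "-\n"
    "    name: cpu_pipeline\n"
    "    interval: 600\n"
    "    meters:\n"
    "        - \"cpu\"\n"
)


def shorten_cpu_interval(text):
    """Rewrite 'interval: 600' to 'interval: 60' on the line after cpu_pipeline."""
    lines = text.split('\n')
    for i, line in enumerate(lines[:-1]):
        if 'cpu_pipeline' in line:
            lines[i + 1] = re.sub(r'interval: 600$', 'interval: 60', lines[i + 1])
    return '\n'.join(lines)


UPDATED = shorten_cpu_interval(FRAGMENT)
```

Other pipelines keep their existing interval, since only the cpu_pipeline
entry is matched.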

Cheers,
Eoghan


> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:115
> 2013-09-12 14:50:31.243 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] evaluating alarm 15b78e83-c195-485d-a9f0-fc4d5a1c8fdf evaluate
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:215
> 2013-09-12 14:50:31.252 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] query stats from 2013-09-12 11:49:31.250748 to 2013-09-12
> 11:50:31.250748 _bound_duration
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:99
> 2013-09-12 14:50:31.261 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] stats query [{'field': u'metadata.user_metadata.server_group', 'value':
> u'Group_A', 'op': 'eq'}, {'field': 'timestamp', 'value':
> '2013-09-12T11:50:31.250748', 'op': 'le'}, {'field': 'timestamp', 'value':
> '2013-09-12T11:49:31.250748', 'op': 'ge'}] _statistics
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:120
> ...
> 2013-09-12 14:50:31.391 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] sanitize stats [] _sanitize
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:111
> 2013-09-12 14:50:31.395 17868 DEBUG ceilometer.alarm.threshold_evaluation
> [-] pruned statistics to 0 _sanitize
> /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:115
> --clap--
> 
> Br,
> -Juha
> 
> 
> On 12 September 2013 10:03, Eoghan Glynn <eglynn at redhat.com> wrote:
> 
> >
> >
> > Hi Juha,
> >
> > The problem you're encountering is a known restriction of the sqlalchemy
> > storage driver, which doesn't yet provide the capability to select the
> > statistics for the given Heat autoscaling group on which the scale up/down
> > alarms are based (the so-called metaquery feature).
> >
> > In order for this feature to be present in the ceilometer API service,
> > you'll need to use the mongodb storage driver instead.
> >
> > Thanks,
> > Eoghan
> >
> > ----- Original Message -----
> > > Hi,
> > >
> > > I ran into a problem when trying to use autoscaling groups in heat
> > > templates with havana (see:
> > > https://bugs.launchpad.net/heat/+bug/1223710 )
> > >
> > > Can anyone confirm whether autoscaling should already work with havana?
> > >
> > > Currently the evaluation of the ceilometer alarm/meter data seems to be
> > > failing:
> > >
> > >
> > >
> > > ceilometer-alarm-singleton:
> > > ======================
> > > 2013-09-11 10:16:28.074 5326 INFO ceilometer.alarm.threshold_evaluation
> > [-]
> > > initiating evaluation cycle on 3 alarms
> > > 2013-09-11 10:16:28.108 5326 ERROR ceilometer.alarm.threshold_evaluation
> > [-]
> > > alarm stats retrieval failed
> > > ...
> > > 2013-09-11 10:16:28.108 5326 TRACE ceilometer.alarm.threshold_evaluation
> > File
> > > "/opt/stack/python-ceilometerclient/ceilometerclient/v2/statistics.py",
> > line
> > > 29, in list
> > > 2013-09-11 10:16:28.108 5326 TRACE ceilometer.alarm.threshold_evaluation
> > > '/v2/meters/' + meter_name + '/statistics',
> > > 2013-09-11 10:16:28.108 5326 TRACE ceilometer.alarm.threshold_evaluation
> > > TypeError: cannot concatenate 'str' and 'NoneType' objects
> > >
> > > ceilometer-api:
> > > ===============
> > > 2013-09-11 10:16:28.221 4500 ERROR wsme.api [-] Server-side error:
> > "metaquery
> > > not implemented". Detail:
> > > Traceback (most recent call last):
> > >
> > > File "/usr/local/lib/python2.7/dist-packages/wsmeext/pecan.py", line 70,
> > in
> > > callfunction
> > > result = f(self, *args, **kwargs)
> > >
> > > File "/opt/stack/ceilometer/ceilometer/api/controllers/v2.py", line 693,
> > in
> > > statistics
> > > for c in computed]
> > >
> > > File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line
> > 517,
> > > in get_meter_statistics
> > > query = self._make_stats_query(sample_filter, groupby)
> > >
> > > File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line
> > 468,
> > > in _make_stats_query
> > > return make_query_from_filter(query, sample_filter)
> > >
> > > File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line
> > 137,
> > > in make_query_from_filter
> > > raise NotImplementedError('metaquery not implemented')
> > >
> > > NotImplementedError: metaquery not implemented
> > >
> > >
> > > Many thanks,
> > > -Juha
> > >
> > >
> >
> 



