[Openstack] [Ceilometer/Heat in Havana]: Should autoscaling groups work already?
Juha Tynninen
tyky72 at gmail.com
Fri Sep 13 06:23:55 UTC 2013
Many thanks again.
It seems I was missing the Tags definition for the AutoScalingGroup in the
template:
"Tags" : [ { "Key" : "metering.server_group", "Value" : "Group_A" } ]
Added that and now I can see evaluation occurring and some scaling actions
triggered (some exceptions can be seen in the logs, but I'll continue
investigations).
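
For anyone hitting the same thing, here is a rough sketch of how I understand the tag key to line up with the field the alarm queries. The helper name and the prefix handling are my own illustration, inferred only from the log output and the expected metadata quoted below. It is not Heat or Ceilometer code.

```python
# Hypothetical helper, not part of Heat or Ceilometer: it only mirrors the
# naming visible in this thread (tag key vs. alarm query field).
METERING_PREFIX = "metering."


def alarm_query_field(tag_key):
    """Map a 'metering.'-prefixed tag key to the field name seen in the query."""
    if not tag_key.startswith(METERING_PREFIX):
        # Assumption: keys without the prefix never show up in the query.
        return None
    return "metadata.user_metadata." + tag_key[len(METERING_PREFIX):]


tags = [{"Key": "metering.server_group", "Value": "Group_A"}]
for tag in tags:
    print(alarm_query_field(tag["Key"]), "eq", tag["Value"])
```

With the tag from my template, the printed field matches the one in the "stats query" log entries, which would explain why the statistics came back empty before the tag was added.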
Br,
-Juha
On 12 September 2013 16:32, Eoghan Glynn <eglynn at redhat.com> wrote:
>
> > Hi,
> >
> > Many thanks, that helped. I no longer see those errors in the log.
>
> Cool, further responses below.
>
> > However, even though cpu_util metric data gets inserted to the ceilometer
> > db with values bigger than the threshold defined for scaling-up in the
> > template (template used can be seen here:
> > https://bugs.launchpad.net/heat/+bug/1223710), no actions get triggered.
> > All I see in the ceilometer-alarm-singleton log are "initiating evaluation
> > cycle on 2 alarms" entries. No entries in ceilometer-alarm-notifier log.
> >
> > Any ideas / pointers what could be the reason for this...?
> >
> > ---clip---
> > 2013-09-12 14:50:30.800 17868 INFO ceilometer.alarm.threshold_evaluation
> > [-] initiating evaluation cycle on 2 alarms
> > 2013-09-12 14:50:30.811 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] evaluating alarm 7676439e-7ed5-4748-80ca-e161083ae8ff evaluate
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:215
> > 2013-09-12 14:50:30.816 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] query stats from 2013-09-12 11:49:30.815225 to 2013-09-12
> > 11:50:30.815225 _bound_duration
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:99
> > 2013-09-12 14:50:30.841 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] stats query [{'field': u'metadata.user_metadata.server_group', 'value':
> > u'Group_A', 'op': 'eq'}, {'field': 'timestamp', 'value':
> > '2013-09-12T11:50:30.815225', 'op': 'le'}, {'field': 'timestamp', 'value':
> > '2013-09-12T11:49:30.815225', 'op': 'ge'}] _statistics
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:120
> > 2013-09-12 14:50:30.847 17868 DEBUG ceilometerclient.common.http [-] curl
> > -i -X GET -H 'X-Auth-Token:
>
> So this ^^^ is the statistics query used to select cpu_util metrics for
> all instances that are part of the autoscaling group - this is done on the
> basis of the instance user metadata including an attribute called
> server_group that's set to Group_A.
>
> > 2013-09-12 14:50:31.206 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] sanitize stats [] _sanitize
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:111
> > 2013-09-12 14:50:31.220 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] pruned statistics to 0 _sanitize
>
> Since the statistics returned are effectively empty, the first thing to
> check is whether the existing instance has the correct user metadata set
> by Heat.
>
> On the basis of the query above, I would expect the instance metadata to
> be set as follows:
>
> {u'metering.server_group': u'Group_A'}
>
> Can you check that Heat has the appropriate metadata, with a simple query
> like:
>
> for s in $(nova list | awk -F\| '/ACTIVE/ {print $2}') ; do nova show $s ; done | grep metadata
>
>
> Secondly, looking at the timestamp constraint on the statistics query,
> effectively equivalent to:
>
> timestamp >= 2013-09-12T11:49:30.815225 ; timestamp <= 2013-09-12T11:50:30.815225
>
> we see that the threshold evaluator is looking back over a single
> minute's duration. Whether cpu_util samples are available within that
> window depends on the configured cadence for the collection of this
> meter. This defaults out-of-the-box to 600s, whereas the alarm needs
> samples at least every 60s. Check its current value via:
>
> grep -A 1 cpu_pipeline /etc/ceilometer/pipeline.yaml
>
> The signature of this issue would be the alarm spending 9 of every
> 10 minutes in the insufficient_data state. This, combined with the Heat
> autoscaling cooldown, is likely to cause scale-up to be suppressed.
>
> To resolve, simply edit as follows:
>
> sed -i '/cpu_pipeline/ {
> N
> s/interval: 600$/interval: 60/
> }' /etc/ceilometer/pipeline.yaml
>
> and restart the ceilometer compute agent.
>
> Cheers,
> Eoghan
>
>
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:115
> > 2013-09-12 14:50:31.243 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] evaluating alarm 15b78e83-c195-485d-a9f0-fc4d5a1c8fdf evaluate
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:215
> > 2013-09-12 14:50:31.252 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] query stats from 2013-09-12 11:49:31.250748 to 2013-09-12
> > 11:50:31.250748 _bound_duration
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:99
> > 2013-09-12 14:50:31.261 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] stats query [{'field': u'metadata.user_metadata.server_group', 'value':
> > u'Group_A', 'op': 'eq'}, {'field': 'timestamp', 'value':
> > '2013-09-12T11:50:31.250748', 'op': 'le'}, {'field': 'timestamp', 'value':
> > '2013-09-12T11:49:31.250748', 'op': 'ge'}] _statistics
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:120
> > ...
> > 2013-09-12 14:50:31.391 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] sanitize stats [] _sanitize
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:111
> > 2013-09-12 14:50:31.395 17868 DEBUG ceilometer.alarm.threshold_evaluation
> > [-] pruned statistics to 0 _sanitize
> > /opt/stack/ceilometer/ceilometer/alarm/threshold_evaluation.py:115
> > --clap--
> >
> > Br,
> > -Juha
> >
> >
> > On 12 September 2013 10:03, Eoghan Glynn <eglynn at redhat.com> wrote:
> >
> > >
> > >
> > > Hi Juha,
> > >
> > > The problem you're encountering is a known restriction of the sqlalchemy
> > > storage driver, which doesn't yet provide the capability to select the
> > > statistics for the given Heat autoscaling group on which the scale
> > > up/down alarms are based (the so-called metaquery feature).
> > >
> > > In order for this feature to be present in the ceilometer API service,
> > > you'll need to use the mongodb storage driver instead.
> > >
> > > Thanks,
> > > Eoghan
> > >
> > > ----- Original Message -----
> > > > Hi,
> > > >
> > > > I ran into a problem when trying to use autoscaling groups in heat
> > > > templates with havana (see:
> > > > https://bugs.launchpad.net/heat/+bug/1223710 )
> > > >
> > > > Can anyone confirm whether the autoscaling should already work with
> > > > havana?
> > > >
> > > > Currently the evaluation of the ceilometer alarm/meter data seems to be
> > > > failing:
> > > >
> > > >
> > > >
> > > > ceilometer-alarm-singleton:
> > > > ======================
> > > > 2013-09-11 10:16:28.074 5326 INFO ceilometer.alarm.threshold_evaluation [-]
> > > > initiating evaluation cycle on 3 alarms
> > > > 2013-09-11 10:16:28.108 5326 ERROR ceilometer.alarm.threshold_evaluation [-]
> > > > alarm stats retrieval failed
> > > > ...
> > > > 2013-09-11 10:16:28.108 5326 TRACE ceilometer.alarm.threshold_evaluation File
> > > > "/opt/stack/python-ceilometerclient/ceilometerclient/v2/statistics.py", line
> > > > 29, in list
> > > > 2013-09-11 10:16:28.108 5326 TRACE ceilometer.alarm.threshold_evaluation
> > > > '/v2/meters/' + meter_name + '/statistics',
> > > > 2013-09-11 10:16:28.108 5326 TRACE ceilometer.alarm.threshold_evaluation
> > > > TypeError: cannot concatenate 'str' and 'NoneType' objects
> > > >
> > > > ceilometer-api:
> > > > ===============
> > > > 2013-09-11 10:16:28.221 4500 ERROR wsme.api [-] Server-side error:
> > > > "metaquery not implemented". Detail:
> > > > Traceback (most recent call last):
> > > >
> > > > File "/usr/local/lib/python2.7/dist-packages/wsmeext/pecan.py", line 70, in
> > > > callfunction
> > > > result = f(self, *args, **kwargs)
> > > >
> > > > File "/opt/stack/ceilometer/ceilometer/api/controllers/v2.py", line 693, in
> > > > statistics
> > > > for c in computed]
> > > >
> > > > File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line 517,
> > > > in get_meter_statistics
> > > > query = self._make_stats_query(sample_filter, groupby)
> > > >
> > > > File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line 468,
> > > > in _make_stats_query
> > > > return make_query_from_filter(query, sample_filter)
> > > >
> > > > File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line 137,
> > > > in make_query_from_filter
> > > > raise NotImplementedError('metaquery not implemented')
> > > >
> > > > NotImplementedError: metaquery not implemented
> > > >
> > > >
> > > > Many thanks,
> > > > -Juha
> > > >
> > > > _______________________________________________
> > > > Mailing list:
> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> > > > Post to : openstack at lists.openstack.org
> > > > Unsubscribe :
> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> > > >
> > >
> >
>
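
The collection-interval point in the quoted reply can be illustrated with a small simulation. This is only a sketch under simplifying assumptions of my own: samples land exactly at multiples of the collection interval, the evaluator runs once per window, and each look-back window is treated as half-open. The function name is hypothetical and none of this is Ceilometer code.

```python
def windows_with_data(sample_interval, window, cycles):
    """Count evaluation cycles whose look-back window contains a sample.

    Assumes samples arrive at t = 0, sample_interval, 2 * sample_interval, ...
    and that cycle i looks back over the half-open window (start, end].
    """
    hits = 0
    for i in range(1, cycles + 1):
        end = i * window
        start = end - window
        # Most recent sample at or before the end of this window.
        last_sample = (end // sample_interval) * sample_interval
        if last_sample > start:
            hits += 1
    return hits


# Default 600s collection interval vs. the 60s look-back seen in the logs.
print(windows_with_data(600, 60, 10), "of 10 cycles see data at 600s")
# After changing the interval to 60s as suggested.
print(windows_with_data(60, 60, 10), "of 10 cycles see data at 60s")
```

Under these assumptions only one cycle in ten finds a sample at the default interval, which lines up with the alarm sitting in insufficient_data for 9 of every 10 minutes.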