Open Stack

Tue Feb 25 16:43:40 UTC 2014

> Hi,
> 
> >>>>The simplest explanation is that you're seeing a slight offset
> >>>>between the cadence of cpu_util gathering (15s) and the less rapid
> >>>>alarm evaluation interval (30s, right?).
> 
> You mean evaluation_interval in ceilometer.conf?

Yes, in the alarm section of the /etc/ceilometer/ceilometer.conf:

  [alarm]
  evaluation_interval = 15

> That's set to 15. So as a
> summary, the different periods currently in use:
> 
> ceilometer.conf in controller:
> - evaluation_interval=15

OK, I thought that was 30s from your earlier mails.

No matter, an offset effect (albeit smaller) can still occur.

> pipeline.yaml on compute nodes:
> - name: cpu_pipeline
>   interval: 15
> 
> OS::Ceilometer::Alarm
> - period 30, evaluation periods 1

So every 15s, the alarm is evaluated with a sliding time window of 30s
into the past.

So at time T, the alarm evaluator looks at the statistics over the duration:

  (T-30s,T)

then 15s later, it looks at:

  (T-15s,T+15s)

then a further 15s later:

  (T,T+30s)

etc.
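
To make the sliding window concrete, here is a minimal Python sketch of the
model described above. It is not the actual evaluator code: the function name
and the choice of T are mine, and it assumes a single evaluation period, an
evaluation_interval of 15 and an alarm period of 30.

```python
from datetime import datetime, timedelta

def evaluation_windows(first_eval, evaluation_interval=15, period=30, cycles=3):
    """Return the (start, end) window queried on each evaluation cycle.

    Illustrative only: assumes one evaluation period, so each cycle looks
    back exactly `period` seconds from the moment it runs.
    """
    windows = []
    for n in range(cycles):
        end = first_eval + timedelta(seconds=n * evaluation_interval)
        windows.append((end - timedelta(seconds=period), end))
    return windows

# T is an arbitrary instant chosen for illustration.
T = datetime(2014, 2, 25, 11, 10, 0)
for start, end in evaluation_windows(T):
    print(start.time(), "->", end.time())
```

Consecutive windows overlap by 15s, so the same sample can contribute to two
successive evaluations.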

> AWS::AutoScaling::ScalingPolicy
> - Cooldown: 60
> 
> I'm still a little bit lost what kind of side effects these different
> periods may have with each other.
> But would you have any suggestions how I could fine tune my values so that
> I would eliminate the

If the semantics of Heat autoscaling cooldown are identical to the analogous
AWS concept (i.e. the period after a scaling activity ends and before another
scaling activity can start) then simply increasing the cooldown period will
have the effect you seek (i.e. the avoidance of the second questionable scale-up
action with the launch of instance GroupA-2).
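
As a rough illustration of that cooldown semantic, here is a sketch under the
AWS-style assumption above. It is not Heat's implementation: the helper name
is hypothetical, and it treats the instance launch times from your earlier
mail as the moments the scaling activities ended.

```python
from datetime import datetime

def allowed_by_cooldown(last_action, candidate, cooldown_seconds):
    """True if `candidate` falls outside the cooldown started by `last_action`.

    Assumes the cooldown clock starts at the previous scaling action.
    """
    return (candidate - last_action).total_seconds() >= cooldown_seconds

group_a_1 = datetime(2014, 2, 25, 11, 9, 57)   # first scale-up launch
group_a_2 = datetime(2014, 2, 25, 11, 11, 12)  # second, unexpected launch

gap = (group_a_2 - group_a_1).total_seconds()
print(gap)  # seconds between the two launches
for cooldown in (60, 120):
    print(cooldown, allowed_by_cooldown(group_a_1, group_a_2, cooldown))
```

The two launches are 75s apart, so under this model a 60s cooldown lets the
second one through, whereas a longer value such as 120s (an arbitrary example,
not a recommendation) would have suppressed it.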

> possibility of extra instances to be started or that bumping the instance
> count continuously up and down occurs.
> I'm just trying to get upscaling /downscaling to occur in a controlled
> manner (fast if possible).
> 
> Also one other question: When you check the stats with...
> ceilometer statistics -m cpu_util -q metadata.user_metadata.server_group=Group_A -p 30
> ...command, with what value should -p match to? The period in
> OS::Ceilometer::Alarm resource?

Yes, as the intent is to replicate the stats query issued by the
alarm evaluator.
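
For illustration, the aggregation behind that query can be approximated by
hand. The sketch below is plain Python, not the ceilometer API: the helper
name is mine and the half-open period boundary is an assumption. It averages
the four raw cpu_util samples from your earlier mail that fall in the
11:10:28 to 11:10:58 period and compares the result with the thresholds in
your template.

```python
from datetime import datetime

# (timestamp, cpu_util) samples copied from the earlier mail:
# two from Group_A-0 and two from Group_A-1.
samples = [
    (datetime(2014, 2, 25, 11, 10, 41), 118.4),
    (datetime(2014, 2, 25, 11, 10, 56), 120.866666667),
    (datetime(2014, 2, 25, 11, 10, 36), 17.0),
    (datetime(2014, 2, 25, 11, 10, 51), 18.6666666667),
]

def period_average(samples, start, end):
    """Average the sample values whose timestamp lies in [start, end).

    Returns None when no sample falls inside the period.
    """
    values = [v for ts, v in samples if start <= ts < end]
    if not values:
        return None
    return sum(values) / len(values)

avg = period_average(
    samples,
    datetime(2014, 2, 25, 11, 10, 28),
    datetime(2014, 2, 25, 11, 10, 58),
)
print(round(avg, 4))       # compare with the Avg column reported for this period
print(avg > 90, avg < 30)  # high-watermark (gt 90), low-watermark (lt 30)
```

If the hand-computed average agrees with the Avg column for the same period,
the -p value is matching what the alarm evaluator sees.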

> >>> $ ceilometer alarm-list | grep GroupA
> >>>  $ ceilometer alarm-history -a $ALARM_ID
> 
> I must have some older ceilometer client installed, mine doesn't know
> alarm-history options.

Alarm history has been supported by the client since 1.0.6:

  https://pypi.python.org/pypi/python-ceilometerclient/1.0.6

> Also alarm-list output doesn't have Group name in its output.

Can you copy the alarm-list output (after you upgrade your CLI)?

/Eoghan

> Thanks again. I really appreciate all your help.
> 
> Br,
> -Juha
> 
> 
> On 25 February 2014 15:22, Eoghan Glynn <eglynn at redhat.com> wrote:
> 
> >
> > K, eglynn deploys Occam's razor ...
> >
> > The simplest explanation is that you're seeing a slight offset
> > between the cadence of cpu_util gathering (15s) and the less rapid
> > alarm evaluation interval (30s, right?).
> >
> > These periods are not in lock-step.
> >
> > Let's interleave the statistics with the instance launch times:
> >
> > | 30     | 2014-02-25T11:09:28 | 2014-02-25T11:09:58 | 2     | 118.333333333 | 120.133333333 | 238.466666667 | 119.233333333 | 15.0     | 2014-02-25T11:09:41 | 2014-02-25T11:09:56 |
> >
> > Group_A-1: 2014-02-25T11:09:57.000000
> > Should it have occurred: yes
> > The average is 119.233333333, calculated from two samples from the same
> > original instance.
> >
> > | 30     | 2014-02-25T11:09:58 | 2014-02-25T11:10:28 | 3     | 34.3333333333 | 120.133333333 | 274.266666667 | 91.4222222222 | 15.0     | 2014-02-25T11:10:11 | 2014-02-25T11:10:26 |
> > | 30     | 2014-02-25T11:10:28 | 2014-02-25T11:10:58 | 4     | 17.0          | 120.866666667 | 274.933333333 | 68.7333333333 | 20.0     | 2014-02-25T11:10:36 | 2014-02-25T11:10:56 |
> >
> > Group_A-2: 2014-02-25T11:11:12.000000
> > Should it have occurred: arguable
> > The average is now 68.7333333333 calculated from two samples each from the
> > original and scale-up instance.
> > So the low-watermark alarm will not have fired (as 68.7333333333 >= 30),
> > hence scale-down actions should not have occurred.
> > The high-watermark alarm will shortly revert to OK, but since the
> > alarm evaluation period is presumably 30s, that may take a few more seconds
> > before that alarm state transition occurs and is reported to heat.
> >
> > | 30     | 2014-02-25T11:10:58 | 2014-02-25T11:11:28 | 4     | 16.3333333333 | 109.666666667 | 234.466666667 | 58.6166666667 | 20.0     | 2014-02-25T11:11:06 | 2014-02-25T11:11:26 |
> > | 30     | 2014-02-25T11:11:28 | 2014-02-25T11:11:58 | 6     | 13.5333333333 | 107.4         | 280.133333333 | 46.6888888889 | 20.0     | 2014-02-25T11:11:36 | 2014-02-25T11:11:56 |
> > ...
> >
> > You can confirm that the above sequence of events occurred by looking
> > at the relevant alarm history:
> >
> >   $ ceilometer alarm-list | grep GroupA
> >   $ ceilometer alarm-history -a $ALARM_ID
> >
> > /Eoghan
> >
> >
> > ----- Original Message -----
> > > Hi,
> > >
> > > Here are the statistics:
> > >
> > > Launch times for VM instances (didn't expect Group_A-2 to be started):
> > > Group_A-0: 2014-02-25T11:05:31.000000
> > > Group_A-1: 2014-02-25T11:09:57.000000
> > > Group_A-2: 2014-02-25T11:11:12.000000
> > >
> > >
> > > +--------+---------------------+---------------------+-------+---------------+---------------+---------------+---------------+----------+---------------------+---------------------+
> > > | Period | Period Start        | Period End          | Count | Min           | Max           | Sum           | Avg           | Duration | Duration Start      | Duration End        |
> > > +--------+---------------------+---------------------+-------+---------------+---------------+---------------+---------------+----------+---------------------+---------------------+
> > > ...
> > > | 30     | 2014-02-25T11:05:28 | 2014-02-25T11:05:58 | 1     | 34.2666666667 | 34.2666666667 | 34.2666666667 | 34.2666666667 | 0.0      | 2014-02-25T11:05:56 | 2014-02-25T11:05:56 |
> > > | 30     | 2014-02-25T11:05:58 | 2014-02-25T11:06:28 | 2     | 16.2          | 18.4666666667 | 34.6666666667 | 17.3333333333 | 15.0     | 2014-02-25T11:06:11 | 2014-02-25T11:06:26 |
> > > | 30     | 2014-02-25T11:06:28 | 2014-02-25T11:06:58 | 2     | 16.8          | 19.5333333333 | 36.3333333333 | 18.1666666667 | 15.0     | 2014-02-25T11:06:41 | 2014-02-25T11:06:56 |
> > > | 30     | 2014-02-25T11:06:58 | 2014-02-25T11:07:28 | 2     | 16.0          | 16.6          | 32.6          | 16.3          | 15.0     | 2014-02-25T11:07:11 | 2014-02-25T11:07:26 |
> > > | 30     | 2014-02-25T11:07:28 | 2014-02-25T11:07:58 | 2     | 18.0666666667 | 19.0666666667 | 37.1333333333 | 18.5666666667 | 15.0     | 2014-02-25T11:07:41 | 2014-02-25T11:07:56 |
> > > | 30     | 2014-02-25T11:07:58 | 2014-02-25T11:08:28 | 2     | 19.6666666667 | 19.7333333333 | 39.4          | 19.7          | 15.0     | 2014-02-25T11:08:11 | 2014-02-25T11:08:26 |
> > > | 30     | 2014-02-25T11:08:28 | 2014-02-25T11:08:58 | 2     | 31.9333333333 | 120.2         | 152.133333333 | 76.0666666667 | 15.0     | 2014-02-25T11:08:41 | 2014-02-25T11:08:56 |
> > > | 30     | 2014-02-25T11:08:58 | 2014-02-25T11:09:28 | 2     | 118.866666667 | 120.066666667 | 238.933333333 | 119.466666667 | 15.0     | 2014-02-25T11:09:11 | 2014-02-25T11:09:26 |
> > > | 30     | 2014-02-25T11:09:28 | 2014-02-25T11:09:58 | 2     | 118.333333333 | 120.133333333 | 238.466666667 | 119.233333333 | 15.0     | 2014-02-25T11:09:41 | 2014-02-25T11:09:56 |
> > > | 30     | 2014-02-25T11:09:58 | 2014-02-25T11:10:28 | 3     | 34.3333333333 | 120.133333333 | 274.266666667 | 91.4222222222 | 15.0     | 2014-02-25T11:10:11 | 2014-02-25T11:10:26 |
> > > | 30     | 2014-02-25T11:10:28 | 2014-02-25T11:10:58 | 4     | 17.0          | 120.866666667 | 274.933333333 | 68.7333333333 | 20.0     | 2014-02-25T11:10:36 | 2014-02-25T11:10:56 |
> > > | 30     | 2014-02-25T11:10:58 | 2014-02-25T11:11:28 | 4     | 16.3333333333 | 109.666666667 | 234.466666667 | 58.6166666667 | 20.0     | 2014-02-25T11:11:06 | 2014-02-25T11:11:26 |
> > > | 30     | 2014-02-25T11:11:28 | 2014-02-25T11:11:58 | 6     | 13.5333333333 | 107.4         | 280.133333333 | 46.6888888889 | 20.0     | 2014-02-25T11:11:36 | 2014-02-25T11:11:56 |
> > > | 30     | 2014-02-25T11:11:58 | 2014-02-25T11:12:28 | 6     | 14.5333333333 | 107.4         | 275.0         | 45.8333333333 | 20.0     | 2014-02-25T11:12:06 | 2014-02-25T11:12:26 |
> > > | 30     | 2014-02-25T11:12:28 | 2014-02-25T11:12:58 | 6     | 13.7333333333 | 107.266666667 | 278.2         | 46.3666666667 | 20.0     | 2014-02-25T11:12:36 | 2014-02-25T11:12:56 |
> > > | 30     | 2014-02-25T11:12:58 | 2014-02-25T11:13:28 | 6     | 13.0666666667 | 107.466666667 | 277.666666667 | 46.2777777778 | 20.0     | 2014-02-25T11:13:06 | 2014-02-25T11:13:26 |
> > > | 30     | 2014-02-25T11:13:28 | 2014-02-25T11:13:58 | 6     | 13.6          | 106.866666667 | 269.129166667 | 44.8548611111 | 21.0     | 2014-02-25T11:13:36 | 2014-02-25T11:13:57 |
> > > | 30     | 2014-02-25T11:13:58 | 2014-02-25T11:14:28 | 6     | 14.2666666667 | 113.928571429 | 276.286904762 | 46.0478174603 | 21.0     | 2014-02-25T11:14:06 | 2014-02-25T11:14:27 |
> > > | 30     | 2014-02-25T11:14:28 | 2014-02-25T11:14:58 | 6     | 13.3333333333 | 116.0         | 280.2125      | 46.7020833333 | 21.0     | 2014-02-25T11:14:36 | 2014-02-25T11:14:57 |
> > > | 30     | 2014-02-25T11:14:58 | 2014-02-25T11:15:28 | 6     | 13.7333333333 | 108.133333333 | 279.708333333 | 46.6180555556 | 21.0     | 2014-02-25T11:15:06 | 2014-02-25T11:15:27 |
> > > | 30     | 2014-02-25T11:15:28 | 2014-02-25T11:15:58 | 6     | 13.5333333333 | 108.666666667 | 278.866666667 | 46.4777777778 | 21.0     | 2014-02-25T11:15:36 | 2014-02-25T11:15:57 |
> > > | 30     | 2014-02-25T11:15:58 | 2014-02-25T11:16:28 | 6     | 13.5333333333 | 106.466666667 | 276.933333333 | 46.1555555556 | 21.0     | 2014-02-25T11:16:06 | 2014-02-25T11:16:27 |
> > > | 30     | 2014-02-25T11:16:28 | 2014-02-25T11:16:58 | 6     | 13.3333333333 | 106.466666667 | 276.8         | 46.1333333333 | 21.0     | 2014-02-25T11:16:36 | 2014-02-25T11:16:57 |
> > > | 30     | 2014-02-25T11:16:58 | 2014-02-25T11:17:28 | 6     | 13.8666666667 | 105.733333333 | 277.666666667 | 46.2777777778 | 21.0     | 2014-02-25T11:17:06 | 2014-02-25T11:17:27 |
> > > | 30     | 2014-02-25T11:17:28 | 2014-02-25T11:17:58 | 6     | 13.5333333333 | 106.933333333 | 277.666666667 | 46.2777777778 | 21.0     | 2014-02-25T11:17:36 | 2014-02-25T11:17:57 |
> > > ...
> > >
> > > Br,
> > > -Juha
> > >
> > >
> > > On 25 February 2014 13:53, Eoghan Glynn <eglynn at redhat.com> wrote:
> > >
> > > >
> > > > Juha,
> > > >
> > > > What are the actual average cpu_util stats for those periods
> > > > in which scaling occurred or did not occur contrary to your
> > > > expectations?
> > > >
> > > > I mean, as reported by the ceilometer API, as opposed to being
> > > > totted up manually:
> > > >
> > > >   $ ceilometer statistics -m cpu_util -q metadata.user_metadata.server_group=GroupA -p 30
> > > >
> > > > Cheers,
> > > > Eoghan
> > > >
> > > > ----- Original Message -----
> > > > > Hi,
> > > > >
> > > > > Many thanks again.
> > > > >
> > > > > I fine tuned a little bit the heat template having now defined:
> > > > >
> > > > > Scale up: period 30, threshold 90, comparison_operator gt, statistic avg,
> > > > > cooldown 60
> > > > > Scale down: period 30, threshold 30, comparison_operator lt, statistic avg,
> > > > > cooldown 60
> > > > >
> > > > > Workflow:
> > > > > - create stack -> GroupA-0 instance is started
> > > > > - generate load inside GroupA-0 -> cpu_util counter increases to >100%
> > > > > - Group_A-1 instance gets automatically started and after a short while
> > > > >   also Group_A-2
> > > > > - Situation remains the same, 3 instances keep on running, no down
> > > > >   scaling occurs.
> > > > >
> > > > > According to current thresholds down scaling shouldn't occur any longer
> > > > > since:
> > > > > 106.933333333 + 18.3333333333 + 13.5333333333 = ~139. And 139 / 3 = ~46
> > > > >
> > > > > ...but I don't now see why the Group_A-2 was started in the first place.
> > > > >
> > > > > cpu_util counters received:
> > > > >
> > > > > Group_A-0:
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 34.2666666667 | % | 2014-02-25T11:05:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 16.2 | % | 2014-02-25T11:06:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 18.4666666667 | % | 2014-02-25T11:06:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 19.5333333333 | % | 2014-02-25T11:06:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 16.8 | % | 2014-02-25T11:06:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 16.6 | % | 2014-02-25T11:07:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 16.0 | % | 2014-02-25T11:07:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 19.0666666667 | % | 2014-02-25T11:07:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 18.0666666667 | % | 2014-02-25T11:07:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 19.6666666667 | % | 2014-02-25T11:08:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 19.7333333333 | % | 2014-02-25T11:08:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 31.9333333333 | % | 2014-02-25T11:08:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 120.2 | % | 2014-02-25T11:08:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 118.866666667 | % | 2014-02-25T11:09:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 120.066666667 | % | 2014-02-25T11:09:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 120.133333333 | % | 2014-02-25T11:09:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 118.333333333 | % | 2014-02-25T11:09:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 119.8 | % | 2014-02-25T11:10:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 120.133333333 | % | 2014-02-25T11:10:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 118.4 | % | 2014-02-25T11:10:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 120.866666667 | % | 2014-02-25T11:10:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 109.666666667 | % | 2014-02-25T11:11:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 89.6 | % | 2014-02-25T11:11:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 107.4 | % | 2014-02-25T11:11:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.333333333 | % | 2014-02-25T11:11:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 103.533333333 | % | 2014-02-25T11:12:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 107.4 | % | 2014-02-25T11:12:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 105.8 | % | 2014-02-25T11:12:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 107.266666667 | % | 2014-02-25T11:12:56 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 107.466666667 | % | 2014-02-25T11:13:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 105.066666667 | % | 2014-02-25T11:13:26 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.866666667 | % | 2014-02-25T11:13:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 102.0625 | % | 2014-02-25T11:13:57 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 113.928571429 | % | 2014-02-25T11:14:11 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 97.625 | % | 2014-02-25T11:14:27 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 116.0 | % | 2014-02-25T11:14:41 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 100.8125 | % | 2014-02-25T11:14:57 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 108.133333333 | % | 2014-02-25T11:15:12 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 105.4 | % | 2014-02-25T11:15:27 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 105.533333333 | % | 2014-02-25T11:15:42 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 108.666666667 | % | 2014-02-25T11:15:57 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.466666667 | % | 2014-02-25T11:16:12 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.133333333 | % | 2014-02-25T11:16:27 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.4 | % | 2014-02-25T11:16:42 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.466666667 | % | 2014-02-25T11:16:57 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 105.733333333 | % | 2014-02-25T11:17:12 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 105.466666667 | % | 2014-02-25T11:17:27 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 103.666666667 | % | 2014-02-25T11:17:42 |
> > > > > | a7e07a40-45fe-48d2-81c5-412435708c98 | cpu_util | gauge | 106.933333333 | % | 2014-02-25T11:17:57 |
> > > > > Group_A-1
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 34.3333333333 | % | 2014-02-25T11:10:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 17.0 | % | 2014-02-25T11:10:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.6666666667 | % | 2014-02-25T11:10:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.8666666667 | % | 2014-02-25T11:11:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.3333333333 | % | 2014-02-25T11:11:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.4666666667 | % | 2014-02-25T11:11:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.0666666667 | % | 2014-02-25T11:11:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.5333333333 | % | 2014-02-25T11:12:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.4 | % | 2014-02-25T11:12:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 17.9333333333 | % | 2014-02-25T11:12:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.6666666667 | % | 2014-02-25T11:12:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 20.1333333333 | % | 2014-02-25T11:13:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 17.2 | % | 2014-02-25T11:13:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.4 | % | 2014-02-25T11:13:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.5333333333 | % | 2014-02-25T11:13:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.6666666667 | % | 2014-02-25T11:14:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 17.4666666667 | % | 2014-02-25T11:14:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.8 | % | 2014-02-25T11:14:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.4 | % | 2014-02-25T11:14:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.3333333333 | % | 2014-02-25T11:15:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.7333333333 | % | 2014-02-25T11:15:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 17.8 | % | 2014-02-25T11:15:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.3333333333 | % | 2014-02-25T11:15:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.4 | % | 2014-02-25T11:16:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.8666666667 | % | 2014-02-25T11:16:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 16.8 | % | 2014-02-25T11:16:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 20.2666666667 | % | 2014-02-25T11:16:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 19.4 | % | 2014-02-25T11:17:06 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.6 | % | 2014-02-25T11:17:21 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 20.2 | % | 2014-02-25T11:17:36 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.8666666667 | % | 2014-02-25T11:17:51 |
> > > > > | ec1d377e-fd24-4246-8099-e558858d2cf4 | cpu_util | gauge | 18.3333333333 | % | 2014-02-25T11:18:06 |
> > > > > Group_A-2:
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.5333333333 | % | 2014-02-25T11:11:41 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.3333333333 | % | 2014-02-25T11:11:56 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.5333333333 | % | 2014-02-25T11:12:11 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.6 | % | 2014-02-25T11:12:26 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.8 | % | 2014-02-25T11:12:41 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.7333333333 | % | 2014-02-25T11:12:56 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.0666666667 | % | 2014-02-25T11:13:11 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.7333333333 | % | 2014-02-25T11:13:26 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.6 | % | 2014-02-25T11:13:41 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.6666666667 | % | 2014-02-25T11:13:56 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.2666666667 | % | 2014-02-25T11:14:11 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.3333333333 | % | 2014-02-25T11:14:26 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.3333333333 | % | 2014-02-25T11:14:41 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.8666666667 | % | 2014-02-25T11:14:56 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.7333333333 | % | 2014-02-25T11:15:11 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.375 | % | 2014-02-25T11:15:27 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.0 | % | 2014-02-25T11:15:42 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.5333333333 | % | 2014-02-25T11:15:57 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.5333333333 | % | 2014-02-25T11:16:12 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.5333333333 | % | 2014-02-25T11:16:27 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.3333333333 | % | 2014-02-25T11:16:42 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.5333333333 | % | 2014-02-25T11:16:57 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.8666666667 | % | 2014-02-25T11:17:12 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.6 | % | 2014-02-25T11:17:27 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 14.4666666667 | % | 2014-02-25T11:17:42 |
> > > > > | 2505efaf-0656-44a3-8dad-f694251980fd | cpu_util | gauge | 13.5333333333 | % | 2014-02-25T11:17:57 |
> > > > > Br,
> > > > > -Juha
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 25 February 2014 11:22, Eoghan Glynn <eglynn at redhat.com> wrote:
> > > > >
> > > > > >
> > > > > > Juha,
> > > > > >
> > > > > > What is the actual cpu_util trend looking like about the time
> > > > > > upscaling occurs?
> > > > > >
> > > > > > In the original template you provided, the cooldown period was set
> > > > > > so as to be quite short (IIRC, 20s).
> > > > > >
> > > > > > So if your artificial load on the first instance drives the cpu_util
> > > > > > above the high-water-mark alarm threshold, e.g. to say 91%, then the
> > > > > > newly launched instance has little load to contend with, giving an
> > > > > > average cpu_util of the instance group of ~46%, then the continual
> > > > > > scale-up/scale-down thrashing that you see is just autoscaling doing
> > > > > > exactly what you've told it to do.
> > > > > >
> > > > > > To avoid this, you'll need to:
> > > > > >
> > > > > > * ensure that the "load" is spread across the current instance group
> > > > > >   members in a roughly fair distribution (this is often achieved in
> > > > > >   practice using a load balancer randomizing or round-robining)
> > > > > >
> > > > > > * increase the cooldown period to allow the load distribution to
> > > > > >   "settle" after a scaling operation has taken place
> > > > > >
> > > > > > * ensure that the low-water-mark alarm threshold is sufficiently
> > > > > >   distant from that of the high-water-mark alarm
> > > > > >
> > > > > > Cheers,
> > > > > > Eoghan
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > > Hi,
> > > > > > >
> > > > > > > Some update... Yesterday I added a "repeat_actions" : true definition to
> > > > > > > the OS::Ceilometer::Alarm resources in the Heat template:
> > > > > > >
> > > > > > > "CPUAlarmHigh": {
> > > > > > > "Type": "OS::Ceilometer::Alarm",
> > > > > > > "Properties": {
> > > > > > > "description": "Scale-up if CPU is greater than 90% for 30 seconds",
> > > > > > > "meter_name": "cpu_util",
> > > > > > > "statistic": "avg",
> > > > > > > "period": "30",
> > > > > > > "evaluation_periods": "1",
> > > > > > > "threshold": "90",
> > > > > > > "alarm_actions":
> > > > > > > [ {"Fn::GetAtt": ["ScaleUpPolicy", "AlarmUrl"]} ],
> > > > > > > "matching_metadata":
> > > > > > > {"metadata.user_metadata.server_group": "Group_A" },
> > > > > > > "comparison_operator": "gt",
> > > > > > > "repeat_actions" : true
> > > > > > > }
> > > > > > > },
> > > > > > >
> > > > > > > "CPUAlarmLow": {
> > > > > > > "Type": "OS::Ceilometer::Alarm",
> > > > > > > "Properties": {
> > > > > > > "description": "Scale-down if CPU is less than 50% for 30 seconds",
> > > > > > > "meter_name": "cpu_util",
> > > > > > > "statistic": "avg",
> > > > > > > "period": "30",
> > > > > > > "evaluation_periods": "1",
> > > > > > > "threshold": "50",
> > > > > > > "alarm_actions":
> > > > > > > [ {"Fn::GetAtt": ["ScaleDownPolicy", "AlarmUrl"]} ],
> > > > > > > "matching_metadata":
> > > > > > > {"metadata.user_metadata.server_group": "Group_A" },
> > > > > > > "comparison_operator": "lt",
> > > > > > > "repeat_actions" : true
> > > > > > > }
> > > > > > > }
> > > > > > >
> > > > > > > ...and everything seemed to work fine. But now I just created a stack again
> > > > > > > and generated some load inside the first VM started. Scaling up occurred,
> > > > > > > but after that the system is now continuously scaling the VMs up and down
> > > > > > > even though the load situation doesn't change. It seems the "repeat_actions"
> > > > > > > definitions didn't help after all...
> > > > > > >
> > > > > > > Br,
> > > > > > > -Juha
> > > > > > >
> > > > > > >
> > > > > > > On 25 February 2014 00:27, Steven Dake < sdake at redhat.com > wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Juha,
> > > > > > >
> > > > > > > Copying Angus so he sees. He wrote a big majority of the ceilometer + heat
> > > > > > > integration and might have a better idea of the details of the problem you
> > > > > > > face.
> > > > > > >
> > > > > > >
> > > > > > > On 02/24/2014 01:27 AM, Juha Tynninen wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I'm having some problems concerning auto scaling feature.
> > > > > > > Any ideas?
> > > > > > >
> > > > > > > At first, scaling up and down works just fine. But when tested later on,
> > > > > > > scaling down/up no longer works properly.
> > > > > > > Scaling down may occur even if it shouldn't, or scaling up doesn't occur even
> > > > > > > if it should. When, in this situation, I remove all the
> > > > > > > received metric data from the DB, auto scaling starts to work again.
> > > > > > >
> > > > > > > Ceilometer is configured to use Mongo and the auto scaling is based on the
> > > > > > > cpu_util metrics.
> > > > > > >
> > > > > > > Related configurations:
> > > > > > > -----------------------
> > > > > > > /etc/ceilometer/pipeline.yaml on compute nodes:
> > > > > > >
> > > > > > > name: cpu_pipeline
> > > > > > > interval: 15
> > > > > > >
> > > > > > > /etc/ceilometer/ceilometer.conf on controller:
> > > > > > > evaluation_interval=15
> > > > > > >
> > > > > > > Heat template used:
> > > > > > > -------------------
> > > > > > > "Resources" : {
> > > > > > >
> > > > > > > "Group_A" : {
> > > > > > > "Type" : "AWS::AutoScaling::AutoScalingGroup",
> > > > > > > "Properties" : {
> > > > > > > "AvailabilityZones" : { "Fn::GetAZs" : ""},
> > > > > > > "LaunchConfigurationName" : { "Ref" : "Group_A_Config" },
> > > > > > > "MinSize" : "1",
> > > > > > > "MaxSize" : "3",
> > > > > > > "Tags" : [
> > > > > > > { "Key" : "metering.server_group", "Value" : "Group_A" },
> > > > > > > { "Key" : "custom_metadata", "Value" : "test" }
> > > > > > > ],
> > > > > > > "VPCZoneIdentifier" : [ { "Ref" : "PrivateSubnetId" } ]
> > > > > > > }
> > > > > > > },
> > > > > > >
> > > > > > > "Group_A_Config" : {
> > > > > > > "Type" : "AWS::AutoScaling::LaunchConfiguration",
> > > > > > > "Properties": {
> > > > > > > "ImageId" : { "Ref" : "ImageId" },
> > > > > > > "InstanceType" : { "Ref" : "InstanceType" },
> > > > > > > "KeyName" : { "Ref" : "KeyName" }
> > > > > > > }
> > > > > > > },
> > > > > > >
> > > > > > > "ScaleUpPolicy" : {
> > > > > > > "Type" : "AWS::AutoScaling::ScalingPolicy",
> > > > > > > "Properties" : {
> > > > > > > "AdjustmentType" : "ChangeInCapacity",
> > > > > > > "AutoScalingGroupName" : { "Ref" : "Group_A" },
> > > > > > > "Cooldown" : "20",
> > > > > > > "ScalingAdjustment" : "1"
> > > > > > > }
> > > > > > > },
> > > > > > >
> > > > > > > "ScaleDownPolicy" : {
> > > > > > > "Type" : "AWS::AutoScaling::ScalingPolicy",
> > > > > > > "Properties" : {
> > > > > > > "AdjustmentType" : "ChangeInCapacity",
> > > > > > > "AutoScalingGroupName" : { "Ref" : "Group_A" },
> > > > > > > "Cooldown" : "20",
> > > > > > > "ScalingAdjustment" : "-1"
> > > > > > > }
> > > > > > > },
> > > > > > >
> > > > > > > "CPUAlarmHigh": {
> > > > > > > "Type": "OS::Ceilometer::Alarm",
> > > > > > > "Properties": {
> > > > > > > "description": "Scale-up if CPU is greater than 90% for 20 seconds",
> > > > > > > "meter_name": "cpu_util",
> > > > > > > "statistic": "avg",
> > > > > > > "period": "20",
> > > > > > > "evaluation_periods": "1",
> > > > > > > "threshold": "90",
> > > > > > > "alarm_actions":
> > > > > > > [ {"Fn::GetAtt": ["ScaleUpPolicy", "AlarmUrl"]} ],
> > > > > > > "matching_metadata":
> > > > > > > {"metadata.user_metadata.server_group": "Group_A" },
> > > > > > > "comparison_operator": "gt"
> > > > > > > }
> > > > > > > },
> > > > > > >
> > > > > > > "CPUAlarmLow": {
> > > > > > > "Type": "OS::Ceilometer::Alarm",
> > > > > > > "Properties": {
> > > > > > > "description": "Scale-down if CPU is less than 50% for 20 seconds",
> > > > > > > "meter_name": "cpu_util",
> > > > > > > "statistic": "avg",
> > > > > > > "period": "20",
> > > > > > > "evaluation_periods": "1",
> > > > > > > "threshold": "50",
> > > > > > > "alarm_actions":
> > > > > > > [ {"Fn::GetAtt": ["ScaleDownPolicy", "AlarmUrl"]} ],
> > > > > > > "matching_metadata":
> > > > > > > {"metadata.user_metadata.server_group": "Group_A" },
> > > > > > > "comparison_operator": "lt"
> > > > > > > }
> > > > > > >
> > > > > > > In ceilometer logs I can see the following kind of warnings:
> > > > > > >
> > > > > > > <44>Feb 24 08:41:08 node-16
> > > > > > > ceilometer-ceilometer.collector.dispatcher.database WARNING:
> > message
> > > > > > > signature invalid, discarding message: {u'counter_name':
> > > > > > > u'instance.scheduled', u'user_id': None, u'message_signature':
> > > > > > >
> > u'd1b49ddf004edc5b7a8dc9405b42a71f2ae975d04c25838c3dc0ea0e6f6e4edd',
> > > > > > > u'timestamp': u'2014-02-24 08:41:08.334580', u'resource_id':
> > > > > > > u'48c815ab-01c9-4ac8-9096-ac171976598c', u'message_id':
> > > > > > > u'67e611e4-9d2f-11e3-81f1-080027e519cb', u'source': u'openstack',
> > > > > > > u'counter_unit': u'instance', u'counter_volume': 1,
> > u'project_id':
> > > > > > > u'efcca4ba425c4beda73eb31a54df931a', u'resource_metadata':
> > > > > > {u'instance_id':
> > > > > > > u'48c815ab-01c9-4ac8-9096-ac171976598c', u'weighted_host':
> > {u'host':
> > > > > > > u'node-18', u'weight': 3818.0}, u'host': u'scheduler.node-16',
> > > > > > > u'request_spec': {u'num_instances': 1, u'block_device_mapping':
> > > > > > > [{u'instance_uuid': u'48c815ab-01c9-4ac8-9096-ac171976598c',
> > > > > > > u'guest_format': None, u'boot_index': 0,
> > u'delete_on_termination':
> > > > True,
> > > > > > > u'no_device': None, u'connection_info': None, u'volume_id': None,
> > > > > > > u'device_name': None, u'disk_bus': None, u'image_id':
> > > > > > > u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'source_type':
> > u'image',
> > > > > > > u'device_type': u'disk', u'snapshot_id': None,
> > u'destination_type':
> > > > > > > u'local', u'volume_size': None}], u'image': {u'status':
> > u'active',
> > > > > > u'name':
> > > > > > > u'cirrosImg', u'deleted': False, u'container_format': u'bare',
> > > > > > > u'created_at': u'2014-02-12T08:46:04.000000', u'disk_format':
> > > > u'qcow2',
> > > > > > > u'updated_at': u'2014-02-12T08:46:04.000000', u'properties': {},
> > > > > > > u'min_disk': 0, u'min_ram': 0, u'checksum':
> > > > > > > u'50bdc35edb03a38d91b1b071afb20a3c', u'owner':
> > > > > > > u'efcca4ba425c4beda73eb31a54df931a', u'is_public': True,
> > > > u'deleted_at':
> > > > > > > None, u'id': u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'size':
> > > > 9761280},
> > > > > > > u'instance_type': {u'root_gb': 1, u'name': u'm1.tiny',
> > > > u'ephemeral_gb':
> > > > > > 0,
> > > > > > > u'memory_mb': 512, u'vcpus': 1, u'extra_specs': {}, u'swap': 0,
> > > > > > > u'rxtx_factor': 1.0, u'flavorid': u'1', u'vcpu_weight': None,
> > u'id':
> > > > 2},
> > > > > > > u'instance_properties': {u'vm_state': u'building',
> > > > u'availability_zone':
> > > > > > > None, u'terminated_at': None, u'ephemeral_gb': 0,
> > > > u'instance_type_id': 2,
> > > > > > > u'user_data':
> > > > > > u'Q29udGVudC1UeXBlOiBtdWx0aXBhcnQvbWl4ZWQ7IGJvdW5kYXJ5PSI9PT0
> > > > > > > ...
> > > > > > > , u'cleaned': False, u'vm_mode': None, u'deleted_at': None,
> > > > > > > u'reservation_id': u'r-l91mh33v', u'id': 274, u'security_groups':
> > > > > > > {u'objects': []}, u'disable_terminate': False,
> > u'root_device_name':
> > > > None,
> > > > > > > u'display_name':
> > u'tyky-Group_A-55cklit7nvbq-Group_A-2-yis32na5m7ey',
> > > > > > > u'uuid': u'48c815ab-01c9-4ac8-9096-ac171976598c',
> > > > u'default_swap_device':
> > > > > > > None, u'info_cache': {u'instance_uuid':
> > > > > > > u'48c815ab-01c9-4ac8-9096-ac171976598c', u'network_info': []},
> > > > > > u'hostname':
> > > > > > > u'tyky-group-a-55cklit7nvbq-group-a-2-yis32na5m7ey',
> > u'launched_on':
> > > > > > None,
> > > > > > > u'display_description':
> > > > > > u'tyky-Group_A-55cklit7nvbq-Group_A-2-yis32na5m7ey',
> > > > > > > u'key_data': u'ssh-rsa
> > > > > > >
> > > > > >
> > > >
> > AAAAB3NzaC1yc2EAAAADAQABAAABAQC39hmz8e40Xv/+QKkLyRA7j02RfIG61cr1j41RftnkOF3ZbwBzi7qibsOA3gC9Ln05YbB6z2/iUnQzxQsoOpmlnXuv2O296utY2ZCTKhdFSzn2Ot7l635zEXkivMc97wz4bITtaBTjX3nV6sXOfevdTIOJeC11SqxmfNRRzXcz9fRv6kLjz7IrA0tvRTp2xDVtFEj+vFLWaXc3TcUSygxiSLeAuNkH1rZ9jVuHXXvzb/e7navrGyJec2P86AQg2TUk77MhLjPcbyKiJJK0DhK6zOkZUWXtgIVQx7+gO/Xs2QgQHcw+VdzRzpJK+/EOzUOU8IDWNnyfaJEnQEoX2oMj
> > > > > > > Generated by Nova\n', u'deleted': False, u'config_drive': u'',
> > > > > > > u'power_state': 0, u'default_ephemeral_device': None,
> > u'progress': 0,
> > > > > > > u'project_id': u'efcca4ba425c4beda73eb31a54df931a',
> > u'launched_at':
> > > > None,
> > > > > > > u'scheduled_at': None, u'node': None, u'ramdisk_id': u'',
> > > > > > u'access_ip_v6':
> > > > > > > None, u'access_ip_v4': None, u'kernel_id': u'', u'key_name':
> > > > u'heat_key',
> > > > > > > u'updated_at': None, u'host': None, u'user_id':
> > > > > > > u'ef4e983291ef4ad1b88eb1f776bd52b6', u'system_metadata':
> > > > > > > {u'instance_type_memory_mb': 512, u'instance_type_swap': 0,
> > > > > > > u'instance_type_vcpu_weight': None, u'instance_type_root_gb': 1,
> > > > > > > u'instance_type_name': u'm1.tiny', u'instance_type_id': 2,
> > > > > > > u'instance_type_ephemeral_gb': 0, u'instance_type_rxtx_factor':
> > 1.0,
> > > > > > > u'image_disk_format': u'qcow2', u'instance_type_flavorid': u'1',
> > > > > > > u'instance_type_vcpus': 1, u'image_container_format': u'bare',
> > > > > > > u'image_min_ram': 0, u'image_min_disk': 1,
> > u'image_base_image_ref':
> > > > > > > u'11848cbf-a428-4dfb-8818-2f0a981f540b'}, u'task_state':
> > > > u'scheduling',
> > > > > > > u'shutdown_terminate': False, u'cell_name': None, u'root_gb': 1,
> > > > > > u'locked':
> > > > > > > False, u'name': u'instance-00000112', u'created_at':
> > > > > > > u'2014-02-24T08:41:08.257534', u'locked_by': None,
> > u'launch_index':
> > > > 0,
> > > > > > > u'memory_mb': 512, u'vcpus': 1, u'image_ref':
> > > > > > > u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'architecture': None,
> > > > > > > u'auto_disk_config': False, u'os_type': None, u'metadata':
> > > > > > > {u'metering.server_group': u'Group_A', u'AutoScalingGroupName':
> > > > > > > u'tyky-Group_A-55cklit7nvbq', u'custom_metadata': u'test'}},
> > > > > > > u'security_group': [u'default'], u'instance_uuids':
> > > > > > > [u'48c815ab-01c9-4ac8-9096-ac171976598c']}, u'event_type':
> > > > > > > u'scheduler.run_instance.scheduled'}, u'counter_type': u'delta'}
> > > > > > >
> > > > > > > Also the following warnings/errors can be seen, but they seem to
> > > > > > > occur when auto scaling is working properly and have no negative
> > > > > > > effects as such:
> > > > > > >
> > > > > > > <44>Feb 24 08:43:08 node-16
> > > > > > > ceilometer-ceilometer.transformer.conversions WARNING: dropping
> > > > > > > sample with no predecessor: <ceilometer.sample.Sample object at 0x3774fd0>
> > > > > > > <44>Feb 24 08:43:08 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <44>Feb 24 08:43:08 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <44>Feb 24 08:43:08 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <44>Feb 24 08:43:08 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <44>Feb 24 08:43:08 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <44>Feb 24 08:43:08 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <44>Feb 24 08:43:09 node-16 ceilometer-ceilometer.publisher.rpc
> > > > AUDIT:
> > > > > > > Publishing 1 samples on metering
> > > > > > > <43>Feb 24 08:43:09 node-16
> > > > > > > ceilometer-ceilometer.collector.dispatcher.database ERROR: Failed to
> > > > > > > record metering data: not okForStorage
> > > > > > > Traceback (most recent call last):
> > > > > > >   File "/usr/lib/python2.7/dist-packages/ceilometer/collector/dispatcher/database.py", line 65, in record_metering_data
> > > > > > >     self.storage_conn.record_metering_data(meter)
> > > > > > >   File "/usr/lib/python2.7/dist-packages/ceilometer/storage/impl_mongodb.py", line 417, in record_metering_data
> > > > > > >     upsert=True,
> > > > > > >   File "/usr/lib/python2.7/dist-packages/pymongo/collection.py", line 487, in update
> > > > > > >     check_keys, self.__uuid_subtype), safe)
> > > > > > >   File "/usr/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 969, in _send_message
> > > > > > >     rv = self.__check_response_to_last_error(response)
> > > > > > >   File "/usr/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 911, in __check_response_to_last_error
> > > > > > >     raise OperationFailure(details["err"], details["code"])
> > > > > > > OperationFailure: not okForStorage
> > > > > > >
> > > > > > > Br,
> > > > > > > -Juha
> > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Mailing list:
> > > > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> > > > > > > Post to     : openstack at lists.openstack.org Unsubscribe :
> > > > > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 
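"not okForStorage" is the error older MongoDB releases return when a document contains a field name they refuse to store, typically a key containing '.' or starting with '$'. The instance metadata in the log quoted above carries such a key (metering.server_group), which makes it a likely candidate, though the quoted traceback does not show the failing sample itself. A minimal sketch for locating such keys follows; the helper name find_unstorable_keys and the trimmed sample dict are illustrative assumptions, not Ceilometer code.

```python
def find_unstorable_keys(doc, path=""):
    """Return paths of dict keys that older MongoDB releases reject:
    keys containing '.' or starting with '$'."""
    bad = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            here = f"{path}/{key}" if path else str(key)
            if "." in str(key) or str(key).startswith("$"):
                bad.append(here)
            # Descend into nested values as well.
            bad.extend(find_unstorable_keys(value, here))
    elif isinstance(doc, (list, tuple)):
        for index, item in enumerate(doc):
            bad.extend(find_unstorable_keys(item, f"{path}[{index}]"))
    return bad


# Hypothetical, heavily trimmed stand-in for the sample in the quoted log.
sample = {
    "counter_name": "instance.scheduled",
    "resource_metadata": {
        "request_spec": {
            "instance_properties": {
                "metadata": {
                    "metering.server_group": "Group_A",
                    "custom_metadata": "test",
                },
            },
        },
    },
}

print(find_unstorable_keys(sample))
```

Only keys are checked; dotted values such as "instance.scheduled" are fine for storage.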

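For the configuration quoted above (cpu_util sampled every 15 s, alarm period of 20 s), whether one or two samples per instance fall inside an evaluation window depends on the offset between the sampling and evaluation clocks. The sketch below illustrates this under simplifying assumptions that are not taken from Ceilometer's code: integer timestamps, perfectly regular samples at multiples of 15 s, and a window that is open at its start and closed at its end. The function name samples_in_window is hypothetical.

```python
SAMPLE_INTERVAL = 15  # pipeline.yaml interval, in seconds
PERIOD = 20           # alarm period, in seconds


def samples_in_window(eval_time, period=PERIOD, interval=SAMPLE_INTERVAL):
    """Sample timestamps in the window (eval_time - period, eval_time].

    Assumes samples are taken exactly at integer multiples of `interval`.
    """
    start = eval_time - period
    # First multiple of `interval` strictly greater than `start`.
    first = (start // interval + 1) * interval
    return list(range(first, eval_time + 1, interval))


# The alarm is evaluated every 15 s; try two different clock offsets.
for offset in (0, 7):
    for k in range(2, 5):
        t = offset + 15 * k
        print(t, samples_in_window(t))
```

Under these assumptions, an offset below 5 s gives two samples per window, while any larger offset leaves a single sample, so one high or low reading is enough to cross the threshold.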
[Openstack] [Heat/Ceilometer/Havana]: Auto scaling no longer occurring after some time
