<div dir="ltr">Hi,<div><br></div><div>An update: yesterday I added a "repeat_actions": true definition to the OS::Ceilometer::Alarm resources in the Heat template:</div><div><br></div><div><div>        "CPUAlarmHigh": {</div>
<div>            "Type": "OS::Ceilometer::Alarm",</div><div>            "Properties": {</div><div>                "description": "Scale-up if CPU is greater than 90% for 30 seconds",</div>
<div>                "meter_name": "cpu_util",</div><div>                "statistic": "avg",</div><div>                "period": "30",</div><div>                "evaluation_periods": "1",</div>
<div>                "threshold": "90",</div><div>                "alarm_actions":</div><div>                    [ {"Fn::GetAtt": ["ScaleUpPolicy", "AlarmUrl"]} ],</div>
<div>                "matching_metadata":</div><div>                    {"metadata.user_metadata.server_group": "Group_A" },</div><div>                "comparison_operator": "gt",</div>
<div>                "repeat_actions" : true</div><div>            }</div><div>        },</div><div><br></div><div>        "CPUAlarmLow": {</div><div>            "Type": "OS::Ceilometer::Alarm",</div>
<div>            "Properties": {</div><div>                "description": "Scale-down if CPU is less than 50% for 30 seconds",</div><div>                "meter_name": "cpu_util",</div>
<div>                "statistic": "avg",</div><div>                "period": "30",</div><div>                "evaluation_periods": "1",</div><div>                "threshold": "50",</div>
<div>                "alarm_actions":</div><div>                    [ {"Fn::GetAtt": ["ScaleDownPolicy", "AlarmUrl"]} ],</div><div>                "matching_metadata":</div>
<div>                    {"metadata.user_metadata.server_group": "Group_A" },</div><div>                "comparison_operator": "lt",</div><div>                "repeat_actions" : true</div>
<div>            }</div><div>        }</div></div><div><br></div><div>...and everything seemed to work fine. But I have now created the stack again and generated some load inside the first VM that was started. Scaling up occurred, but since then the system has been continuously scaling the VMs up and down even though the load situation doesn't change. It seems the "repeat_actions" definitions didn't help after all...</div>
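<div>A rough sketch of one way this up/down cycle could arise. This is an assumption, not something confirmed in this thread. If the load runs only inside the first VM and the alarms average cpu_util over the whole group, every added idle VM pulls the average down. The utilization figures below (95% busy, 2% idle) and the function names are hypothetical; the thresholds and group limits come from the template.</div>

```python
# Hypothetical model, not taken from Heat or Ceilometer: one VM carries the
# load, every other VM in the group is nearly idle.
BUSY_CPU = 95.0   # assumed cpu_util of the loaded VM, in percent
IDLE_CPU = 2.0    # assumed cpu_util of an idle VM, in percent


def group_avg_cpu(size, busy=BUSY_CPU, idle=IDLE_CPU):
    """Average cpu_util over a group where exactly one VM is loaded."""
    return (busy + (size - 1) * idle) / size


def next_size(size, avg, high=90.0, low=50.0, min_size=1, max_size=3):
    """Apply the CPUAlarmHigh / CPUAlarmLow rules to the current group size."""
    if avg > high:
        return min(size + 1, max_size)
    if low > avg:
        return max(size - 1, min_size)
    return size


def simulate(steps, start_size=1):
    """Return the group size after each alarm evaluation."""
    sizes = [start_size]
    for _ in range(steps):
        size = sizes[-1]
        sizes.append(next_size(size, group_avg_cpu(size)))
    return sizes


print(simulate(6))  # [1, 2, 1, 2, 1, 2, 1]
```

<div>Under these assumed numbers the group average is 95% with one VM and 48.5% with two, so each alarm re-triggers the other. In this simple model, a lower scale-down threshold or load that spreads across the new VMs would break the cycle.</div>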
<div><br></div><div>Br,</div><div>-Juha</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 25 February 2014 00:27, Steven Dake <span dir="ltr">&lt;<a href="mailto:sdake@redhat.com" target="_blank">sdake@redhat.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <div>Juha,<br>
      <br>
      Copying Angus so he sees this.  He wrote most of the
      ceilometer + heat integration and might have a better idea of the
      details of the problem you face.<div><div class="h5"><br>
      <br>
      On 02/24/2014 01:27 AM, Juha Tynninen wrote:<br>
    </div></div></div>
    <blockquote type="cite"><div><div class="h5">
      <div dir="ltr">
        <div>Hi,</div>
        <div><br>
        </div>
        <div>I'm having some problems with the auto scaling feature.</div>
        <div>Any ideas?</div>
        <div><br>
        </div>
        <div>At first, scaling up and down works just fine. But when
          tested later on, scaling down/up no longer works
          properly. </div>
        <div>Scaling down may occur even though it shouldn't, or scaling up
          doesn't occur even though it should. When, in this situation, I
          remove all the received metric data from the DB, auto scaling
          starts to work again.</div>
        <div><br>
        </div>
        <div>Ceilometer is configured to use Mongo and the auto scaling
          is based on the cpu_util metrics.</div>
        <div><br>
        </div>
        <div>Related configurations:</div>
        <div>-----------------------</div>
        <div>/etc/ceilometer/pipeline.yaml on compute nodes:</div>
        <div><br>
        </div>
        <div>name: cpu_pipeline</div>
        <div>interval: 15</div>
        <div><br>
        </div>
        <div>/etc/ceilometer/ceilometer.conf on controller:</div>
        <div>evaluation_interval=15</div>
        <div><br>
        </div>
        <div>Heat template used:</div>
        <div>
          -------------------</div>
        <div>   "Resources" : {</div>
        <div><br>
        </div>
        <div>        "Group_A" : {</div>
        <div>            "Type" : "AWS::AutoScaling::AutoScalingGroup",</div>
        <div>            "Properties" : {</div>
        <div>                "AvailabilityZones" : { "Fn::GetAZs" : ""},</div>
        <div>                "LaunchConfigurationName" : { "Ref" :
          "Group_A_Config" },</div>
        <div>                "MinSize" : "1",</div>
        <div>                "MaxSize" : "3",</div>
        <div>                "Tags" : [ </div>
        <div>                  { "Key" : "metering.server_group",
          "Value" : "Group_A" },</div>
        <div>                  { "Key" : "custom_metadata", "Value" :
          "test" } </div>
        <div>                ],</div>
        <div>                "VPCZoneIdentifier" : [ { "Ref" :
          "PrivateSubnetId" } ] <br>
        </div>
        <div>            }</div>
        <div>        },</div>
        <div><br>
        </div>
        <div>        "Group_A_Config" : {</div>
        <div>            "Type" :
          "AWS::AutoScaling::LaunchConfiguration",</div>
        <div>            "Properties": {</div>
        <div>                "ImageId" : { "Ref" : "ImageId" },</div>
        <div>                "InstanceType" : { "Ref" : "InstanceType"
          },</div>
        <div>                "KeyName" : { "Ref" : "KeyName" }</div>
        <div>            }</div>
        <div>        },</div>
        <div><br>
        </div>
        <div>        "ScaleUpPolicy" : {</div>
        <div>            "Type" : "AWS::AutoScaling::ScalingPolicy",</div>
        <div>            "Properties" : {</div>
        <div>                "AdjustmentType" : "ChangeInCapacity",</div>
        <div>                "AutoScalingGroupName" : { "Ref" :
          "Group_A" },</div>
        <div>                "Cooldown" : "20",</div>
        <div>                "ScalingAdjustment" : "1"</div>
        <div>            }</div>
        <div>        },</div>
        <div><br>
        </div>
        <div>        "ScaleDownPolicy" : {</div>
        <div>            "Type" : "AWS::AutoScaling::ScalingPolicy",</div>
        <div>            "Properties" : {</div>
        <div>                "AdjustmentType" : "ChangeInCapacity",</div>
        <div>                "AutoScalingGroupName" : { "Ref" :
          "Group_A" },</div>
        <div>                "Cooldown" : "20",</div>
        <div>                "ScalingAdjustment" : "-1"</div>
        <div>            }</div>
        <div>        },</div>
        <div><br>
        </div>
        <div>        "CPUAlarmHigh": {</div>
        <div>            "Type": "OS::Ceilometer::Alarm",</div>
        <div>            "Properties": {</div>
        <div>                "description": "Scale-up if CPU is greater
          than 90% for 20 seconds",</div>
        <div>                "meter_name": "cpu_util",</div>
        <div>                "statistic": "avg",</div>
        <div>                "period": "20",</div>
        <div>                "evaluation_periods": "1",</div>
        <div>                "threshold": "90",</div>
        <div>                "alarm_actions":</div>
        <div>                    [ {"Fn::GetAtt": ["ScaleUpPolicy",
          "AlarmUrl"]} ],</div>
        <div>                "matching_metadata":</div>
        <div>                    {"metadata.user_metadata.server_group":
          "Group_A" },</div>
        <div>                "comparison_operator": "gt"</div>
        <div>            }</div>
        <div>        },</div>
        <div><br>
        </div>
        <div>        "CPUAlarmLow": {</div>
        <div>            "Type": "OS::Ceilometer::Alarm",</div>
        <div>            "Properties": {</div>
        <div>                "description": "Scale-down if CPU is less
          than 50% for 20 seconds",</div>
        <div>                "meter_name": "cpu_util",</div>
        <div>                "statistic": "avg",</div>
        <div>                "period": "20",</div>
        <div>                "evaluation_periods": "1",</div>
        <div>                "threshold": "50",</div>
        <div>                "alarm_actions":</div>
        <div>                    [ {"Fn::GetAtt": ["ScaleDownPolicy",
          "AlarmUrl"]} ],</div>
        <div>                "matching_metadata":</div>
        <div>                    {"metadata.user_metadata.server_group":
          "Group_A" },</div>
        <div>                "comparison_operator": "lt"</div>
        <div>            }</div>
        <div>        }</div>
        <div><br>
        </div>
        <div>In the ceilometer logs I can see warnings like the
          following:</div>
        <div><br>
        </div>
        <div>&lt;44&gt;Feb 24 08:41:08 node-16
          ceilometer-ceilometer.collector.dispatcher.database WARNING:
          message signature invalid, discarding message:
          {u'counter_name': u'instance.scheduled', u'user_id': None,
          u'message_signature':
          u'd1b49ddf004edc5b7a8dc9405b42a71f2ae975d04c25838c3dc0ea0e6f6e4edd',
          u'timestamp': u'2014-02-24 08:41:08.334580', u'resource_id':
          u'48c815ab-01c9-4ac8-9096-ac171976598c', u'message_id':
          u'67e611e4-9d2f-11e3-81f1-080027e519cb', u'source':
          u'openstack', u'counter_unit': u'instance', u'counter_volume':
          1, u'project_id': u'efcca4ba425c4beda73eb31a54df931a',
          u'resource_metadata': {u'instance_id':
          u'48c815ab-01c9-4ac8-9096-ac171976598c', u'weighted_host':
          {u'host': u'node-18', u'weight': 3818.0}, u'host':
          u'scheduler.node-16', u'request_spec': {u'num_instances': 1,
          u'block_device_mapping': [{u'instance_uuid':
          u'48c815ab-01c9-4ac8-9096-ac171976598c', u'guest_format':
          None, u'boot_index': 0, u'delete_on_termination': True,
          u'no_device': None, u'connection_info': None, u'volume_id':
          None, u'device_name': None, u'disk_bus': None, u'image_id':
          u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'source_type':
          u'image', u'device_type': u'disk', u'snapshot_id': None,
          u'destination_type': u'local', u'volume_size': None}],
          u'image': {u'status': u'active', u'name': u'cirrosImg',
          u'deleted': False, u'container_format': u'bare',
          u'created_at': u'2014-02-12T08:46:04.000000', u'disk_format':
          u'qcow2', u'updated_at': u'2014-02-12T08:46:04.000000',
          u'properties': {}, u'min_disk': 0, u'min_ram': 0, u'checksum':
          u'50bdc35edb03a38d91b1b071afb20a3c', u'owner':
          u'efcca4ba425c4beda73eb31a54df931a', u'is_public': True,
          u'deleted_at': None, u'id':
          u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'size': 9761280},
          u'instance_type': {u'root_gb': 1, u'name': u'm1.tiny',
          u'ephemeral_gb': 0, u'memory_mb': 512, u'vcpus': 1,
          u'extra_specs': {}, u'swap': 0, u'rxtx_factor': 1.0,
          u'flavorid': u'1', u'vcpu_weight': None, u'id': 2},
          u'instance_properties': {u'vm_state': u'building',
          u'availability_zone': None, u'terminated_at': None,
          u'ephemeral_gb': 0, u'instance_type_id': 2, u'user_data':
           u'Q29udGVudC1UeXBlOiBtdWx0aXBhcnQvbWl4ZWQ7IGJvdW5kYXJ5PSI9PT0</div>
        <div>...</div>
        <div>, u'cleaned': False, u'vm_mode': None, u'deleted_at': None,
          u'reservation_id': u'r-l91mh33v', u'id': 274,
          u'security_groups': {u'objects': []}, u'disable_terminate':
          False, u'root_device_name': None, u'display_name':
          u'tyky-Group_A-55cklit7nvbq-Group_A-2-yis32na5m7ey', u'uuid':
          u'48c815ab-01c9-4ac8-9096-ac171976598c',
          u'default_swap_device': None, u'info_cache':
          {u'instance_uuid': u'48c815ab-01c9-4ac8-9096-ac171976598c',
          u'network_info': []}, u'hostname':
          u'tyky-group-a-55cklit7nvbq-group-a-2-yis32na5m7ey',
          u'launched_on': None, u'display_description':
          u'tyky-Group_A-55cklit7nvbq-Group_A-2-yis32na5m7ey',
          u'key_data': u'ssh-rsa
          AAAAB3NzaC1yc2EAAAADAQABAAABAQC39hmz8e40Xv/+QKkLyRA7j02RfIG61cr1j41RftnkOF3ZbwBzi7qibsOA3gC9Ln05YbB6z2/iUnQzxQsoOpmlnXuv2O296utY2ZCTKhdFSzn2Ot7l635zEXkivMc97wz4bITtaBTjX3nV6sXOfevdTIOJeC11SqxmfNRRzXcz9fRv6kLjz7IrA0tvRTp2xDVtFEj+vFLWaXc3TcUSygxiSLeAuNkH1rZ9jVuHXXvzb/e7navrGyJec2P86AQg2TUk77MhLjPcbyKiJJK0DhK6zOkZUWXtgIVQx7+gO/Xs2QgQHcw+VdzRzpJK+/EOzUOU8IDWNnyfaJEnQEoX2oMj
          Generated by Nova\n', u'deleted': False, u'config_drive': u'',
          u'power_state': 0, u'default_ephemeral_device': None,
          u'progress': 0, u'project_id':
          u'efcca4ba425c4beda73eb31a54df931a', u'launched_at': None,
          u'scheduled_at': None, u'node': None, u'ramdisk_id': u'',
          u'access_ip_v6': None, u'access_ip_v4': None, u'kernel_id':
          u'', u'key_name': u'heat_key', u'updated_at': None, u'host':
          None, u'user_id': u'ef4e983291ef4ad1b88eb1f776bd52b6',
          u'system_metadata': {u'instance_type_memory_mb': 512,
          u'instance_type_swap': 0, u'instance_type_vcpu_weight': None,
          u'instance_type_root_gb': 1, u'instance_type_name':
          u'm1.tiny', u'instance_type_id': 2,
          u'instance_type_ephemeral_gb': 0,
          u'instance_type_rxtx_factor': 1.0, u'image_disk_format':
          u'qcow2', u'instance_type_flavorid': u'1',
          u'instance_type_vcpus': 1, u'image_container_format': u'bare',
          u'image_min_ram': 0, u'image_min_disk': 1,
          u'image_base_image_ref':
          u'11848cbf-a428-4dfb-8818-2f0a981f540b'}, u'task_state':
          u'scheduling', u'shutdown_terminate': False, u'cell_name':
          None, u'root_gb': 1, u'locked': False, u'name':
          u'instance-00000112', u'created_at':
          u'2014-02-24T08:41:08.257534', u'locked_by': None,
          u'launch_index': 0, u'memory_mb': 512, u'vcpus': 1,
          u'image_ref': u'11848cbf-a428-4dfb-8818-2f0a981f540b',
          u'architecture': None, u'auto_disk_config': False, u'os_type':
          None, u'metadata': {u'metering.server_group': u'Group_A',
          u'AutoScalingGroupName': u'tyky-Group_A-55cklit7nvbq',
          u'custom_metadata': u'test'}}, u'security_group':
          [u'default'], u'instance_uuids':
          [u'48c815ab-01c9-4ac8-9096-ac171976598c']}, u'event_type':
          u'scheduler.run_instance.scheduled'}, u'counter_type':
          u'delta'}</div>
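<div>For context, a minimal sketch of how a signed metering message can be checked. It assumes an HMAC-SHA256 over the message fields with a shared secret, which is consistent with the 64-character hex signature in the log above. The field serialization, the function names and the secrets are hypothetical and are not Ceilometer's actual code.</div>

```python
import hashlib
import hmac


def sign(message, secret):
    """HMAC-SHA256 over the sorted fields, skipping the signature itself.

    The serialization here is an assumption made for illustration only.
    """
    digest = hmac.new(secret.encode("utf-8"), b"", hashlib.sha256)
    for name in sorted(message):
        if name == "message_signature":
            continue
        digest.update(name.encode("utf-8"))
        digest.update(str(message[name]).encode("utf-8"))
    return digest.hexdigest()


def is_valid(message, secret):
    """True if the stored signature matches one computed with `secret`."""
    expected = sign(message, secret)
    return hmac.compare_digest(expected, message.get("message_signature", ""))


# Hypothetical sample signed by a publisher that uses "secret-a".
sample = {"counter_name": "instance.scheduled", "counter_volume": 1}
sample["message_signature"] = sign(sample, "secret-a")

print(is_valid(sample, "secret-a"))  # True
print(is_valid(sample, "secret-b"))  # False
```

<div>In a scheme like this, a receiver configured with a different secret than the sender would reject every message from that sender, so comparing the secret configured on each node may be worth a look.</div>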
        <div><br>
        </div>
        <div>The following warnings/errors can also be seen, but they
          occur even when auto scaling is working properly and seem to
          have no negative effects as such:</div>
        <div><br>
        </div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.transformer.conversions
          WARNING: dropping sample with no predecessor:
          &lt;ceilometer.sample.Sample object at 0x3774fd0&gt;</div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;44&gt;Feb 24 08:43:08 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;44&gt;Feb 24 08:43:09 node-16
          ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
          samples on metering</div>
        <div>&lt;43&gt;Feb 24 08:43:09 node-16
          ceilometer-ceilometer.collector.dispatcher.database ERROR:
          Failed to record metering data: not okForStorage</div>
        <div>Traceback (most recent call last):</div>
        <div>  File
          "/usr/lib/python2.7/dist-packages/ceilometer/collector/dispatcher/database.py",
          line 65, in record_metering_data</div>
        <div>    self.storage_conn.record_metering_data(meter)</div>
        <div>  File
          "/usr/lib/python2.7/dist-packages/ceilometer/storage/impl_mongodb.py",
          line 417, in record_metering_data</div>
        <div>    upsert=True,</div>
        <div>  File
          "/usr/lib/python2.7/dist-packages/pymongo/collection.py", line
          487, in update</div>
        <div>    check_keys, self.__uuid_subtype), safe)</div>
        <div>  File
          "/usr/lib/python2.7/dist-packages/pymongo/mongo_client.py",
          line 969, in _send_message</div>
        <div>    rv = self.__check_response_to_last_error(response)</div>
        <div>  File
          "/usr/lib/python2.7/dist-packages/pymongo/mongo_client.py",
          line 911, in __check_response_to_last_error</div>
        <div>    raise OperationFailure(details["err"], details["code"])</div>
        <div>OperationFailure: not okForStorage</div>
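<div>MongoDB versions of this era reject documents whose field names contain a dot or start with "$", and report that as "not okForStorage". The metadata key metering.server_group in the sample above contains a dot, so it may be related. The helper below is a hypothetical sketch for spotting such keys in a sample before it is stored; it is not part of Ceilometer or pymongo, and the sample dict is made up for illustration.</div>

```python
def find_invalid_mongo_keys(doc, path=""):
    """Return paths of keys that contain '.' or start with '$'."""
    bad = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            here = f"{path}/{key}"
            if "." in key or key.startswith("$"):
                bad.append(here)
            bad.extend(find_invalid_mongo_keys(value, here))
    elif isinstance(doc, list):
        for index, item in enumerate(doc):
            bad.extend(find_invalid_mongo_keys(item, f"{path}[{index}]"))
    return bad


# Hypothetical sample shaped like the metadata seen in the log above.
sample = {
    "counter_name": "cpu_util",
    "resource_metadata": {
        "metadata": {
            "metering.server_group": "Group_A",
            "custom_metadata": "test",
        },
    },
}

print(find_invalid_mongo_keys(sample))
# ['/resource_metadata/metadata/metering.server_group']
```

<div>If a check like this flags the metering tag, the affected samples would be dropped at storage time, which is worth ruling out when alarm statistics look wrong.</div>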
        <div><br>
        </div>
        <div>Br,</div>
        <div>-Juha</div>
      </div>
      <br>
      <br>
      </div></div><div class="">
    </div></blockquote>
    <br>
  </div>

</blockquote></div><br></div>