<div dir="ltr">Hi,<div><br></div><div>An update: yesterday I added a "repeat_actions" : true definition to the OS::Ceilometer::Alarm resources in the Heat template:</div><div><br></div><div><div> "CPUAlarmHigh": {</div>
<div> "Type": "OS::Ceilometer::Alarm",</div><div> "Properties": {</div><div> "description": "Scale-up if CPU is greater than 90% for 30 seconds",</div>
<div> "meter_name": "cpu_util",</div><div> "statistic": "avg",</div><div> "period": "30",</div><div> "evaluation_periods": "1",</div>
<div> "threshold": "90",</div><div> "alarm_actions":</div><div> [ {"Fn::GetAtt": ["ScaleUpPolicy", "AlarmUrl"]} ],</div>
<div> "matching_metadata":</div><div> {"metadata.user_metadata.server_group": "Group_A" },</div><div> "comparison_operator": "gt",</div>
<div> "repeat_actions" : true</div><div> }</div><div> },</div><div><br></div><div> "CPUAlarmLow": {</div><div> "Type": "OS::Ceilometer::Alarm",</div>
<div> "Properties": {</div><div> "description": "Scale-down if CPU is less than 50% for 30 seconds",</div><div> "meter_name": "cpu_util",</div>
<div> "statistic": "avg",</div><div> "period": "30",</div><div> "evaluation_periods": "1",</div><div> "threshold": "50",</div>
<div> "alarm_actions":</div><div> [ {"Fn::GetAtt": ["ScaleDownPolicy", "AlarmUrl"]} ],</div><div> "matching_metadata":</div>
<div> {"metadata.user_metadata.server_group": "Group_A" },</div><div> "comparison_operator": "lt",</div><div> "repeat_actions" : true</div>
<div> }</div><div> }</div></div><div><br></div><div>...and everything seemed to work fine. But I have now created the stack again and generated some load inside the first VM that was started. Scaling up occurred, but since then the system has been continuously scaling the VMs up and down even though the load doesn't change. It seems the "repeat_actions" definitions didn't help after all...</div>
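<div><br></div><div>One possible explanation, not confirmed in this thread, can be sketched in Python. The sketch assumes the alarms evaluate the average cpu_util over all VMs tagged Group_A and that only the first VM carries load. The 95% and 1% per-VM figures and the helper names are hypothetical; the thresholds, operators and group size limits come from the templates in this thread. Cooldown is not modelled.</div>
<pre>
```python
import operator

# (comparison, threshold, adjustment) taken from the alarm and policy definitions.
ALARMS = {
    "CPUAlarmHigh": (operator.gt, 90.0, +1),
    "CPUAlarmLow": (operator.lt, 50.0, -1),
}
MIN_SIZE, MAX_SIZE = 1, 3


def group_average(loaded_vm_cpu, idle_vm_cpu, size):
    """Average cpu_util when only the first VM carries load (assumption)."""
    return (loaded_vm_cpu + idle_vm_cpu * (size - 1)) / size


def simulate(loaded_vm_cpu, idle_vm_cpu, rounds):
    """Return the group size after each alarm evaluation round."""
    size = MIN_SIZE
    history = []
    for _ in range(rounds):
        avg = group_average(loaded_vm_cpu, idle_vm_cpu, size)
        for compare, threshold, adjustment in ALARMS.values():
            if compare(avg, threshold):
                # Apply the policy adjustment, clamped to MinSize/MaxSize.
                size = min(MAX_SIZE, max(MIN_SIZE, size + adjustment))
                break
        history.append(size)
    return history


# Hypothetical load: first VM at 95%, every other VM at 1%.
print(simulate(loaded_vm_cpu=95.0, idle_vm_cpu=1.0, rounds=6))
```
</pre>
<div>Under these assumptions the group average falls below the scale-down threshold as soon as an idle VM joins, so the group keeps alternating between one and two VMs even though the load itself is constant.</div>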
<div><br></div><div>Br,</div><div>-Juha</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 25 February 2014 00:27, Steven Dake <span dir="ltr">&lt;<a href="mailto:sdake@redhat.com" target="_blank">sdake@redhat.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>Juha,<br>
<br>
      Copying Angus so he sees this. He wrote most of the
      ceilometer + heat integration and might have a better idea of the
      details of the problem you face.<div><div class="h5"><br>
<br>
On 02/24/2014 01:27 AM, Juha Tynninen wrote:<br>
</div></div></div>
<blockquote type="cite"><div><div class="h5">
<div dir="ltr">
<div>Hi,</div>
<div><br>
</div>
          <div>I'm having some problems with the auto scaling feature.</div>
          <div>Any ideas?</div>
          <div><br>
          </div>
          <div>At first, scaling up and down works just fine. But when
            tested later on, scaling down/up no longer works
            properly. </div>
          <div>Scaling down may occur even when it shouldn't, or scaling up
            doesn't occur even when it should. When I remove all the
            received metric data from the DB in this situation, auto
            scaling starts to work again.</div>
<div><br>
</div>
<div>Ceilometer is configured to use Mongo and the auto scaling
is based on the cpu_util metrics.</div>
<div><br>
</div>
<div>Related configurations:</div>
<div>-----------------------</div>
<div>/etc/ceilometer/pipeline.yaml on compute nodes:</div>
<div><br>
</div>
<div>name: cpu_pipeline</div>
<div>interval: 15</div>
<div><br>
</div>
<div>/etc/ceilometer/ceilometer.conf on controller:</div>
<div>evaluation_interval=15</div>
<div><br>
</div>
<div>Heat template used:</div>
<div>
-------------------</div>
<div> "Resources" : {</div>
<div><br>
</div>
<div> "Group_A" : {</div>
<div> "Type" : "AWS::AutoScaling::AutoScalingGroup",</div>
<div> "Properties" : {</div>
<div> "AvailabilityZones" : { "Fn::GetAZs" : ""},</div>
<div> "LaunchConfigurationName" : { "Ref" :
"Group_A_Config" },</div>
<div> "MinSize" : "1",</div>
<div> "MaxSize" : "3",</div>
<div> "Tags" : [ </div>
<div> { "Key" : "metering.server_group",
"Value" : "Group_A" },</div>
<div> { "Key" : "custom_metadata", "Value" :
"test" } </div>
<div> ],</div>
<div> "VPCZoneIdentifier" : [ { "Ref" :
"PrivateSubnetId" } ] <br>
</div>
<div> }</div>
<div> },</div>
<div><br>
</div>
<div> "Group_A_Config" : {</div>
<div> "Type" :
"AWS::AutoScaling::LaunchConfiguration",</div>
<div> "Properties": {</div>
<div> "ImageId" : { "Ref" : "ImageId" },</div>
<div> "InstanceType" : { "Ref" : "InstanceType"
},</div>
<div> "KeyName" : { "Ref" : "KeyName" }</div>
<div> }</div>
<div> },</div>
<div><br>
</div>
<div> "ScaleUpPolicy" : {</div>
<div> "Type" : "AWS::AutoScaling::ScalingPolicy",</div>
<div> "Properties" : {</div>
<div> "AdjustmentType" : "ChangeInCapacity",</div>
<div> "AutoScalingGroupName" : { "Ref" :
"Group_A" },</div>
<div> "Cooldown" : "20",</div>
<div> "ScalingAdjustment" : "1"</div>
<div> }</div>
<div> },</div>
<div><br>
</div>
<div> "ScaleDownPolicy" : {</div>
<div> "Type" : "AWS::AutoScaling::ScalingPolicy",</div>
<div> "Properties" : {</div>
<div> "AdjustmentType" : "ChangeInCapacity",</div>
<div> "AutoScalingGroupName" : { "Ref" :
"Group_A" },</div>
<div> "Cooldown" : "20",</div>
<div> "ScalingAdjustment" : "-1"</div>
<div> }</div>
<div> },</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"> </span>"CPUAlarmHigh":
{</div>
<div> "Type": "OS::Ceilometer::Alarm",</div>
<div> "Properties": {</div>
<div> "description": "Scale-up if CPU is greater
than 90% for 20 seconds",</div>
<div> "meter_name": "cpu_util",</div>
<div> "statistic": "avg",</div>
<div> "period": "20",</div>
<div> "evaluation_periods": "1",</div>
<div> "threshold": "90",</div>
<div> "alarm_actions":</div>
<div> [ {"Fn::GetAtt": ["ScaleUpPolicy",
"AlarmUrl"]} ],</div>
<div> "matching_metadata":</div>
<div> {"metadata.user_metadata.server_group":
"Group_A" },</div>
<div> "comparison_operator": "gt"</div>
<div> }</div>
<div> },</div>
<div><br>
</div>
<div> "CPUAlarmLow": {</div>
<div> "Type": "OS::Ceilometer::Alarm",</div>
<div> "Properties": {</div>
<div> "description": "Scale-down if CPU is less
than 50% for 20 seconds",</div>
<div> "meter_name": "cpu_util",</div>
<div> "statistic": "avg",</div>
<div> "period": "20",</div>
<div> "evaluation_periods": "1",</div>
<div> "threshold": "50",</div>
<div> "alarm_actions":</div>
<div> [ {"Fn::GetAtt": ["ScaleDownPolicy",
"AlarmUrl"]} ],</div>
<div> "matching_metadata":</div>
<div> {"metadata.user_metadata.server_group":
"Group_A" },</div>
<div> "comparison_operator": "lt"</div>
<div> }</div>
<div><br>
</div>
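          <div>To relate the 15-second pipeline interval to the 20-second alarm period above, here is a small Python sketch. It assumes samples arrive exactly every interval starting at t = 0 and that statistics are bucketed into consecutive fixed windows; the function name is hypothetical and the real alarm evaluator may align its windows differently.</div>
<pre>
```python
def samples_per_window(sample_interval, period, windows):
    """Count samples taken at t = 0, interval, 2*interval, ... per window."""
    counts = []
    for w in range(windows):
        start, end = w * period, (w + 1) * period
        # Sample k is taken at k * sample_interval; count those in [start, end).
        first = -(-start // sample_interval)  # ceiling division
        last = -(-end // sample_interval)
        counts.append(last - first)
    return counts


# Values from the configuration above: 15 s polling, 20 s alarm period.
print(samples_per_window(sample_interval=15, period=20, windows=6))
# prints [2, 1, 1, 2, 1, 1]
```
</pre>
          <div>Under these assumptions most 20-second windows contain a single sample per VM, so one reading can decide whether an alarm fires; a 30-second period would give two samples per window.</div>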
          <div>In the ceilometer logs I can see warnings like the
            following:</div>
<div><br>
</div>
<div><44>Feb 24 08:41:08 node-16
ceilometer-ceilometer.collector.dispatcher.database WARNING:
message signature invalid, discarding message:
{u'counter_name': u'instance.scheduled', u'user_id': None,
u'message_signature':
u'd1b49ddf004edc5b7a8dc9405b42a71f2ae975d04c25838c3dc0ea0e6f6e4edd',
u'timestamp': u'2014-02-24 08:41:08.334580', u'resource_id':
u'48c815ab-01c9-4ac8-9096-ac171976598c', u'message_id':
u'67e611e4-9d2f-11e3-81f1-080027e519cb', u'source':
u'openstack', u'counter_unit': u'instance', u'counter_volume':
1, u'project_id': u'efcca4ba425c4beda73eb31a54df931a',
u'resource_metadata': {u'instance_id':
u'48c815ab-01c9-4ac8-9096-ac171976598c', u'weighted_host':
{u'host': u'node-18', u'weight': 3818.0}, u'host':
u'scheduler.node-16', u'request_spec': {u'num_instances': 1,
u'block_device_mapping': [{u'instance_uuid':
u'48c815ab-01c9-4ac8-9096-ac171976598c', u'guest_format':
None, u'boot_index': 0, u'delete_on_termination': True,
u'no_device': None, u'connection_info': None, u'volume_id':
None, u'device_name': None, u'disk_bus': None, u'image_id':
u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'source_type':
u'image', u'device_type': u'disk', u'snapshot_id': None,
u'destination_type': u'local', u'volume_size': None}],
u'image': {u'status': u'active', u'name': u'cirrosImg',
u'deleted': False, u'container_format': u'bare',
u'created_at': u'2014-02-12T08:46:04.000000', u'disk_format':
u'qcow2', u'updated_at': u'2014-02-12T08:46:04.000000',
u'properties': {}, u'min_disk': 0, u'min_ram': 0, u'checksum':
u'50bdc35edb03a38d91b1b071afb20a3c', u'owner':
u'efcca4ba425c4beda73eb31a54df931a', u'is_public': True,
u'deleted_at': None, u'id':
u'11848cbf-a428-4dfb-8818-2f0a981f540b', u'size': 9761280},
u'instance_type': {u'root_gb': 1, u'name': u'm1.tiny',
u'ephemeral_gb': 0, u'memory_mb': 512, u'vcpus': 1,
u'extra_specs': {}, u'swap': 0, u'rxtx_factor': 1.0,
u'flavorid': u'1', u'vcpu_weight': None, u'id': 2},
u'instance_properties': {u'vm_state': u'building',
u'availability_zone': None, u'terminated_at': None,
u'ephemeral_gb': 0, u'instance_type_id': 2, u'user_data':
u'Q29udGVudC1UeXBlOiBtdWx0aXBhcnQvbWl4ZWQ7IGJvdW5kYXJ5PSI9PT0</div>
<div>...</div>
<div>, u'cleaned': False, u'vm_mode': None, u'deleted_at': None,
u'reservation_id': u'r-l91mh33v', u'id': 274,
u'security_groups': {u'objects': []}, u'disable_terminate':
False, u'root_device_name': None, u'display_name':
u'tyky-Group_A-55cklit7nvbq-Group_A-2-yis32na5m7ey', u'uuid':
u'48c815ab-01c9-4ac8-9096-ac171976598c',
u'default_swap_device': None, u'info_cache':
{u'instance_uuid': u'48c815ab-01c9-4ac8-9096-ac171976598c',
u'network_info': []}, u'hostname':
u'tyky-group-a-55cklit7nvbq-group-a-2-yis32na5m7ey',
u'launched_on': None, u'display_description':
u'tyky-Group_A-55cklit7nvbq-Group_A-2-yis32na5m7ey',
u'key_data': u'ssh-rsa
AAAAB3NzaC1yc2EAAAADAQABAAABAQC39hmz8e40Xv/+QKkLyRA7j02RfIG61cr1j41RftnkOF3ZbwBzi7qibsOA3gC9Ln05YbB6z2/iUnQzxQsoOpmlnXuv2O296utY2ZCTKhdFSzn2Ot7l635zEXkivMc97wz4bITtaBTjX3nV6sXOfevdTIOJeC11SqxmfNRRzXcz9fRv6kLjz7IrA0tvRTp2xDVtFEj+vFLWaXc3TcUSygxiSLeAuNkH1rZ9jVuHXXvzb/e7navrGyJec2P86AQg2TUk77MhLjPcbyKiJJK0DhK6zOkZUWXtgIVQx7+gO/Xs2QgQHcw+VdzRzpJK+/EOzUOU8IDWNnyfaJEnQEoX2oMj
Generated by Nova\n', u'deleted': False, u'config_drive': u'',
u'power_state': 0, u'default_ephemeral_device': None,
u'progress': 0, u'project_id':
u'efcca4ba425c4beda73eb31a54df931a', u'launched_at': None,
u'scheduled_at': None, u'node': None, u'ramdisk_id': u'',
u'access_ip_v6': None, u'access_ip_v4': None, u'kernel_id':
u'', u'key_name': u'heat_key', u'updated_at': None, u'host':
None, u'user_id': u'ef4e983291ef4ad1b88eb1f776bd52b6',
u'system_metadata': {u'instance_type_memory_mb': 512,
u'instance_type_swap': 0, u'instance_type_vcpu_weight': None,
u'instance_type_root_gb': 1, u'instance_type_name':
u'm1.tiny', u'instance_type_id': 2,
u'instance_type_ephemeral_gb': 0,
u'instance_type_rxtx_factor': 1.0, u'image_disk_format':
u'qcow2', u'instance_type_flavorid': u'1',
u'instance_type_vcpus': 1, u'image_container_format': u'bare',
u'image_min_ram': 0, u'image_min_disk': 1,
u'image_base_image_ref':
u'11848cbf-a428-4dfb-8818-2f0a981f540b'}, u'task_state':
u'scheduling', u'shutdown_terminate': False, u'cell_name':
None, u'root_gb': 1, u'locked': False, u'name':
u'instance-00000112', u'created_at':
u'2014-02-24T08:41:08.257534', u'locked_by': None,
u'launch_index': 0, u'memory_mb': 512, u'vcpus': 1,
u'image_ref': u'11848cbf-a428-4dfb-8818-2f0a981f540b',
u'architecture': None, u'auto_disk_config': False, u'os_type':
None, u'metadata': {u'metering.server_group': u'Group_A',
u'AutoScalingGroupName': u'tyky-Group_A-55cklit7nvbq',
u'custom_metadata': u'test'}}, u'security_group':
[u'default'], u'instance_uuids':
[u'48c815ab-01c9-4ac8-9096-ac171976598c']}, u'event_type':
u'scheduler.run_instance.scheduled'}, u'counter_type':
u'delta'}</div>
<div><br>
</div>
          <div>The following warnings/errors also appear, but they
            occur even when auto scaling is working properly, so they
            don't seem to have any negative effect:</div>
<div><br>
</div>
          <div><44>Feb 24 08:43:08 node-16
            ceilometer-ceilometer.transformer.conversions
            WARNING: dropping sample with no predecessor:
            &lt;ceilometer.sample.Sample object at 0x3774fd0&gt;</div>
<div><44>Feb 24 08:43:08 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
<div><44>Feb 24 08:43:08 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
<div><44>Feb 24 08:43:08 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
<div><44>Feb 24 08:43:08 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
<div><44>Feb 24 08:43:08 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
<div><44>Feb 24 08:43:08 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
<div><44>Feb 24 08:43:09 node-16
ceilometer-ceilometer.publisher.rpc AUDIT: Publishing 1
samples on metering</div>
          <div><43>Feb 24 08:43:09 node-16
            ceilometer-ceilometer.collector.dispatcher.database ERROR:
            Failed to record metering data: not okForStorage</div>
<div>Traceback (most recent call last):</div>
<div> File
"/usr/lib/python2.7/dist-packages/ceilometer/collector/dispatcher/database.py",
line 65, in record_metering_data</div>
<div> self.storage_conn.record_metering_data(meter)</div>
<div> File
"/usr/lib/python2.7/dist-packages/ceilometer/storage/impl_mongodb.py",
line 417, in record_metering_data</div>
<div> upsert=True,</div>
<div> File
"/usr/lib/python2.7/dist-packages/pymongo/collection.py", line
487, in update</div>
<div> check_keys, self.__uuid_subtype), safe)</div>
<div> File
"/usr/lib/python2.7/dist-packages/pymongo/mongo_client.py",
line 969, in _send_message</div>
<div> rv = self.__check_response_to_last_error(response)</div>
<div> File
"/usr/lib/python2.7/dist-packages/pymongo/mongo_client.py",
line 911, in __check_response_to_last_error</div>
<div> raise OperationFailure(details["err"], details["code"])</div>
<div>OperationFailure: not okForStorage</div>
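          <div>MongoDB typically reports "not okForStorage" when a document contains a field name with a '.' in it or one starting with '$'. Whether that is the cause here is an assumption, but the instance metadata in the log above does include the key metering.server_group. A minimal Python sketch (the function name and path format are hypothetical) for locating such keys in a nested sample:</div>
<pre>
```python
def find_unsafe_keys(document, path=""):
    """Yield paths of dict keys that contain '.' or start with '$'."""
    if isinstance(document, dict):
        for key, value in document.items():
            here = f"{path}/{key}" if path else key
            if "." in key or key.startswith("$"):
                yield here
            yield from find_unsafe_keys(value, here)
    elif isinstance(document, list):
        for index, item in enumerate(document):
            yield from find_unsafe_keys(item, f"{path}[{index}]")


# Metadata keys copied from the log message above.
sample_metadata = {
    "metadata": {
        "metering.server_group": "Group_A",
        "AutoScalingGroupName": "tyky-Group_A-55cklit7nvbq",
        "custom_metadata": "test",
    }
}
print(list(find_unsafe_keys(sample_metadata)))
```
</pre>
          <div>Running it over a full sample dictionary would show whether any key would be rejected by the storage backend before the sample is written.</div>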
<div><br>
</div>
<div>Br,</div>
<div>-Juha</div>
</div>
<br>
<br>
</div></div><div class=""><pre>_______________________________________________
Mailing list: <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a>
Post to : <a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a>
Unsubscribe : <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a>
</pre>
</div></blockquote>
<br>
</div>
</blockquote></div><br></div>