[openstack-dev] [Heat][Ceilometer] A proposal to enhance ceilometer alarm

Qiming Teng tengqim at linux.vnet.ibm.com
Fri Jul 4 11:40:49 UTC 2014


Hi,

In the current alarm implementation, Ceilometer sends an 'alarm' back to
Heat using the pre-signed URL (or another channel under development).
The alarm carries a payload that looks like:

 {
   alarm_id: ID
   previous: ok
   current: alarm
   reason: transition to alarm due to n samples outside threshold,
           most recent: ....
   reason_data: {
     type: threshold
     disposition: inside
     count: x
     most_recent: value
   }
 }

While this data structure is useful for some simple use cases, it can be
enhanced to carry more useful data.  Some usage scenarios are:

 - When a member of an AutoScalingGroup is dead (e.g. accidentally
   deleted), Ceilometer can detect this from an event with
   count='instance', event_type='compute.instance.delete.end'.  If an
   alarm is created from this event, the AutoScalingGroup has a chance
   to recover the member when appropriate.  The requirement is for this
   alarm to tell Heat which instance is dead.
 - When a VM connected to multiple subnets is experiencing a bandwidth
   problem, an alarm can be generated telling Heat which subnet should
   be checked.

We believe there will be many other use cases expecting an alarm to
carry some 'useful' information beyond just a state transition. Below is
a proposal to solve this.  Any comments are welcome.

1. extend the alarm with an optional parameter, say, 'output', which is
   a map or an equivalent representation.  A user can specify
   key=value pairs using this parameter, where 'key' is a name chosen
   by the user and 'value' names a field of a Sample whose value will
   be filled in here.

   e.g. --output instance=metadata.instance_id;timestamp=timestamp

2. extend the Ceilometer alarm-evaluator service, so that when it sees
   an alarm that requires output values, it tries to match each 'value'
   specified above to a field in a sample, and replaces the output
   entry with 'key=<real_value>'.

   e.g. "output": {
          "instance": "bd56bb53-d07f-49a6-8f60-6f8ef1336060",
          "timestamp": "2014-07-01T02:21:13.002155"
        }

   The above data is passed back to the alarm_url as part of its
   existing payload.

   If alarm-evaluator cannot find a matching field, it can fill in an
   empty string, or just "None".

3. extend the OS::Ceilometer::Alarm resource type in Heat so that an
   optional property (say, 'output') of type map can be used to specify
   what is expected from the Alarm.
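
To make steps 1 and 2 more concrete, here is a minimal Python sketch of
how an 'output' spec could be parsed and then resolved against a sample.
It is only an illustration, not Ceilometer code: the function names
(parse_output_spec, lookup, resolve_output) are hypothetical, the sample
is assumed to be a plain nested dict, and a value such as
'metadata.instance_id' is assumed to be a dotted path into that dict.
A field that cannot be found is filled with an empty string, as
suggested in step 2.

```python
def parse_output_spec(spec):
    """Parse 'key=field;key=field' into a dict of key -> field path."""
    mapping = {}
    for item in spec.split(';'):
        item = item.strip()
        if not item:
            continue
        key, sep, field = item.partition('=')
        if not sep or not key.strip() or not field.strip():
            raise ValueError('malformed output entry: %r' % item)
        mapping[key.strip()] = field.strip()
    return mapping


def lookup(sample, path):
    """Follow a dotted path such as 'metadata.instance_id' in nested dicts."""
    current = sample
    for part in path.split('.'):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def resolve_output(mapping, sample, missing=''):
    """Replace each field path with the value found in the sample."""
    resolved = {}
    for key, path in mapping.items():
        value = lookup(sample, path)
        # Assumption: an unmatched field becomes an empty string.
        resolved[key] = missing if value is None else value
    return resolved


# Hypothetical sample, shaped as a nested dict for illustration only.
sample = {
    'timestamp': '2014-07-01T02:21:13.002155',
    'metadata': {'instance_id': 'bd56bb53-d07f-49a6-8f60-6f8ef1336060'},
}
mapping = parse_output_spec('instance=metadata.instance_id;timestamp=timestamp')
output = resolve_output(mapping, sample)
```

Here 'output' ends up holding the instance ID and the timestamp taken
from the sample, which is the map that would be attached to the alarm
payload.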

Since it is an additional field in the 'details' argument, the impact on
existing Heat templates/users will be negligible.  However, the
expressive power of carrying back additional fields would be a great
help in scenarios we have yet to identify.
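
As a rough sketch of why existing users should not be affected, a
receiver can treat 'output' as optional and fall back to an empty map
when it is absent.  The handle_alarm function below is hypothetical and
is not Heat's actual signal handler; it only assumes that the payload
arrives as a dict with the fields shown earlier.

```python
def handle_alarm(details):
    """Return (new_state, output) from an alarm payload dict."""
    # 'output' is optional, so payloads without it keep working.
    output = details.get('output') or {}
    return details.get('current'), output


# Payload without the proposed field, and one that carries it.
old_style = {'alarm_id': 'ID', 'previous': 'ok', 'current': 'alarm'}
new_style = dict(old_style,
                 output={'instance': 'bd56bb53-d07f-49a6-8f60-6f8ef1336060'})

old_state, old_output = handle_alarm(old_style)
new_state, new_output = handle_alarm(new_style)
```

With the old payload the handler sees an empty map; with the new one it
can read the instance ID and decide which group member to recover.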

Because this is a cross-project proposal, comments from both communities
are valuable and thus appreciated.  If it is a viable approach, should
we raise a spec in each project?


Regards,
  - Qiming



