[openstack-dev] [ceilometer] Multiple publisher and transformer
Jiang, Yunhong
yunhong.jiang at intel.com
Fri Nov 23 02:25:15 UTC 2012
> -----Original Message-----
> From: Julien Danjou [mailto:julien at danjou.info]
> Sent: Friday, November 23, 2012 12:56 AM
> To: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] [ceilometer] Multiple publisher and transformer
>
> On Thu, Nov 22 2012, Eoghan Glynn wrote:
>
> > The pollster or notification handler would basically just stuff all
> > the available raw data into the sample kwargs. The transformers would
> > then have to know which named args to expect and how to interpret the
> > resource_obj.
>
> I don't like the idea of having the transformer guess which args it is
> going to get.
Same here. That would make the information transformer-specific.
>
> To me, the pollster or notifications handler should be responsible to
> emit the maximum amount of "counters" it can from what it gets. And for
> each counter, you would pass it through a transformer, mangling the
> value to some other thing.
Yes. And I think the current Counter (or sampler, as discussed on IRC) needs some changes. For example, in compute/libvirt.py/NetPollster only the vnic information is kept; all information about the corresponding instance, except instance_id, is removed. We need to keep all the information through the whole transformer chain until some transformer or publisher is sure it can be removed, most likely the format transformer for the publisher.
>
> > From gerrit:
> >
> > The point that I had envisaged for transformers was to factor
> > out the detailed per-measurement knowledge from the corresponding
> > publisher.
> >
> > Take for example the stats related to disk I/O reported by the
> > hypervisor driver. Before these data can be pushed up to CloudWatch,
> > something has to know that the metric names are 'DiskWriteBytes',
> > 'DiskReadBytes' etc., the namespace is 'AWS/EC2', the dimensions
> > include {'InstanceId': ID}, and the unit is Bytes.
> >
> > So the idea was to avoid encoding all that knowledge in the CW
> > publisher, instead leaving the publisher simple and slow-changing
> > and unaffected by new metrics being added to the mix.
> >
> > Does that make sense at all?
>
> Yes, it totally does.
Yes. And JD's idea is good. I think CW will involve several format transformers. The first changes values, like disk.io to DiskWriteBytes; the second maps dictionary keys, like resource_id to dimensions; the third deletes all unneeded key-value pairs.
All three steps can be generic or publisher-specific, depending on the implementation. If they are generic, we need to pass some publisher-specific configuration to the transformer.
To me, all three of the above can be generic transformers.
The only concern is whether performance will be impacted if the transformer chain is too long :)
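To make the three steps concrete, here is a minimal Python sketch of such a chain. Everything in it is an assumption for illustration only: the function names (rename_value, map_keys, select_fields, run_chain), the plain-dict sample, and the target key names "MetricName" and "InstanceId" are hypothetical and are not existing ceilometer code. The sample values are taken from JD's example quoted in this thread.

```python
# Hypothetical sketch: none of these names exist in ceilometer.

def rename_value(sample, field, mapping):
    """Step 1: change a value, e.g. 'disk.io' -> 'DiskWriteBytes'."""
    out = dict(sample)  # copy so the original sample is left untouched
    out[field] = mapping.get(out[field], out[field])
    return out


def map_keys(sample, key_map):
    """Step 2: rename dictionary keys; unknown keys pass through."""
    return {key_map.get(k, k): v for k, v in sample.items()}


def select_fields(sample, keep):
    """Step 3: delete every key-value pair not listed in keep."""
    return {k: v for k, v in sample.items() if k in keep}


def run_chain(sample, chain):
    """Apply each (transformer, args) pair in order."""
    for transformer, args in chain:
        sample = transformer(sample, *args)
    return sample


sample = {
    "resource_name": "disk.io",
    "resource_id": "instanceid.vda",
    "user_id": "qwerty789",
    "tenant_id": "abcdef123",
    "value": 123456,
}

# Publisher-specific configuration handed to generic transformers.
# The target key names are illustrative assumptions.
cw_chain = [
    (rename_value, ("resource_name", {"disk.io": "DiskWriteBytes"})),
    (map_keys, ({"resource_name": "MetricName",
                 "resource_id": "InstanceId"},)),
    (select_fields, ({"MetricName", "InstanceId", "value"},)),
]

cw_sample = run_chain(sample, cw_chain)
```

In this sketch the transformers themselves stay generic, and only the arguments in cw_chain are specific to the publisher. Each step is a single pass over a small dict, so the per-sample cost grows linearly with the chain length.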
>
> The solution I'd imagine for that is to have all the pollsters emit
> something equivalent (but probably simpler) to Counter like for example
Instead of having the pollster emit simpler data, I still suggest keeping everything in the Counter when the pollster sends out the first-step data. I'd then have a transformer cut the unused fields.
> in this case:
>
> { resource_name = 'disk.io',
> resource_id = 'instanceid.vda',
> user_id = 'qwerty789',
> tenant_id = 'abcdef123',
> value = 123456 }
>
> So for CW you wouldn't transform the value, but the resource_name to
> some other thing. This could be achieved via a transformer named
> "RenameResourceName" which could be "configured" with a map:
>
> { "disk.io": "DiskWriteBytes",
> … }
>
> So it's kind of generic and you can even use it to do some other stuff.
>
> Does that make sense?
Yes.
>
> > Yeah, that's an idea. Wouldn't transformers have to be chained in that
> > case? So for example the relevant CW transformer chain would be:
> >
> > generic-transformer-calculating-cpu-util-from-cumulative-time -->
> > CW-specific-transformer-outputting-as-CPUUtilization-datapoint
> >
> > That would be fine, just wanted to call out a potential extra
> > bit of complexity to capture in the pipeline config.
>
> Yes, I think we'll need to chain them. Don't know how we can do that
> easily in our configuration file.
Below is the configuration file I have in mind:
[publisher.cw.pipe]
*=generic-value-changes:config_file, generic-key-mapping:config_file, generic-field-selection:field_list
*.cpu = generic-transformer-calculating-cpu-util-from-cumulative-time:interval,*
The format is below; it should possibly be written in BNF :$
Pollster_source.meter_data=transformer_name:transformer_parameters, transformer_name:transformer_parameters,.....
And "*" applies here.
I added the pollster_source because it's the only information that will not be included in the Counter (or Sampler), although I'm not sure how useful it is.
Is it ok? Too complex?
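As a rough check of how complex this is to handle, here is a minimal Python sketch of a parser for one line of that format. It is only an assumption of how the line might be read, not existing ceilometer code: parse_pipe_line is a hypothetical name, a key without a "." is treated as applying to every meter, and a bare "*" in the chain is returned as an entry with no parameters, leaving its meaning to the caller.

```python
# Hypothetical parser for one line of the proposed format:
#   pollster_source.meter_data=name:params, name:params, ...

def parse_pipe_line(line):
    """Return ((source, meter), [(transformer_name, params_or_None), ...])."""
    key, _, value = line.partition("=")
    # Assumption: a key without "." (such as "*") matches every meter.
    source, _, meter = key.strip().partition(".")
    chain = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        name, _, params = item.partition(":")
        chain.append((name.strip(), params.strip() or None))
    return (source, meter or "*"), chain


target, chain = parse_pipe_line(
    "*.cpu = generic-transformer-calculating-cpu-util-from-cumulative-time"
    ":interval,*"
)
```

Here target is ("*", "cpu") and chain holds the cpu-util transformer with its "interval" parameter followed by the bare "*" entry. The parsing itself is only a few lines, so most of the complexity would be in deciding what the wildcards mean.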
>
> OTOH, I imagine that most of the transformers configuration is going to
> be pretty generic, like for CW. So we could define and provide a default
I'm not sure they will be generic enough; for example, the CW changes will be publisher-specific.
> JSON file with all the default pipelining set-up correctly for most
> counters and publishers.
What do you mean by "all the default pipelining"? I suppose the whole pipeline definition will be predefined. Of course, we will provide detailed documentation for each transformer, so that advanced users can create their own pipeline configuration if needed.
Thanks
--jyh
>
> --
> Julien Danjou
> // Free Software hacker & freelance
> // http://julien.danjou.info