<div dir="ltr"><div><div><div><div><div><div>Jay,<br></div><br>Thanks for the comments!<br></div>The question you raised has been discussed several times within the Ceilometer team, but as far as I understand there is no official resolution yet. <br>

</div>I agree with you that collecting statistics is Ceilometer's main goal. But at the same time there should be a way to visualize the result of Ceilometer's work.<br></div>My opinion is that the current set of data queries (get samples, get statistics) is the minimum set, and it's OK to keep it as part of Ceilometer's functionality. We need it at least for the UI.<br>

</div><br></div>So, my proposal is to make these existing queries faster. Looks like our visions are the same :)<br><br><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">

Pre-calculate when? :) During processing of samples, or during some<br>

periodic job?</blockquote><div><br></div><div>It should be a periodic job, right. <br></div><br><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><div>

<div>The term "aggregate" really just means a generic grouping or<br>

summarization. If you are looking for a term that represents the<br>

rules/heuristics for maintaining rolling calculations, perhaps the term<br>

"report" is better?</div></div></blockquote><div><br></div><div>Hmm, I think that 'aggregate' is OK. In "my terminology" an aggregate is a precomputed set of statistics for a specific meter, period and query. Anyway, I will think about it. <br>

<br></div><div>Thanks, <br>Nadya<br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Jan 13, 2014 at 2:13 AM, Jay Pipes <span dir="ltr">&lt;<a href="mailto:jaypipes@gmail.com" target="_blank">jaypipes@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Fri, 2014-01-10 at 17:10 +0400, Nadya Privalova wrote:<br>

> Idea:<br>

> The goal is to improve performance when a user gets statistics for a<br>

> meter. Now we have a fixed list of statistics (min, max and so on).<br>

> During a request a user may specify the following params:<br>

> 1. query<br>

> 2. group_by<br>

> 3. period<br>

><br>

> The idea of the bp is to pre-calculate some kinds of requests and store<br>

> them in a separate table in the database.<br>

<br>

</div>Pre-calculate when? :) During processing of samples, or during some<br>

periodic job?<br>

<div class="im"><br>

> The pre-calculated statistics are called aggregates.<br>

<br>

</div>The term "aggregate" really just means a generic grouping or<br>

summarization. If you are looking for a term that represents the<br>

rules/heuristics for maintaining rolling calculations, perhaps the term<br>

"report" is better?<br>

<div class="im"><br>

> Aggregates may be merged with each other and with any Statistics<br>

> objects.<br>

> Note that aggregates will be transparent for users. No changes in the API<br>

> are required for get_statistics.<br>

><br>

> Example:<br>

> Let's assume we have 6 Samples for the 'image' meter. All of them belong<br>

> to one day (e.g. 1st May) but occurred at different times:<br>

> 11.50, 12.25, 12.50, 13.25, 13.50 and 14.25. The user would like to get<br>

> statistics for this meter from start = 11.30 till end = 14.30. So we<br>

> need to process all samples.<br>

> But we may process these samples earlier and already have<br>

> pre-calculated results for the full hours 12.00 and 13.00. In this case we<br>

> may get Samples 11.50 and 14.25 from the "meters" table and merge<br>

> statistics for them with the already calculated Statistics results from the<br>

> "aggregates" table.<br>

> This example "saved" only 2 reads from the DB. But if we consider metrics<br>

> from pollsters with interval = 5 sec (720 Samples per hour) we will<br>

> save 719 reads per hour by using an aggregate.<br>
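
<br>

As a rough sketch of the example above (not Ceilometer code: the Stats class, the in-memory "meters" and "aggregates" tables, the helper names and the sample volumes are all assumptions made for illustration), merging works when each aggregate stores count, min, max and sum and derives avg from them:<br>

<pre>
```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Stats:
    """Mergeable summary; avg is derived so that merging stays exact."""
    count: int
    min: float
    max: float
    sum: float

    @property
    def avg(self) -> float:
        return self.sum / self.count


def stats_of(volumes):
    """Compute statistics directly from raw sample volumes."""
    return Stats(len(volumes), min(volumes), max(volumes), sum(volumes))


def merge(a, b):
    """Combine two summaries without re-reading the underlying samples."""
    return Stats(a.count + b.count, min(a.min, b.min),
                 max(a.max, b.max), a.sum + b.sum)


def hm(hour, minute):
    """Time of day as minutes since midnight."""
    return hour * 60 + minute


# Hypothetical "meters" table: (timestamp, volume) for the six samples.
meters = [(hm(11, 50), 10.0), (hm(12, 25), 20.0), (hm(12, 50), 30.0),
          (hm(13, 25), 40.0), (hm(13, 50), 50.0), (hm(14, 25), 60.0)]

# Hypothetical "aggregates" table, filled earlier by the periodic job:
# one Stats row per full hour, keyed by the start of that hour.
aggregates = {
    hour_start: stats_of([v for t, v in meters
                          if hour_start + 60 > t >= hour_start])
    for hour_start in (hm(12, 0), hm(13, 0))
}


def get_statistics(start, end):
    """Merge stored hourly aggregates with raw samples outside them."""
    # Use only aggregates whose full hour lies inside the requested range.
    covered = [h for h in aggregates if h >= start and end >= h + 60]
    parts = [aggregates[h] for h in covered]
    # Only samples not covered by an aggregate are read from "meters".
    raw = [v for t, v in meters
           if end > t >= start
           and not any(h + 60 > t >= h for h in covered)]
    if raw:
        parts.append(stats_of(raw))
    # Note: this sketch does not handle a range with no data at all.
    return reduce(merge, parts)


result = get_statistics(hm(11, 30), hm(14, 30))
```
</pre>

Here the query reads the two hourly aggregates plus the two raw samples at 11.50 and 14.25, and the merged result matches statistics computed over all six samples.<br>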

<br>

</div>Hmm. So, aggregation and grouping are the domain of data warehousing and<br>

OLAP servers. I don't believe that putting this functionality directly<br>

into Ceilometer is a good idea. I believe it would be better to<br>

delegate this kind of functionality to well-known and used tools like<br>

Pentaho [1], which can use a variety of different backend storage<br>

systems.<br>

<br>

Bottom line, I believe Ceilometer should focus strictly on the<br>

collection of samples/alarms, the pre-processing of those things, and<br>

the storage of those things. However, I do not believe that Ceilometer<br>

should become an OLAP analytics tool when existing ones already fill<br>

that need.<br>

<br>

Best,<br>

-jay<br>

<br>

[1] <a href="http://www.pentaho.com/5.0" target="_blank">http://www.pentaho.com/5.0</a> &amp;<br>

<a href="http://en.wikipedia.org/wiki/Mondrian_OLAP_server" target="_blank">http://en.wikipedia.org/wiki/Mondrian_OLAP_server</a><br>


</blockquote></div><br></div>