Open Stack

Fri Jan 10 13:10:57 UTC 2014

Hi team,

I've decided to move the discussion about aggregation to the mailing list.
Below is a description of my idea; I would really appreciate your comments.

*Idea:*
The goal is to improve performance when a user requests statistics for a
meter. Currently we have a fixed list of statistics (min, max and so on).
In a request, a user may specify the following params:
1. query
2. group_by
3. period

The idea of the bp is to pre-calculate the results for some kinds of
requests and store them in a separate table in the database.
These pre-calculated statistics are called aggregates. Aggregates may be
merged with each other and with any Statistics objects.
Note that aggregates will be transparent to users. No changes to the API
are required for get_statistics.

Example:
Let's assume we have 6 Samples for the 'image' meter. All of them belong to
one day (e.g. 1st May) but were taken at different times:
11.50, 12.25, 12.50, 13.25, 13.50 and 14.25. A user would like to get
statistics for this meter from start = 11.30 till end = 14.30. Without
aggregates we need to process all 6 samples.
But we may have processed these samples earlier and already have
pre-calculated results for the full hours starting at 12.00 and 13.00. In
this case we only need to read the Samples at 11.50 and 14.25 from the
"meters" table and merge their statistics with the already calculated
Statistics results from the "aggregates" table.
This example "saves" only 2 reads from the DB (4 reads instead of 6). But
if we consider metrics from pollsters with interval = 5 sec (720 Samples
per hour), we save 719 reads for every full hour covered by an aggregate.
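
To make the merge step concrete, here is a minimal sketch in Python. It is
not the draft implementation: the Stat class, the statistics() function and
the sample volumes are all made up for illustration, and plain Python lists
and dicts stand in for the "meters" and "aggregates" tables. The one point
it relies on is that count, min, max and sum can be merged exactly, and avg
can be derived from sum and count afterwards.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from functools import reduce


@dataclass(frozen=True)
class Stat:
    """Hypothetical mergeable summary; avg is derived from sum and count."""
    count: int
    min: float
    max: float
    sum: float

    @property
    def avg(self) -> float:
        return self.sum / self.count

    @staticmethod
    def of(volume: float) -> "Stat":
        # Summary of a single sample.
        return Stat(1, volume, volume, volume)

    def merge(self, other: "Stat") -> "Stat":
        return Stat(self.count + other.count,
                    min(self.min, other.min),
                    max(self.max, other.max),
                    self.sum + other.sum)


def statistics(samples, aggregates, start, end):
    """Return (merged Stat or None, rows read) for the range [start, end).

    samples: list of (timestamp, volume), standing in for the "meters" table.
    aggregates: dict of hour start -> Stat, standing in for "aggregates".
    """
    hour = timedelta(hours=1)
    # Use only the hourly aggregates that lie fully inside the range.
    covered = [h for h in aggregates if start <= h and h + hour <= end]
    parts = [aggregates[h] for h in covered]
    # Read raw samples only for the parts of the range no aggregate covers.
    for ts, volume in samples:
        if start <= ts < end and not any(h <= ts < h + hour for h in covered):
            parts.append(Stat.of(volume))
    merged = reduce(Stat.merge, parts) if parts else None
    return merged, len(parts)


day = datetime(2014, 5, 1)
times = [(11, 50), (12, 25), (12, 50), (13, 25), (13, 50), (14, 25)]
volumes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # illustrative values only
samples = [(day.replace(hour=h, minute=m), v)
           for (h, m), v in zip(times, volumes)]

# Aggregates for the full hours starting at 12.00 and 13.00.
aggregates = {
    day.replace(hour=12): Stat(2, 2.0, 3.0, 5.0),
    day.replace(hour=13): Stat(2, 4.0, 5.0, 9.0),
}

start = day.replace(hour=11, minute=30)
end = day.replace(hour=14, minute=30)
with_aggregates, reads_with = statistics(samples, aggregates, start, end)
without_aggregates, reads_without = statistics(samples, {}, start, end)
```

Both calls produce the same Stat, but the first one touches 4 rows (2
aggregates and 2 samples) while the second one touches all 6 samples.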

*Limitations:*
Of course we cannot aggregate data for all periods, group_by's and queries.
But we may allow the user to configure which queries and group_by's he or
she is interested in. For instance, this may be useful for a UI that shows
a graph with statistics for each hour. I think that the period should not
be configurable: it may only be an hour or a day.

Example of entries in db:
 image_9223372035472681807
     column=H:avg, timestamp=1389255460712, value=1.0
(I will not copy all columns. The list of columns is [column=H:min,
column=H:max, column=H:sum and so on])

Example of filtered_aggregates in db (filter is "image by project"):
 image_project_8c62fb0cd16c41498245095761b1a263_9223372035472681807
     column=H:avg, timestamp=1389255460712, value=1.0
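
A note on the row keys above, as a sketch only. The numeric suffix
9223372035472681807 equals 2^63 - 1 minus 1382094000, and 1382094000 is an
hour-aligned Unix timestamp in seconds. That suggests the key ends with a
reversed timestamp of the start of the aggregated hour, but this is an
inference from the sample keys and is not stated anywhere in this mail; the
draft review is the authority. The function name and its arguments below
are hypothetical.

```python
MAX_LONG = 2**63 - 1  # largest signed 64-bit integer


def aggregate_row_key(meter, hour_start_epoch, filter_parts=()):
    """Build a key shaped like the examples above (assumed layout).

    meter: meter name, e.g. "image".
    hour_start_epoch: start of the aggregated hour, Unix time in seconds.
    filter_parts: optional filter components, e.g. ("project", project_id).
    """
    # Assumption: the suffix is a reversed timestamp.
    reversed_ts = MAX_LONG - hour_start_epoch
    return "_".join([meter, *filter_parts, str(reversed_ts)])


plain_key = aggregate_row_key("image", 1382094000)
filtered_key = aggregate_row_key(
    "image", 1382094000,
    ("project", "8c62fb0cd16c41498245095761b1a263"))
```

With these inputs the function reproduces both example keys shown above.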

More details here: https://etherpad.openstack.org/p/ceilometer-aggregation
Draft implementation for HBase is here:
https://review.openstack.org/#/c/65681/1

Thanks for your attention,
Nadya

[openstack-dev] [Ceilometer] Aggregation discussion
