[openstack-dev] [Ceilometer] time consuming of listing resource

Jay Pipes jaypipes at gmail.com
Sat Dec 28 16:16:47 UTC 2013


On 12/28/2013 05:51 AM, 刘胜 wrote:
> Hi all:
> I have reported a bug about time consuming of “resource-list” in
> ceilometer CLI:
> https://bugs.launchpad.net/ceilometer/+bug/1264434
>
> In order to Identify the causes of this phenomenon, I have pdb the codes
> in my invironment(configured  mysql as db driver):
> the most import part of process of listing resource is implemented in
> following codes:
>
> code of get_resources() in /ceilometer/storage/impl_sqlalchemy.py:
> …………
>   for meter, first_ts, last_ts in query.all():
>              yield api_models.Resource(
>                  resource_id=meter.resource_id,
>                  project_id=meter.project_id,
>                  first_sample_timestamp=first_ts,
>                  last_sample_timestamp=last_ts,
>                  source=meter.sources[0].id,
>                  user_id=meter.user_id,
>                  metadata=meter.resource_metadata,
>                  meter=[
>                      api_models.ResourceMeter(
>                          counter_name=m.counter_name,
>                          counter_type=m.counter_type,
>                          counter_unit=m.counter_unit,
>                      )
> for m in meter.resource.meters
>                  ],
>              )
> The method  generate iterator of object of api_models.Resource for
> ceilometer API to show.
> 1.The operation “query.all()” will query the DB table “meter” with the
> expression generated forward,in my invironment the DB table “meter” have
> more than 300000 items, so this operation may consume about 30 seconds;
> 2.The operation"for m in meter.resource.meters" will circulate the
> meters of this resource . a resource of server may have more than 100000
> meter iterms in my invironment.  So the time of whole process is too
> long. I think the meter of Resource object can be reduced and I have
> tested this modification, it is OK for listing resource,and reduce the
> most time consumption
>
> I have noticed that there are many methods of db operation may time
> consumption.
>
> ps: I have configured the ceilometer pulling interval from 600s to 60s
> in /etc/ceilometer/pipeline.yaml, but the invironment has just run 10 days!
>
> I'm a beginner of ceilometer,and want to fix this bug,but I haven't
> found a suitable way
> may be someone can help me with this?

Yep. The performance of the SQL driver in Ceilometer out-of-the-box with 
that particular line is unusable in our experience. We have our Chef 
cookbook literally patch Ceilometer's source code and comment out that 
particular line because it makes performance of Ceilometer unusable.

I hate to say it, but the SQL driver in Ceilometer really needs an 
overhaul, both at the schema level and the code level:

On the schema level:

* The indexes, especially on sourceassoc, are wrong:
  ** The order of the columns in the multi-column indexes like idx_sr, 
idx_sm, idx_su, idx_sp is incorrect. Columns used in predicates should 
*precede* columns (like source_id) that are used in joins. The way the 
indexes are structured now makes them unusable by the query optimizer 
for 99% of queries on the sourceassoc table, which means any queries on 
sourceassoc trigger a full table scan of the hundreds of millions of 
records in the table. Things are made worse by the fact that INSERT 
operations are slowed for each index on a table, and the fact that none 
of these indexes are used just means we're wasting cycles on each INSERT 
for no reason.
  ** The indexes are across the entire VARCHAR(255) field width. This 
isn't necessary (and I would argue that the base field type should be 
smaller). Index width can be reduced (and performance increased) by 
limiting the indexable width to 32 (or smaller).

The solution to the main indexing issues is to do the following:

DROP INDEX idx_sr ON sourceassoc;
CREATE INDEX idx_sr ON sourceassoc (resource_id(32), source_id(32));
DROP INDEX idx_sp ON sourceassoc;
CREATE INDEX idx_sp ON sourceassoc (project_id(32), source_id(32));
DROP INDEX idx_su ON sourceassoc;
CREATE INDEX idx_su ON sourceassoc (user_id(32), source_id(32));
DROP INDEX idx_sm ON sourceassoc;
CREATE INDEX idx_sm ON sourceassoc (meter_id, source_id(32));

Keep in mind if you have (hundreds of) millions of records in the 
sourceassoc table, the above will take a long time to run. It will take 
hours, but you'll be happy you did it. You'll see the database 
performance increase dramatically.

* The columns that refer to IDs of various kinds should not be UTF8. 
Changing these columns to a latin1 or even binary charset would cut the 
space requirements for the data and index storage by 65%. This means you 
can fit around 3x as many records in the same data and index pages. The 
more records you fit into an index page, the faster seeks and scans will be.

* sourceassoc has no primary key.

* The meter table has the following:

   KEY ix_meter_id (id)

   which is entirely redundant (id is the primary key) and does nothing 
but slow down insert operations for every record in the meter table.

* The meter table mixes frequently searched and aggregated fields (like 
timestamp, counter_type, project_id) with infrequently accessed fields 
(like resource_metadata, which is a VARCHAR(5000)). This leads to poorer 
performance of aggregate queries on the meter table that use the 
clustered index (primary key) in aggregation (for an example, see the 
particular line of code that we comment out of Ceilometer above). A 
better performing schema would consolidate slim, frequently accessed 
fields into the main meter table and move infrequently accessed or 
searched fields into a meter_extra table. This would mean many more 
records of the main meter table can fit into a single InnoDB data page 
(the clustered index), which means faster seeks and scans for 99% of 
queries on that table.

On the code level there are a variety of inefficient queries that are 
generated, and there are a number of places where using something like a 
memcache caching layer for common lookup queries could help reduce load 
on the DB server.

I'm hoping to push some patches in the early part of 2014 that address 
performance and scalability issues in the SQL driver for Ceilometer.

Best,
-jay



More information about the OpenStack-dev mailing list