[Openstack-operators] Scaling Ceilometer compute agent?

Kris G. Lindgren klindgren at godaddy.com
Tue Jun 14 15:14:04 UTC 2016


CERN is running Ceilometer at scale with many thousands of compute nodes.  I think their blog [1] goes into some detail about it, but I don’t have a direct link to the specific post.


[1] - http://openstack-in-production.blogspot.com/
___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Bill Jones <bill.jones at sungardas.com>
Date: Tuesday, June 14, 2016 at 9:03 AM
To: "openstack-oper." <openstack-operators at lists.openstack.org>
Subject: [Openstack-operators] Scaling Ceilometer compute agent?

Has anyone had any experience with scaling ceilometer compute agents?

We're starting to see messages like this in logs for some of our compute agents:

WARNING ceilometer.openstack.common.loopingcall [-] task <function interval_task at 0x2092cf8> run outlasted interval by 293.25 sec

This indicates that the compute agent failed to finish its pipeline processing within the allotted interval (in our case 10 min). As a result, fewer instance samples are generated per hour than expected, which causes billing issues for us because of the way we calculate usage.
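
To put a rough number on the shortfall, here is a minimal sketch of the arithmetic. It assumes one sample per instance per polling cycle, and that after an overrun the next cycle starts immediately rather than waiting for the next interval boundary; both are assumptions, not confirmed agent behavior. The function name samples_per_hour is made up for illustration.

```python
def samples_per_hour(interval_s, overrun_s=0.0):
    """Estimate how many polling cycles complete in one hour.

    interval_s: configured polling interval in seconds.
    overrun_s:  how long the task outlasted the interval, as reported
                in the loopingcall warning (0 if it finished in time).
    Assumes the next cycle starts right after an overrunning one ends.
    """
    cycle_s = interval_s + max(overrun_s, 0.0)
    return 3600.0 / cycle_s


expected = samples_per_hour(600)          # 10 min interval, no overrun
actual = samples_per_hour(600, 293.25)    # overrun taken from the log line

print("expected samples/hour: %.2f" % expected)
print("estimated samples/hour with overrun: %.2f" % actual)
```

Under those assumptions, a 10 min interval should give 6 samples per instance per hour, while a sustained overrun of about 293 sec brings that down to roughly 4.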

It looks like we have three options for addressing this: make the pipeline run faster, increase the interval time, or scale the compute agents. I'm investigating the last of these.

I think I read in the ceilometer architecture docs that the agents are designed to scale, but I don't see anything in the docs on how to facilitate that. Any pointers would be appreciated.

Thanks,
Bill