[Openstack-operators] scaling gnocchi metricd
Ionut Biru - Fleio
ionut at fleio.com
Tue Mar 28 19:28:08 UTC 2017
Hello,
I administer a cloud with a fairly basic setup. I deployed OpenStack using OpenStack-Ansible; I'm currently on Newton and planning to upgrade to Ocata.
I'm having a problem with gnocchi metricd falling behind on processing metrics.
Gnocchi config: https://paste.xinu.at/f73A/
If I use the default number of workers (the CPU count), the value of "storage/total number of measures to process" keeps growing; last time I had 300k in the queue. It seems that the tasks are not rescheduled in order to process them all in time: metricd processes a couple of metrics right after they are received from ceilometer, and after that the rest are kept in the queue. I only have 10 compute nodes with about 70 instances.
In order to process the backlog I had to set workers to a very high number (100) and keep restarting metricd, but this method is very CPU and memory intensive. Luckily I found another method that works quite well.
https://git.openstack.org/cgit/openstack/gnocchi/tree/gnocchi/cli.py?h=stable/3.1#n154
I changed TASKS_PER_WORKER and BLOCK_SIZE to 400, and now metricd keeps processing the measures.
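A rough sketch of why raising these constants might help, under an assumption that is not confirmed here: that the scheduler hands out about workers * TASKS_PER_WORKER metrics per scheduling pass. The function names, the worker count of 8, and the comparison value of 16 are hypothetical and only illustrate the arithmetic; 400 and 300k are the numbers from this setup. Treating each backlog item as one unit of work is also a simplification.

```python
import math


def metrics_per_pass(workers: int, tasks_per_worker: int) -> int:
    """Units of work handed out in one scheduling pass.

    ASSUMPTION: one pass covers workers * tasks_per_worker items.
    This is a simplified model, not Gnocchi's actual code.
    """
    return workers * tasks_per_worker


def passes_to_drain(backlog: int, workers: int, tasks_per_worker: int) -> int:
    """Passes needed to clear a fixed backlog, ignoring newly arriving measures."""
    return math.ceil(backlog / metrics_per_pass(workers, tasks_per_worker))


if __name__ == "__main__":
    backlog = 300_000   # queue size reported above
    workers = 8         # hypothetical CPU count
    for tasks_per_worker in (16, 400):  # 16 is an illustrative smaller value
        print(
            f"TASKS_PER_WORKER={tasks_per_worker}: "
            f"{metrics_per_pass(workers, tasks_per_worker)} per pass, "
            f"{passes_to_drain(backlog, workers, tasks_per_worker)} passes to drain"
        )
```

Under this model the work per pass grows linearly with TASKS_PER_WORKER, so a larger value clears the same backlog in proportionally fewer passes without adding worker processes.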
I'm not sure yet whether this is a bug, but my question is: how do you scale gnocchi metricd in order to process a lot of resources and metrics?