[Openstack-operators] scaling gnocchi metricd

gordon chung gord at live.ca
Wed Mar 29 16:04:09 UTC 2017



On 28/03/17 03:28 PM, Ionut Biru - Fleio wrote:
> Hello,
>
>
> I administer a cloud with a fairly basic setup. I deployed OpenStack
> using OpenStack-Ansible; I'm currently on Newton and planning to
> upgrade to Ocata.
>
>
> I'm having a problem with gnocchi metricd falling behind on processing
> metrics.
>
>
> Gnocchi config: https://paste.xinu.at/f73A/
>
>
> If I use the default number of workers (CPU count), the value of
> "storage/total number of measures to process" keeps growing; last time I
> had 300k in the queue. It seems that the tasks are not rescheduled in
> time to process them all: metricd processes a couple of metrics after
> they are received from ceilometer, and after that they are kept in the
> queue. I only have 10 compute nodes with about 70 instances.

i should mention that a backlog isn't necessarily a bad thing. gnocchi
has a slightly different paradigm from your classical databases. metricd
is one or more processes designed to improve read performance: it does
the computation in the background, but it is not strictly required.
gnocchi can also serve reads while a backlog exists if you pass
'refresh=True'. this tells gnocchi to do the computations (if needed) at
call time.
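
as a rough illustration of the 'refresh' idea, here is a minimal sketch
that only builds the URL for a read with refresh enabled. the host name
and metric id are made up, and the path layout and the lowercase 'true'
spelling are assumptions to check against the REST API docs for your
gnocchi version.

```python
from urllib.parse import urlencode


def measures_url(endpoint, metric_id, refresh=False):
    """Build a measures URL, optionally asking for refresh at call time.

    Assumption: measures are read from /v1/metric/<id>/measures and the
    flag is passed as a 'refresh' query parameter.
    """
    url = f"{endpoint.rstrip('/')}/v1/metric/{metric_id}/measures"
    query = {"refresh": "true"} if refresh else {}
    return f"{url}?{urlencode(query)}" if query else url


# hypothetical endpoint and metric id, for illustration only
print(measures_url("http://gnocchi.example.com:8041", "my-metric-id", refresh=True))
```

a read with refresh pays the computation cost at call time, so it is
best kept for the occasional query rather than used for every request.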

>
>
> In order to process them I had to set the number of workers very high
> (100) and keep restarting metricd, but this method is very CPU and
> memory intensive. Luckily I found another method that works quite well.
>
>
> https://git.openstack.org/cgit/openstack/gnocchi/tree/gnocchi/cli.py?h=stable/3.1#n154
>
>
> I have modified TASKS_PER_WORKER and BLOCK_SIZE to 400 and now metricd
> keeps processing them.
>
>
> I'm not sure yet whether this is a bug, but my question is: how do you
> guys scale gnocchi metricd in order to process a lot of resources and
> metrics?
>

i mentioned this in my other reply, but you can modify TASKS_PER_WORKER
without worry. you don't need to modify BLOCK_SIZE; it is used to split
work across the metricd agent's workers. if you set TASKS_PER_WORKER
and BLOCK_SIZE to the same value, it may hurt task distribution (some
workers may sit idle). if you modify TASKS_PER_WORKER, just be aware of
the processing_delay value you use. again, this is probably not relevant
for gnocchi 4.0, as we are trying to make it schedule metrics more
fairly.
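
to make the idle-worker point concrete, here is a toy model of a single
scheduling pass. it is not gnocchi's code: the batch-size rule, the
chunking rule, and the sample values (4 workers, 500 pending metrics)
are all assumptions chosen for illustration.

```python
import math


def plan_pass(pending, workers, tasks_per_worker, block_size):
    """Toy model of one scheduling pass (an assumption, not gnocchi's code)."""
    # assumed: one pass picks up at most workers * tasks_per_worker metrics
    batch = min(pending, workers * tasks_per_worker)
    # assumed: the batch is cut into chunks of block_size metrics,
    # and each chunk is handled by a single worker
    chunks = math.ceil(batch / block_size)
    # workers left with no chunk to pick up in this pass
    idle = max(0, workers - chunks)
    return batch, chunks, idle


for tpw, bs in [(16, 4), (400, 400)]:
    batch, chunks, idle = plan_pass(
        pending=500, workers=4, tasks_per_worker=tpw, block_size=bs
    )
    print(f"TASKS_PER_WORKER={tpw} BLOCK_SIZE={bs}: "
          f"batch={batch} chunks={chunks} idle_workers={idle}")
```

in this model, small chunks give every worker something to do, while
setting both values to 400 turns 500 pending metrics into only two
chunks, so two of the four workers have nothing to pick up.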

cheers,

-- 
gord


