[Openstack-operators] scaling gnocchi metricd

Ionut Biru - Fleio ionut at fleio.com
Wed Mar 29 06:10:53 UTC 2017


I'm not using influxdb, just the basic configuration generated by OpenStack-Ansible, which enables the file storage driver by default.
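
For context, the storage part of my gnocchi.conf looks roughly like this (the file_basepath shown is the upstream default, quoted from memory rather than from my actual config):

[storage]
driver = file
file_basepath = /var/lib/gnocchi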


The reason for bumping those values was to work through a large backlog of measures, and 400 seemed like a high number at the time.
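
To be clear, the backlog I'm talking about is the one reported by the gnocchi CLI, along these lines (output trimmed, exact layout from memory; the 300k figure is the one from my earlier mail):

$ gnocchi status
...
| storage/total number of measures to process | 300000 |
...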


I did try the values below, without any impact:

metric_processing_delay = 0

metric_reporting_delay = 1

metric_cleanup_delay = 10
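
For anyone following along, these all live in the [metricd] section of gnocchi.conf, next to the workers option I've been bumping. Roughly what I mean (the workers value here is just the one from my earlier experiment, not a recommendation):

[metricd]
workers = 100
metric_processing_delay = 0
metric_reporting_delay = 1
metric_cleanup_delay = 10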


I'm open to applying any configuration change to my setup in order to resolve this issue without the code modification that I made.

________________________________
From: Alex Krzos <akrzos at redhat.com>
Sent: Tuesday, March 28, 2017 8:19:58 PM
To: Ionut Biru - Fleio
Cc: openstack-operators at lists.openstack.org
Subject: Re: [Openstack-operators] scaling gnocchi metricd

This is interesting, thanks for sharing.  I assume you're using an
influxdb storage driver, correct?  I have also wondered if there was a
specific reason for the TASKS_PER_WORKER and BLOCK_SIZE values.

Also did you have to adjust your metric_processing_delay?


Alex Krzos | Performance Engineering
Red Hat
Desk: 919-754-4280
Mobile: 919-909-6266


On Tue, Mar 28, 2017 at 3:28 PM, Ionut Biru - Fleio <ionut at fleio.com> wrote:
> Hello,
>
>
> I have a cloud under administration. My setup is fairly basic: I have
> deployed OpenStack using OpenStack-Ansible, I'm currently on Newton, and I'm
> planning to upgrade to Ocata.
>
>
> I'm having a problem with gnocchi metricd falling behind on processing
> metrics.
>
>
> Gnocchi config: https://paste.xinu.at/f73A/
>
>
> When I'm using the default number of workers (CPU count), the "storage/total
> number of measures to process" count keeps growing; last time I had 300k in
> the queue. It seems that the tasks are not rescheduled in time to process
> them all: metricd processes a couple of metrics right after they are received
> from ceilometer, and after that the rest just sit in the queue, even though I
> only have 10 compute nodes with about 70 instances.
>
>
> In order to process them, I had to set workers to a very high number (100)
> and keep restarting metricd, but that approach is very CPU and memory
> intensive. Luckily I found another method that works quite well.
>
>
> https://git.openstack.org/cgit/openstack/gnocchi/tree/gnocchi/cli.py?h=stable/3.1#n154
>
>
> I have modified TASKS_PER_WORKER and BLOCK_SIZE to 400 and now metricd keeps
> processing them.
>
>
> I'm not sure yet whether this is a bug or not, but my question is: how do you
> guys scale gnocchi metricd in order to process a lot of resources and metrics?
>
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>