[Openstack-operators] Openstack telemetry stuff

Joe Topjian joe at topjian.net
Mon Nov 10 21:51:41 UTC 2014


Hi Kris,

We're not collecting metrics on API requests, response times, etc., but we
do collect a fair amount of stats about the hypervisors and cloud usage.

We use Sensu as our metric collector and it works pretty well. It feeds the
metrics into RabbitMQ, where Graphite then picks them up. We have another
cloud that uses collectd, and I really have no complaints about that either.
Diamond and Ganglia are other options.

We use Grafana for rendering both Whisper- and RRD-based graphs.

For general health monitoring/metrics, we've found everything we need in
the sensu community repository
<https://github.com/sensu/sensu-community-plugins>. Most of the stuff in
that repo can be modified for other systems if you don't use Sensu.

For cloud usage, I have a script that runs "nova usage-list" daily, parses
the results, and feeds them into Graphite (something along the lines of
"projects.uuid.cloud_usage.instances").

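As a rough illustration of that daily job, here is a minimal Python sketch.
It is not my actual script. It assumes the command prints a prettytable-style
ASCII table and that the columns of interest are called "Tenant ID" and
"Servers" (check the header your client version prints). The function names,
host, and port default are made up for the example; the line format is
Graphite's plaintext protocol ("path value timestamp").

```python
import socket


def parse_table(text):
    """Parse a prettytable-style ASCII table into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders and any preamble text
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def to_graphite_lines(rows, timestamp,
                      id_column="Tenant ID", value_column="Servers"):
    """Build one plaintext-protocol line per project.

    The column names are assumptions; adjust them to the real header.
    """
    lines = []
    for row in rows:
        path = "projects.%s.cloud_usage.instances" % row[id_column]
        lines.append("%s %s %d" % (path, row[value_column], timestamp))
    return lines


def send_to_graphite(lines, host="localhost", port=2003):
    """Send the lines to a carbon plaintext listener (hypothetical host)."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
```

In a cron job you would capture the command output (for example with
subprocess.check_output), pass it through parse_table and to_graphite_lines
with int(time.time()) as the timestamp, and hand the result to
send_to_graphite.
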
I've also been experimenting with polling the actual usage of individual
VMs. I'm running this script
<https://github.com/osops/tools-generic/blob/master/libvirt/instance_metrics.rb>
on all of my compute nodes, which returns stats about individual instances.
A second script then connects each instance to a project and stores the
result in either Graphite or collectd.
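
The "connect the instance to a project" step could look roughly like the
Python sketch below. This is only an illustration, not the second script
itself: the input shapes (a dict of per-instance stats and a dict mapping
instance IDs to project IDs), the function name, and the metric path layout
are all assumptions made for the example.

```python
def instance_metric_paths(instance_stats, instance_to_project):
    """Join per-instance stats with their owning project.

    instance_stats:      {instance_id: {metric_name: value}}
    instance_to_project: {instance_id: project_id}
    Returns a list of (metric_path, value) tuples.
    """
    metrics = []
    for instance_id, stats in sorted(instance_stats.items()):
        project_id = instance_to_project.get(instance_id)
        if project_id is None:
            continue  # instance not known to the cloud (e.g. just deleted)
        for name, value in sorted(stats.items()):
            # Hypothetical path layout, grouped by project first.
            path = "projects.%s.instances.%s.%s" % (
                project_id, instance_id, name)
            metrics.append((path, value))
    return metrics
```

Grouping by project first keeps per-project rollups cheap on the graphing
side, since a single wildcard over the instance level covers the whole
project.
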

All of our Whisper and RRD files are stored on ZFS. Whisper and RRD are
both fixed-size databases, so each file is preallocated at its full size,
and ZFS compression works tremendously well on them. For example, the
retention configuration we have for Graphite makes each Whisper file (a
single metric) approximately 3.4 MB. With compression, it becomes
approximately 400 KB.
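
To put numbers on that, here is a small Python sketch of the arithmetic. A
Whisper file has a 16-byte header, 12 bytes of metadata per archive, and 12
bytes per data point, so its size follows directly from the retention
schedule. The schedule below is hypothetical (it is not our actual
configuration), and the ratio assumes decimal units for the sizes quoted
above.

```python
METADATA_SIZE = 16      # file header
ARCHIVE_INFO_SIZE = 12  # per-archive header
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double

DAY = 86400


def whisper_file_size(archives):
    """Size in bytes of a Whisper file.

    archives: list of (seconds_per_point, number_of_points) tuples.
    """
    points = sum(count for _, count in archives)
    return (METADATA_SIZE
            + ARCHIVE_INFO_SIZE * len(archives)
            + POINT_SIZE * points)


def compression_ratio(raw_bytes, compressed_bytes):
    """How many times smaller the compressed file is."""
    return raw_bytes / compressed_bytes


# Hypothetical schedule: 10s for 7 days, 1m for 30 days, 10m for 1 year.
HYPOTHETICAL_ARCHIVES = [
    (10, 7 * DAY // 10),
    (60, 30 * DAY // 60),
    (600, 365 * DAY // 600),
]

if __name__ == "__main__":
    print(whisper_file_size(HYPOTHETICAL_ARCHIVES))
    print(compression_ratio(3.4e6, 400e3))
```

With the sizes quoted above, the ratio comes out to roughly 8.5x.
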

Hope that helps,
Joe

On Mon, Nov 10, 2014 at 10:23 PM, Kris G. Lindgren <klindgren at godaddy.com>
wrote:

>   Hello Operators,
>
>  I was wondering what you are using to gather OpenStack telemetry metrics?
>
>  I was looking at things like OpenStack service API requests/s, response
> times (if possible), errors/s, RabbitMQ metrics, and, if possible, pending
> or in-progress tasks, etc.  Basically your more advanced and yet
> basic monitoring around the OpenStack services.  We run an ELK
> (Elasticsearch, Logstash, Kibana) infrastructure and I was wondering if
> anyone has a statsd/Graphite output config for OpenStack log data ->
> Graphite to gather some of the metrics?  If not, how are you currently
> doing it?
>
>  Additionally, can anyone share what you are doing to get hypervisor
> health/VM statistics?  We are running Ceilometer, but I haven't been very
> happy with the results.  I am also thinking of something like
> statsd/Graphite here as well.  But if you have something that works for you
> and can share it, please DO!!  ANYTHING is welcome.
>  ____________________________________________
>
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy, LLC.
>