[tc][telemetry][gnocchi] The future of Gnocchi in OpenStack

Иван Романько romanko at selectel.com
Fri Aug 28 17:52:15 UTC 2020


Fri, 28 Aug 2020 at 15:40, Adrian Turjak <adriant at catalystcloud.nz>:

> But for those running Gnocchi in prod, this is likely something you may
> want to know about and we'd like to hear from you.
>

Hello, everyone!

Here at Selectel we use Gnocchi as a backend for Ceilometer: we gather
various metrics from virtual machines and show them to our customers as
graphs in a control panel. In this scenario we rely on Gnocchi's Keystone
auth support and on the nearly standard mappings for instances, volumes,
ports, etc. that are provided out of the box.

We also use Gnocchi as a secondary target for our home-grown billing system.
Billing measures are gathered from different OpenStack and custom APIs,
go through the charging engine and are then POSTed to the Gnocchi API in
batches. Here again we need to be able to fetch measures with project- and
domain-scoped tokens on the customer side in the control panel, so that we
can separate scopes for resellers (domain owners) and their clients
(project owners).
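
For illustration only, a minimal Python sketch of how such batching could
look. It assumes Gnocchi's batch endpoint, POST /v1/batch/metrics/measures,
which takes a JSON object mapping metric IDs to lists of measures. The
function names, the batch size and the metric IDs are hypothetical and not
taken from our billing system; the HTTP call and the Keystone token are left
out on purpose.

```python
import json
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_batch_bodies(points, batch_size=500):
    """Group (metric_id, timestamp, value) tuples into JSON request bodies.

    Each body maps a metric ID to a list of {"timestamp", "value"} measures,
    the shape assumed here for POST /v1/batch/metrics/measures.
    """
    bodies = []
    for chunk in chunked(points, batch_size):
        body = {}
        for metric_id, timestamp, value in chunk:
            body.setdefault(metric_id, []).append(
                {"timestamp": timestamp, "value": value}
            )
        bodies.append(json.dumps(body, sort_keys=True))
    return bodies


# Hypothetical billing points: (metric ID, ISO 8601 timestamp, value).
sample_points = [
    ("metric-a", "2020-08-28T17:00:00", 1.0),
    ("metric-b", "2020-08-28T17:00:00", 2.0),
    ("metric-a", "2020-08-28T17:05:00", 3.0),
]
sample_bodies = build_batch_bodies(sample_points, batch_size=2)
```

Each string in sample_bodies would be sent as one request body, so a failed
batch can be retried without resending the others.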

The third way we consume the Gnocchi API is through OpenStack Watcher, in
its strategy for balancing load in our regions. Here we use host metrics as
well as virtual machine metrics.

What we like about Gnocchi:
- the API is clean and easy to use, and the object model is universal, which
  lets us use it in different scenarios;
- it is fast enough for our use cases;
- it can store metrics for a long period of time with a Ceph backend with no
  performance penalty, which is useful in the billing case.

What we do not like:
- server-side aggregations do not work as one might think they should: the
  API and CLI are very hard to use, and we stopped trying to use them;
- it is very CPU and disk IO intensive: platforms are hot like hell 24/7
  while processing no more than 1k metrics per second;
- sometimes deadlocks happen in the Redis incoming metrics storage, which
  prevents measures for certain metrics from being processed.

What our plans are for the near future:
- try to switch Watcher to the Grafana backend, to be able to use the same
  Prometheus metrics we rely on for alerting and capacity planning;
- continue using Gnocchi only for VM metrics, switching the billing system
  to something more reliable in terms of missed points on graphs.

Speaking of VM metrics, it would probably be great to be able to continue
using the Gnocchi API for customer-facing features, as it works well with
the OpenStack object model, authentication and everything. But Gnocchi's
TSDB is not the best on the market. By switching it to Victoria Metrics,
which provides a Prometheus API and works amazingly well with Grafana, we
would be able to gather and store metrics with node/libvirt exporters and
Prometheus doing remote writes to Victoria, and consume them via
Grafana/AlertManager or the Gnocchi API, depending on the scenario.
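
To make the "Prometheus API" part concrete, here is a small Python sketch
that builds a range query URL in the standard Prometheus HTTP API form
(/api/v1/query_range), which Victoria Metrics also serves. The host name,
port, query and timestamps are placeholders chosen for illustration, not
values from our deployment.

```python
from urllib.parse import urlencode


def query_range_url(base_url, promql, start, end, step="60s"):
    """Build a Prometheus-style /api/v1/query_range URL.

    `start` and `end` are Unix timestamps; `step` is the query resolution.
    """
    params = urlencode(
        {"query": promql, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


# Placeholder endpoint and query; `up` is the built-in target health metric.
example_url = query_range_url(
    "http://victoria.example:8428/",
    'up{job="node"}',
    1598630400,
    1598634000,
)
```

The same URL shape works against Prometheus itself, which is what would let
Grafana and any Gnocchi-side reader share one query path.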

-- 
Ivan Romanko
Selectel