<div dir="ltr"><div dir="ltr" class="gmail_attr">Fri, 28 Aug 2020 at 15:40, Adrian Turjak <<a href="mailto:adriant@catalystcloud.nz">adriant@catalystcloud.nz</a>>:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
But for those running Gnocchi in prod, this is likely something you may <br>
want to know about and we'd like to hear from you.<br>
</blockquote></div><div><br></div><div>Hello, everyone!<br><br>Here at Selectel we use Gnocchi as a backend for Ceilometer: we gather<br>various metrics from virtual machines and show them to our customers as graphs in<br>a control panel. In this scenario we rely on Gnocchi's Keystone auth support and<br>the nearly standard mappings for instances, volumes, ports, etc. provided out of the<br>box.<br><br>We also use Gnocchi as a secondary target for our home-grown billing system.<br>Billing measures are gathered from various OpenStack and custom APIs,<br>go through the charging engine, and are then POSTed to the Gnocchi API in batches.<br>Here again we need to be able to fetch measures with project-scoped and domain-scoped<br>tokens on the customer side in the control panel, so that we can separate scopes<br>for resellers (domain owners) and their clients (project owners). <br><br>The third consumer of the Gnocchi API is OpenStack Watcher, in its<br>strategy for balancing load in our regions. 
Here we use host metrics as well as<br>virtual machine metrics.<br><br>What we like about Gnocchi:<br>- The API is clean and easy to use, and the object model is generic enough for us to<br>use it in different scenarios;<br>- It is fast enough for our use cases;<br>- It can store metrics for a long period of time with a Ceph backend with no<br>performance penalty, which is useful in the billing case.<br><br>What we do not like:<br>- Server-side aggregations do not work the way one would expect: the API<br>and CLI are very hard to use, and we stopped trying to use them;<br>- It is very CPU and disk I/O intensive: our platforms run hot as hell 24/7 while processing<br>no more than 1k metrics per second;<br>- Deadlocks sometimes happen in the Redis incoming metrics storage, preventing<br>measures for certain metrics from being processed.<br><br>Our plans for the near future:<br>- Try to switch Watcher to the Grafana backend so that it can use the same Prometheus<br>metrics we rely on for alerting and capacity planning;<br>- Continue using Gnocchi only for VM metrics, and switch the billing system to<br>something more reliable in terms of missed points on graphs.<br><br>Speaking of VM metrics, it would probably be great to be able to continue<br>using the Gnocchi API for customer-facing features, as it works well with the OpenStack<br>object model, authentication and everything. But Gnocchi's TSDB is not the best<br>on the market. If it were replaced with Victoria Metrics, which provides a Prometheus API and<br>works amazingly well with Grafana, we would be able to gather and store metrics<br>with node/libvirt exporters and Prometheus doing remote writes to Victoria Metrics, and consume them via Grafana/AlertManager or<br>the Gnocchi API depending on the scenario.<br></div><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Ivan Romanko</div><div>Selectel<br></div></div></div></div>