[tc][telemetry][gnocchi] The future of Gnocchi in OpenStack

Zane Bitter zbitter at redhat.com
Fri Aug 28 14:48:53 UTC 2020


On 28/08/20 8:36 am, Adrian Turjak wrote:
> Hey OpenStackers,
> 
> We're currently in the process of discussing what to do with OpenStack's 
> reliance on Gnocchi, and at present it is looking like we are most 
> likely to just fork it back under a new name (currently Farfalle to 
> stick with the pasta theme).
> 
> The discussion is mostly happening here:
> https://review.opendev.org/#/c/744592/
> 
> But for those running Gnocchi in prod, this is likely something you may 
> want to know about and we'd like to hear from you.
> 
> A bit of history: Gnocchi started off as a new backend for Ceilometer in 
> OpenStack, and eventually became the de facto API for telemetry samples 
> when that was removed from Ceilometer (as backed by MongoDB). Gnocchi 
> was eventually spun off outside of OpenStack, but still essentially 
> remained our API for telemetry despite not being an official part of 
> OpenStack anymore.

I think a large part of the issue here is that there are multiple 
reasons for wanting (small-t) telemetry from OpenStack, and historically 
(because of reasons) they have all been conflated into one Thing, with 
the result that sometimes one use case wins. At least 3 that I can think 
of are:

1) Monitoring the OpenStack infrastructure by the operator, including 
feeding into business processes like reporting, capacity planning &c.

2) Billing

3) Monitoring user resources by the user/application, either directly or 
via other OpenStack services like Heat or Senlin.


For the first, you just want to be able to dump data into a TSDB of the 
operator's choice. Since all of the reporting requirements are 
business-specific anyway, it's up to the operator to decide how they 
want to store the data and how they want to interact with it. It appears 
that this may have been the theory behind the Gnocchi split.

On the other hand, for the third one you really need an official 
OpenStack API with all of the attendant stability guarantees, because 
it is part of OpenStack's user interface.

The second lands somewhere in between; AIUI CloudKitty is written to 
support multiple back-ends, with OpenStack Telemetry being the primary 
one. So it needs a fairly stable API because it's consumed by other 
OpenStack projects, but it's ultimately operator-facing.


As I have argued before, when we are thinking about road maps we need to 
think of these as different use cases, and they're different enough that 
they are probably best served by at least two separate tools.

Mohammed has made a compelling argument in the past that Prometheus is 
more or less the industry standard for the first use case, and we should 
just export metrics to that directly in the OpenStack services, rather 
than going through the Ceilometer collector.

I don't know what should be done about the third, but I do know that 
currently Telemetry is breaking Heat's gate and people are seriously 
discussing disabling the Telemetry-related tests, which I assume would 
mean deprecating the resources. Monasca offers an alternative, but isn't 
preferred by some distributors and operators because it brings the 
whole Java ecosystem along for the ride (managing the Python one is 
already hard enough).

cheers,
Zane.



