[openstack-dev] [Ceilometer][Gnocchi] question on integration with time-series databases

Patrick Petit ppetit at mirantis.com
Thu Jun 18 09:17:17 UTC 2015


On 18 Jun 2015 at 04:44:18, gordon chung (gord at live.ca) wrote:


On 17/06/2015 12:57 PM, Chris Dent wrote: 
> On Tue, 16 Jun 2015, Simon Pasquier wrote: 
> 
>> I'm still struggling to see how these optimizations would be implemented 
>> since the current Gnocchi design has separate backends for indexing and 
>> storage which means that datapoints (id + timestamp + value) and metric 
>> metadata (tenant_id, instance_id, server group, ...) are stored into 
>> different places. I'd be interested to hear from the Gnocchi team how 
>> this is going to be tackled. For instance, does it imply modifications 
>> or extensions to the existing Gnocchi API? 
> 
> I think there are three things to keep in mind: 
> 
> a) The plan is to figure it out and make it work well, "production 
> ready" even. That will require some iteration. At the moment the 
> overlap between InfluxDB python driver maturity and someone-to-do-the- 
> work is not great. When it is I'm sure the full variety of 
> optimizations will be explored, with actual working code and test 
> cases. 

just curious but what bugs are we waiting on for the influxdb driver? 
i'm hoping Paul Dix has prioritised them? 

> 
> b) Gnocchi has separate _interfaces_ for indexing and storage. This 
> is not the same as having separate _backends_[1]. If it turns out 
> that the right way to get InfluxDB working is for it to be the 
> same backend to the two separate interfaces then that will be 
> okay. 

i'll straddle the middle line here and say i think we need to wait for a 
viable driver before we can start making the appropriate adjustments. 
having said that, once we have the gaps resolved, i think we 
should make every effort to conform to the rules of the db (whether it is 
influxdb, kairosdb, opentsdb). we faced a similar issue with the 
previous data storage design where we generically applied a design for 
one driver across all drivers and that led to terribly inefficient 
design everywhere. 

I'd like to emphasise that using the same backend for both the data-point time-series and the identification of the resources linked to those time-series is not only the right way, it is the mandatory way. The most salient reason is that we should not require other applications consuming time-series produced through Gnocchi to use anything other than the time-series backend's native API. Operators who want to use InfluxDB, OpenTSDB or something else as their time-series backend do so for a reason, and the choice of an API that best suits their needs is key to that decision.

It is also a question of effectiveness. Plenty of applications out there, such as Grafana, plug into those time-series backends out of the box. I don’t think we want to force those applications to use the Gnocchi API instead.
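
As a rough illustration of that point (a sketch only, not taken from Gnocchi or from any driver discussed in this thread): if the resource metadata travels with each datapoint as tags, a native client of the time-series backend can filter on it without going through the Gnocchi API. The example below formats a point in InfluxDB's line protocol. The helper names `_escape` and `to_line`, the measurement name `cpu_util` and the field name `value` are assumptions made for the example; the tag names come from the metadata mentioned earlier in the thread.

```python
def _escape(text, chars):
    """Backslash-escape each character of `chars` found in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, value, timestamp_ns):
    """Format one datapoint, with its metadata as tags, as a line-protocol string.

    Hypothetical helper: tags are sorted by key so the output is deterministic,
    and the sample is written as a single float field named "value".
    """
    head = _escape(measurement, ", ")
    tag_part = ",".join(
        "{}={}".format(_escape(str(k), ", ="), _escape(str(v), ", ="))
        for k, v in sorted(tags.items())
    )
    if tag_part:
        head += "," + tag_part
    return "{} value={} {}".format(head, float(value), timestamp_ns)


# Metadata and datapoint end up in the same record.
print(to_line("cpu_util",
              {"tenant_id": "t1", "instance_id": "i-42"},
              0.64,
              1434619037000000000))
# cpu_util,instance_id=i-42,tenant_id=t1 value=0.64 1434619037000000000
```

With the metadata stored this way, a dashboard could select series by `tenant_id` or `instance_id` directly in the backend's own query language, which is the behaviour argued for above.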

 - Patrick



> 
> c) The future is unknown and the present is not made of stone. There 
> could be modifications and extensions to the existing stuff. We 
> don't know. Yet. 
> 
> [1] Yes the existing implementations use SQL for the indexer and 
> various subclasses of the carbonara abstraction as two backends 
> for the two interfaces. That's an accident of history not a design 
> requirement. 
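
To make point (b) and the footnote concrete, here is a minimal sketch of two separate interfaces served by one backend object. It is not Gnocchi's actual class hierarchy: `Indexer`, `Storage`, `SingleStoreBackend` and all method names are hypothetical, and an in-memory dict stands in for a real time-series database.

```python
from abc import ABC, abstractmethod


class Indexer(ABC):
    """Hypothetical indexing interface: metric id -> resource metadata."""

    @abstractmethod
    def index_metric(self, metric_id, metadata): ...

    @abstractmethod
    def find_metrics(self, **criteria): ...


class Storage(ABC):
    """Hypothetical storage interface: (timestamp, value) pairs per metric."""

    @abstractmethod
    def add_measure(self, metric_id, timestamp, value): ...

    @abstractmethod
    def get_measures(self, metric_id): ...


class SingleStoreBackend(Indexer, Storage):
    """One backend implementing both interfaces (in-memory stand-in)."""

    def __init__(self):
        # metric_id -> {"metadata": dict, "points": list of (timestamp, value)}
        self._series = {}

    def _entry(self, metric_id):
        return self._series.setdefault(metric_id, {"metadata": {}, "points": []})

    def index_metric(self, metric_id, metadata):
        self._entry(metric_id)["metadata"].update(metadata)

    def find_metrics(self, **criteria):
        # Return ids of metrics whose metadata matches every criterion.
        return sorted(
            metric_id
            for metric_id, entry in self._series.items()
            if all(entry["metadata"].get(k) == v for k, v in criteria.items())
        )

    def add_measure(self, metric_id, timestamp, value):
        self._entry(metric_id)["points"].append((timestamp, value))

    def get_measures(self, metric_id):
        return sorted(self._series[metric_id]["points"])


# Callers still depend on two interfaces; both happen to be the same object.
backend = SingleStoreBackend()
indexer: Indexer = backend
storage: Storage = backend
```

Code written against `Indexer` and `Storage` does not need to know whether they are backed by one store or two, so swapping the SQL-plus-carbonara pairing for a single store would not by itself force an API change.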

-- 
gord 


__________________________________________________________________________ 
OpenStack Development Mailing List (not for usage questions) 
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe 
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev 