[openstack-dev] [tc][ceilometer] Some background on the gnocchi project

Eoghan Glynn eglynn at redhat.com
Tue Aug 12 09:54:15 UTC 2014



> >> Doesn't InfluxDB do the same?
> > InfluxDB stores timeseries data primarily.
> >
> > Gnocchi is intended to store strongly-typed OpenStack resource
> > representations (instance, images, etc.) in addition to providing
> > a means to access timeseries data associated with those resources.
> >
> > So to answer your question: no, IIUC, it doesn't do the same thing.
> 
> Ok, I think I'm getting closer on this.

Great!

> Thanks for the clarification. Sadly, I have more questions :)

Any time, Sandy :)
 
> Is this closer? "a metadata repo for resources (instances, images, etc)
> + an abstraction to some TSDB(s)"?

Somewhat closer (more clarification below on the metadata repository
aspect, and the completeness/authority of same).

> Hmm, thinking out loud ... if it's a metadata repo for resources, who is
> the authoritative source for what the resource is? Ceilometer/Gnocchi or
> the source service?

The source service is authoritative.

> For example, if I want to query instance power state do I ask ceilometer
> or Nova?

In that scenario, you'd ask nova.

If, on the other hand, you wanted to average out the CPU utilization
over all instances with a certain metadata attribute set (e.g. some
user metadata set by Heat that indicated membership of an autoscaling
group), then you'd ask ceilometer.
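
For concreteness, a rough sketch of that kind of query via
python-ceilometerclient (the credentials and the user-metadata key
here are purely illustrative):

  from ceilometerclient import client

  # illustrative credentials; normally sourced from the environment
  cc = client.get_client(2,
                         os_username='admin',
                         os_password='secret',
                         os_tenant_name='admin',
                         os_auth_url='http://keystone:5000/v2.0')

  # average cpu_util over all instances carrying a (hypothetical)
  # Heat-set user-metadata key marking autoscaling group membership
  stats = cc.statistics.list(
      meter_name='cpu_util',
      q=[{'field': 'metadata.user_metadata.scaling_group',
          'op': 'eq',
          'value': 'my_asg'}])
  if stats:
      print('average cpu_util: %s' % stats[0].avg)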

> Or is it metadata about the time-series data collected for that
> resource?

Both. But the focus of my preceding remarks was on the latter.

> In which case, I think most tsdb's have some sort of "series
> description" facilities.

Sure, and those should be used for metadata related directly to
the timeseries (granularity, retention etc.)
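
In gnocchi, that kind of per-series description would live in the
archive policy associated with each metric; roughly something like
the following (structure approximate, as the API is still settling):

  # sketch of an archive policy: aggregation granularity plus how
  # many points (i.e. the retention) to keep at each granularity
  archive_policy = {
      'name': 'low',
      'definition': [
          {'granularity': 60, 'points': 1440},    # 1-min data for a day
          {'granularity': 3600, 'points': 720},   # 1-hour data for a month
      ],
  }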

> I guess my question is, what makes this metadata unique and how
> would it differ from the metadata ceilometer already collects?

The primary difference from the way ceilometer currently stores
metadata is the avoidance of per-sample snapshots of resource
metadata (as stated in the initial mail on this thread).
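
Schematically, the contrast looks something like this (field names
abbreviated for illustration):

  # ceilometer's current metering store: every sample embeds a full
  # snapshot of the resource metadata
  sample = {
      'counter_name': 'cpu_util',
      'counter_volume': 42.0,
      'timestamp': '2014-08-12T09:00:00',
      'resource_id': 'inst-0001',
      'resource_metadata': {'display_name': 'web-1',   # repeated in
                            'flavor': 'm1.small'},     # *every* sample
  }

  # the gnocchi model: record the resource once, keep measures as
  # bare (timestamp, value) pairs
  resource = {'id': 'inst-0001',
              'display_name': 'web-1',
              'flavor': 'm1.small'}
  measures = [('2014-08-12T09:00:00', 42.0),
              ('2014-08-12T09:01:00', 38.5)]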
 
> Will it be using Glance, now that Glance is becoming a pure metadata repo?

No, we have no plans to use glance for this.

By becoming a pure metadata repo, presumably you mean this spec:

  https://github.com/openstack/glance-specs/blob/master/specs/juno/metadata-schema-catalog.rst

I don't see this on the glance roadmap for Juno:

  https://blueprints.launchpad.net/glance/juno 

so presumably the integration of graffiti and glance is still more
of a longer-term intent than a present-tense "becoming".

I'm totally open to correction on this by markwash and others,
but my reading of the debate around the recent change in glance's
mission statement was that the primary focus in the immediate
term was to expand into providing an artifact repository (for
artifacts such as Heat templates), while not *precluding* any
future expansion into also providing a metadata repository.

The fossil-record of that discussion is here:

  https://review.openstack.org/98002

> > Though of course these things are not a million miles from each
> > other, one is just a step up in the abstraction stack, having a
> > wider and more OpenStack-specific scope.
> 
> Could it be a generic timeseries service? Is it "openstack specific"
> because it uses stackforge/python/oslo?

No, I meant OpenStack-specific in terms of it understanding
something of the nature of OpenStack resources and their ownership
(e.g. instances, with some metadata, each being associated with a
user & tenant etc.)

Not OpenStack-specific in the sense that it takes dependencies from
oslo or stackforge.

As for using python: yes, gnocchi is implemented in python, like
much of the rest of OpenStack.  However, no, I don't think that
choice of implementation language makes it OpenStack-specific.

> I assume the rules and schemas will be data-driven (vs. hard-coded)?

Well, one of the ideas was to move away from the loosely typed
representation of resources in ceilometer (a metadata dict
containing whatever it contains), and instead to decide upfront on
the specific minimal information we need to store per resource
type.
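
So instead of a free-form dict, each resource type would get an
explicit minimal schema, along these lines (the attribute set is
purely illustrative, not a settled gnocchi schema):

  class Resource(object):
      # ownership is first-class, not buried in a metadata dict
      def __init__(self, id, user_id, project_id):
          self.id = id
          self.user_id = user_id
          self.project_id = project_id

  class Instance(Resource):
      # only the agreed minimal per-type attributes, decided upfront
      def __init__(self, id, user_id, project_id, flavor, host):
          super(Instance, self).__init__(id, user_id, project_id)
          self.flavor = flavor
          self.host = host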

> ... and since the ceilometer collectors already do the bridge work, is
> it a pre-packaging of definitions that target openstack specifically?

I'm not entirely sure of what you mean by the bridge work in
this context.

The ceilometer collector effectively acts as a concentrator, by
persisting the metering messages emitted by the other ceilometer
agents (i.e. the compute, central, & notification agents) to the
metering store.

These samples are stored by the collector pretty much as-is, so
there's no real bridging going on currently in the collector (in
the sense of mapping or transforming).

However, the collector is indeed the obvious hook point for
ceilometer to emit data to gnocchi.
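
Concretely, that could take the shape of an extra dispatcher
alongside the existing database one; a very rough sketch (the
gnocchi endpoint and payload shapes are illustrative, since that
API is still being designed):

  import json

  import requests

  from ceilometer import dispatcher

  class GnocchiDispatcher(dispatcher.Base):
      # forward each sample's measurement to gnocchi over REST
      def record_metering_data(self, data):
          for sample in data:
              url = ('http://gnocchi:8041/v1/resource/generic/%s'
                     '/metric/%s/measures'
                     % (sample['resource_id'], sample['counter_name']))
              body = [{'timestamp': sample['timestamp'],
                       'value': sample['counter_volume']}]
              requests.post(url, data=json.dumps(body),
                            headers={'Content-Type': 'application/json'})

      def record_events(self, events):
          pass  # events are out of scope for this sketch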

> (not sure about "wider and more specific")

I presume you're thinking oxymoron with "wider and more specific"?

I meant:

 * "wider" in the sense that it covers more ground than generic
   timeseries data storage

 * "more specific" in the sense that some of that extra ground is
   quite OpenStack-oriented

Does that make my intended meaning a bit clearer?

> Sorry if this was already hashed out in Atlanta.

Yes it was, at length in both the design sessions and the
"project pod".

(For background, the pod idea was new in Atlanta: effectively a big
fluid space with loosely assigned per-project tables and
informal/emergent scheduling of discussions ... kinda like the old
unconference track, but a bit more focused, and also open to
hearing about stuff "by osmosis").

Cheers,
Eoghan


