[openstack-dev] [ceilometer] resource_metadata and metaquery

Doug Hellmann doug.hellmann at dreamhost.com
Thu Jan 24 17:41:21 UTC 2013


On Thu, Jan 24, 2013 at 7:28 AM, Sandy Walsh <sandy.walsh at rackspace.com> wrote:

> On 01/24/2013 05:52 AM, Julien Danjou wrote:
> > On Thu, Jan 24 2013, Sandy Walsh wrote:
> >
> >> This seems like a very inefficient schema requiring multiple
> >> sub-queries.
> >>
> >> Other than the naming, is it really any different than the current
> >> Metadata table when it comes to db performance?
> >
> > There's no metadata table currently, there's a metadata *column*.
> > You can't do any filtering request based on that current column in pure
> > SQL.
>
> Sorry, I should have said proposed metadata table.
>
> >> I think a better approach would be to offer different Metric types
> >> (extensible) which can control their own mapping to the db.
> >
> > I can't see how you can do that and support a large number of different
> > metric types while staying generic.
>
> I'll throw together a proposal. I think it can be done with a set of two
> extensions:
>
> 1. the parser part of the event consumer (please excuse my terminology
> abuse. "agent" perhaps?)
> 2. the database portion (which would need to deal with migration, CRUD
> and advanced query). The hard part.
>
> We would have to agree on a common format for passing these data
> structures around. Probably just some base attributes + "extra" bits. It
> would likely look like the metadata/dimension structure, but
> under-the-hood could be handled efficiently. This structure would also
> have a tag that would identify the "handler" needed to deal with it. A
> datatype name, if you will.
>
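
To make the "common format plus handler tag" idea concrete, here is a minimal sketch of how it might look. Everything in it is an assumption for illustration only: the `datatype`, `extra`, `timestamp` and `tenant` field names, the `register`/`dispatch` helpers and the `NovaHandler` class are hypothetical, not existing ceilometer code.

```python
# Hypothetical sketch: generic records carry a "datatype" tag that selects
# the handler responsible for mapping them onto a denormalized row.
HANDLERS = {}


def register(datatype):
    """Class decorator that registers one handler instance per datatype tag."""
    def decorator(cls):
        HANDLERS[datatype] = cls()
        return cls
    return decorator


@register("nova")
class NovaHandler(object):
    # Columns this (hypothetical) handler promotes out of the "extra" bits.
    columns = ("instance_id", "host")

    def to_row(self, record):
        extra = record.get("extra", {})
        # Base attributes shared by every datatype.
        row = {"timestamp": record["timestamp"], "tenant": record["tenant"]}
        # Datatype-specific attributes become real columns; the rest is dropped.
        row.update((column, extra.get(column)) for column in self.columns)
        return row


def dispatch(record):
    """Look up the handler named by the record's tag and let it map the record."""
    return HANDLERS[record["datatype"]].to_row(record)


record = {
    "datatype": "nova",
    "timestamp": "2013-01-24T17:41:21",
    "tenant": "t1",
    "extra": {"instance_id": "i-1", "host": "h1", "ignored": True},
}
row = dispatch(record)
```

The UI, API and aggregation layers would only ever see the generic `record` shape; only the handler knows which attributes end up as queryable columns.
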
> UI, API, aggregation, etc would all work with these generic data
> structures.
>
> Honestly I don't think there would be a whole lot of them. Likely, just
> one datatype per system (cinder, nova, quantum, etc).
>
> The aggregation system (aka multi-publisher) could listen for data types
> it's interested in for roll-ups.
>
> The potential downside is that we could end up with one "monster
> datatype" which is a most-common-denominator of all the important
> attributes across all systems (cinder, nova, quantum, etc). I think
> we're going to end up with one of these anyway once we get into the
> multi-publisher/aggregation layers. eg: "Instance" or "Tenant"
>
> I think I should do up a little video showing the type of db data
> structures we've found useful in StackTach. They're small, but
> non-trivial. It should really illustrate what multi-publisher is going
> to need.
>
> > But I think what you may want is to implement a dynamic SQL engine
> > backend that creates and indexes the columns you want to query on. That's a
> > solution, but we're trying to be generic with the default sqlalchemy
> > backend.
>
> Wouldn't the end effect be the same (without the large impact of an
> index creation hit on first request)? How would we police the growth of
> db indices?
>
> >> I'd be curious to see how the metadata table approach performs when you
> >> are querying on multiple keys (like Event Name + Cell + Host + Request
> >> ID, for example) with a large number of rows. Has anyone tried this?
> >
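
For anyone who wants to try the multi-key case, here is a rough sketch of what such a query looks like against a separate key/value metadata table. It uses the standard-library `sqlite3` module rather than SQLAlchemy to stay self-contained, and the `event` and `metadata` tables, their columns and the `find_events` helper are all made up for the example. The point it illustrates is that each extra key costs one more join against the metadata table.

```python
import sqlite3

# Hypothetical schema: one row per event, one row per metadata key/value pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE metadata (
        event_id INTEGER REFERENCES event(id),
        key TEXT,
        value TEXT
    );
    CREATE INDEX ix_metadata_key_value ON metadata (key, value);
""")
conn.executemany("INSERT INTO event VALUES (?, ?)", [
    (1, "compute.instance.create"),
    (2, "compute.instance.create"),
    (3, "compute.instance.delete"),
])
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", [
    (1, "cell", "c1"), (1, "host", "h1"),
    (2, "cell", "c1"), (2, "host", "h2"),
    (3, "cell", "c1"), (3, "host", "h1"),
])


def find_events(conn, name, **criteria):
    """Return ids of events with the given name matching every key/value pair."""
    joins, params = [], []
    for i, (key, value) in enumerate(sorted(criteria.items())):
        alias = "m%d" % i
        # One self-join on the metadata table per key being filtered on.
        joins.append(
            "JOIN metadata AS {a} ON {a}.event_id = e.id "
            "AND {a}.key = ? AND {a}.value = ?".format(a=alias)
        )
        params.extend([key, value])
    sql = "SELECT e.id FROM event AS e {} WHERE e.name = ? ORDER BY e.id".format(
        " ".join(joins)
    )
    params.append(name)
    return [row[0] for row in conn.execute(sql, params)]


matching_ids = find_events(conn, "compute.instance.create", cell="c1", host="h1")
```

With the sample rows above only event 1 matches. This says nothing about performance at scale; it only shows the shape of the query that would need to be benchmarked.
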
> > I don't think anyone has. This blueprint draft was just something we
> > talked about back then with Nick; we wrote down some ideas so we would
> > not forget them and would have something to discuss.
> >
> > The problem is that metadata are EAV and that plays badly with SQL (and
> > especially with SQL lowered down to basics thanks to ORM abstraction and
> > SQLAlchemy). It's not clear that splitting the metadata into another
> > table is going to be more efficient, even if data are indexed. It may be
> > faster to use SQL indexes to retrieve matching events as it is, and do
> > the final metadata filtering at application level (i.e. in
> > storage.impl_sqlalchemy).
>
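
The alternative described above (narrow the result set with indexed columns, then filter on metadata in the application) could be sketched roughly as follows. This is an assumption-laden illustration, not the code in storage.impl_sqlalchemy: it uses `sqlite3` from the standard library, and the `meter` table, its `counter_name` and `resource_metadata` columns and the `query_meters` helper are hypothetical.

```python
import json
import sqlite3

# Hypothetical table: metadata is stored as an opaque JSON string in one column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meter ("
    "id INTEGER PRIMARY KEY, counter_name TEXT, resource_metadata TEXT)"
)
conn.execute("CREATE INDEX ix_meter_counter_name ON meter (counter_name)")
conn.executemany("INSERT INTO meter VALUES (?, ?, ?)", [
    (1, "instance", json.dumps({"host": "h1", "cell": "c1"})),
    (2, "instance", json.dumps({"host": "h2", "cell": "c1"})),
    (3, "cpu", json.dumps({"host": "h1", "cell": "c1"})),
])


def query_meters(conn, counter_name, metaquery):
    """Yield ids of meters matching counter_name and every metaquery pair."""
    # Step 1: let SQL use its index on the regular column.
    cursor = conn.execute(
        "SELECT id, resource_metadata FROM meter "
        "WHERE counter_name = ? ORDER BY id",
        (counter_name,),
    )
    # Step 2: decode the metadata column and filter in Python.
    for meter_id, raw_metadata in cursor:
        metadata = json.loads(raw_metadata)
        if all(metadata.get(key) == value for key, value in metaquery.items()):
            yield meter_id


filtered_ids = list(query_meters(conn, "instance", {"host": "h1"}))
```

The trade-off is that every row surviving step 1 is shipped to the application and decoded, so this only pays off when the indexed columns are already selective.
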
> Yep, I agree EAV is bad, that's why I'm proposing a largely denormalized
> table for the raw/underlying data types. Something easily queried on,
> but extensible.
>
> >
> > As you said, that should probably be tested.
> >
> > FTR I've created a blueprint on this:
> >
> > https://blueprints.launchpad.net/ceilometer/+spec/sqlalchemy-metadata-query
> >
>
> Thanks. We (RAX) are likely to be using mongodb as our backend storage
> system as well. Perhaps there's merit in having a discussion about
> sticking with one or the other (sql vs no-sql)?
>
> Having one datatype per collection would certainly make things easier on
> #2 mentioned above (especially around the migration side).
>
> Thinking out loud: If we push the storage into the data type driver we
> could likely have different storage systems per data type? (not sure if
> that's a good thing or not)
>

When you say "one datatype per collection" do you mean one type of
measurement?

Doug


>
> -S
>