[openstack-dev] [ceilometer] resource_metadata and metaquery

Sandy Walsh sandy.walsh at rackspace.com
Thu Jan 24 18:37:14 UTC 2013



On 01/24/2013 01:41 PM, Doug Hellmann wrote:
> 
> 
> On Thu, Jan 24, 2013 at 7:28 AM, Sandy Walsh <sandy.walsh at rackspace.com> wrote:
> 
>     On 01/24/2013 05:52 AM, Julien Danjou wrote:
>     > On Thu, Jan 24 2013, Sandy Walsh wrote:
>     >
>     >> This seems like a very inefficient schema requiring multiple
>     >> sub-queries.
>     >>
>     >> Other than the naming, is it really any different than the current
>     >> Metadata table when it comes to db performance?
>     >
>     > There's no metadata table currently, there's a metadata *column*.
>     > You can't do any filtering on that column in pure SQL.
> 
>     Sorry, I should have said proposed metadata table.
> 
>     >> I think a better approach would be to offer different Metric types
>     >> (extensible) which can control their own mapping to the db.
>     >
>     > I can't see how you can do that and still support a large number of
>     > different metric types while staying generic.
> 
>     I'll throw together a proposal. I think it can be done with a set of two
>     extensions:
> 
>     1. the parser part of the event consumer (please excuse my terminology
>     abuse. "agent" perhaps?)
>     2. the database portion (which would need to deal with migration, CRUD
>     and advanced query). The hard part.
> 
>     We would have to agree on a common format for passing these data
>     structures around. Probably just some base attributes + "extra" bits. It
>     would likely look like the metadata/dimension structure, but
>     under-the-hood could be handled efficiently. This structure would also
>     have a tag that would identify the "handler" needed to deal with it. A
>     datatype name, if you will.
> 
>     UI, API, aggregation, etc would all work with these generic data
>     structures.
> 
>     Honestly I don't think there would be a whole lot of them. Likely, just
>     one datatype per system (cinder, nova, quantum, etc).
> 
>     The aggregation system (aka multi-publisher) could listen for data types
>     it's interested in for roll-ups.
> 
>     The potential downside is that we could end up with one "monster
>     datatype" which is a most-common-denominator of all the important
>     attributes across all systems (cinder, nova, quantum, etc). I think
>     we're going to end up with one of these anyway once we get into the
>     multi-publisher/aggregation layers. eg: "Instance" or "Tenant"
> 
>     I think I should do up a little video showing the type of db data
>     structures we've found useful in StackTach. They're small, but
>     non-trivial. It should really illustrate what multi-publisher is going
>     to need.
> 
>     > But I think that what you may want is to implement a dynamic SQL engine
>     > backend that creates and indexes the columns you want to query on.
>     > That's a solution, but we're trying to be generic with the default
>     > sqlalchemy backend.
> 
>     Wouldn't the end effect be the same (without the large impact of an
>     index creation hit on first request)? How would we police the growth of
>     db indices?
> 
>     >> I'd be curious to see how the metadata table approach performs when
>     >> you are querying on multiple keys (like Event Name + Cell + Host +
>     >> Request ID, for example) with a large number of rows. Has anyone
>     >> tried this?
>     >
>     > I don't think anyone did. This blueprint draft was just something we
>     > talked about back then with Nick; we wrote down some ideas so as not
>     > to forget them and to have something to discuss.
>     >
>     > The problem is that metadata are EAV, and that plays badly with SQL
>     > (especially with SQL reduced to the basics by the ORM abstraction and
>     > SQLAlchemy). It's not clear that splitting the metadata into another
>     > table is going to be more efficient, even if the data are indexed. It
>     > may be faster to use SQL indexes to retrieve matching events as it is,
>     > and do the final metadata filtering at application level (i.e. in
>     > storage.impl_sqlalchemy).
> 
>     Yep, I agree EAV is bad, that's why I'm proposing a largely denormalized
>     table for the raw/underlying data types. Something easily queried on,
>     but extensible.
> 
>     >
>     > As you said, that should probably be tested.
>     >
>     > FTR I've created a blueprint on this:
>     >
>     >
>     > https://blueprints.launchpad.net/ceilometer/+spec/sqlalchemy-metadata-query
>     >
> 
>     Thanks. We (RAX) are likely to be using mongodb as our backend storage
>     system as well. Perhaps there's merit in having a discussion about
>     sticking with one or the other (sql vs no-sql)?
> 
>     Having one datatype per collection would certainly make things easier on
>     #2 mentioned above (especially around the migration side).
> 
>     Thinking out loud: If we push the storage into the data type driver we
>     could likely have different storage systems per data type? (not sure if
>     that's a good thing or not)
> 
> 
> When you say "one datatype per collection" do you mean one type of
> measurement?

Yes. Sorry if I'm abusing the terminology here (not covered in
http://docs.openstack.org/developer/ceilometer/glossary.html )

Reading that paragraph again, I think I could have said it better. I was
trying to say that having a no-sql schema would make things easier all
around.

But to that point, each new data type (a metric? a measure? a counter?)
would have its own driver associated with it and would be stored in mongo
under a separate collection. Certainly joins would be costly. The data
types could also go under different keys in a single collection.

(Most likely no one would deploy in that fashion; I'm just thinking
ahead a little to where the shard key would depend on what's
important in the data type.)
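
To make the "one driver and one collection per data type" idea a bit
more concrete, here is a rough Python sketch. Everything in it is
hypothetical and for illustration only: the names (DataTypeDriver,
register, dispatch), the choice of base attributes, and the in-memory
dict standing in for mongo collections are my assumptions, not
ceilometer code.

```python
from collections import defaultdict

# Assumption: an in-memory stand-in for a document store,
# mapping collection name -> list of stored documents.
COLLECTIONS = defaultdict(list)

# Assumption: registry mapping a datatype tag -> driver instance.
DRIVERS = {}

# Assumption: the "base attributes" every generic structure carries.
BASE_ATTRIBUTES = ("datatype", "timestamp", "resource_id")


class DataTypeDriver:
    """Owns the storage mapping for one data type (illustrative only)."""

    datatype = None    # tag that identifies the handler
    indexed_keys = ()  # "extra" keys this type wants to query on directly

    def to_document(self, event):
        # Base attributes stay at the top level of the document.
        doc = {key: event[key] for key in BASE_ATTRIBUTES}
        extra = dict(event.get("extra", {}))
        # Promote the keys this data type cares about out of "extra",
        # giving a denormalized, easily queried document.
        for key in self.indexed_keys:
            if key in extra:
                doc[key] = extra.pop(key)
        doc["extra"] = extra
        return doc

    def store(self, event):
        # One collection per data type.
        COLLECTIONS[self.datatype].append(self.to_document(event))

    def query(self, **filters):
        # Multi-key equality filter over this data type's collection only.
        return [doc for doc in COLLECTIONS[self.datatype]
                if all(doc.get(k) == v for k, v in filters.items())]


def register(driver_cls):
    """Class decorator: register one driver instance under its tag."""
    DRIVERS[driver_cls.datatype] = driver_cls()
    return driver_cls


@register
class NovaDriver(DataTypeDriver):
    datatype = "nova"
    indexed_keys = ("event_name", "cell", "host", "request_id")


@register
class CinderDriver(DataTypeDriver):
    datatype = "cinder"
    indexed_keys = ("event_name", "volume_id")


def dispatch(event):
    """Route a generic structure to the driver named by its datatype tag."""
    DRIVERS[event["datatype"]].store(event)


dispatch({
    "datatype": "nova",
    "timestamp": "2013-01-24T18:37:14",
    "resource_id": "instance-1",
    "extra": {"event_name": "compute.instance.create.end", "cell": "c1",
              "host": "h1", "request_id": "req-1", "flavor": "m1.small"},
})
matches = DRIVERS["nova"].query(cell="c1", host="h1", request_id="req-1")
```

The UI, API and aggregation layers would only ever see the generic
dict; the driver decides which keys are promoted and, in a real
deployment, which of them would be indexed or used as the shard key.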

-S



> Doug
>  
> 
> 
>     -S
> 
>     _______________________________________________
>     OpenStack-dev mailing list
>     OpenStack-dev at lists.openstack.org
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


