[openstack-dev] [ceilometer] resource_metadata and metaquery
Monty Taylor
mordred at inaugust.com
Thu Jan 24 22:29:16 UTC 2013
On 01/25/2013 09:09 AM, Doug Hellmann wrote:
>
>
> On Thu, Jan 24, 2013 at 1:37 PM, Sandy Walsh
> <sandy.walsh at rackspace.com> wrote:
>
>
>
> On 01/24/2013 01:41 PM, Doug Hellmann wrote:
> >
> >
> > On Thu, Jan 24, 2013 at 7:28 AM, Sandy Walsh
> > <sandy.walsh at rackspace.com> wrote:
> >
> > On 01/24/2013 05:52 AM, Julien Danjou wrote:
> > > On Thu, Jan 24 2013, Sandy Walsh wrote:
> > >
> > >> This seems like a very inefficient schema requiring multiple
> > sub-queries.
> > >>
> > >> Other than the naming, is it really any different than the
> current
> > >> Metadata table when it comes to db performance?
> > >
> > > There's no metadata table currently, there's a metadata
> *column*.
> > > You can't do any filtering request based on that current
> column in
> > pure
> > > SQL.
> >
> > Sorry, I should have said proposed metadata table.
> >
> > >> I think a better approach would be to offer different
> Metric types
> > >> (extensible) which can control their own mapping to the db.
> > >
> > > I can't see how you can do that and support a large number of
> > > different metric types while staying generic.
> >
> > I'll throw together a proposal. I think it can be done with a
> set of two
> > extensions:
> >
> > 1. the parser part of the event consumer (please excuse my
> terminology
> > abuse. "agent" perhaps?)
> > 2. the database portion (which would need to deal with
> migration, CRUD
> > and advanced query). The hard part.
> >
> > We would have to agree on a common format for passing these data
> > structures around. Probably just some base attributes +
> "extra" bits. It
> > would likely look like the metadata/dimension structure, but
> > under-the-hood could be handled efficiently. This structure
> would also
> > have a tag that would identify the "handler" needed to deal
> with it. A
> > datatype name, if you will.
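
A minimal sketch of the tagged structure described above, assuming a plain dict with base attributes, an "extra" dict, and a "datatype" tag that selects a handler. Every name here (HANDLERS, handler, dispatch, the "nova.instance" tag, the field names) is hypothetical and chosen for illustration; none of it is existing ceilometer code.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: datatype tag -> function that knows how to store
# records of that type. Not part of ceilometer.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {}


def handler(datatype: str):
    """Register a function as the storage handler for one datatype tag."""
    def register(func):
        HANDLERS[datatype] = func
        return func
    return register


@handler("nova.instance")
def store_instance(record):
    # A real handler would map 'extra' onto indexed columns of its own
    # table; here we only build the row it would write.
    extra = record["extra"]
    return {"table": "nova_instance",
            "host": extra.get("host"),
            "cell": extra.get("cell")}


def dispatch(record):
    """Route a generic record to its handler using the 'datatype' tag."""
    datatype = record["datatype"]
    if datatype not in HANDLERS:
        raise LookupError("no handler for datatype %r" % datatype)
    return HANDLERS[datatype](record)


record = {
    "datatype": "nova.instance",        # tag naming the handler
    "resource_id": "instance-0001",     # base attributes
    "timestamp": "2013-01-24T22:29:16Z",
    "extra": {"host": "compute-12", "cell": "c0001"},  # the "extra" bits
}
row = dispatch(record)
```

The UI, API and aggregation layers would only ever see the generic dict; the per-datatype mapping to the database stays inside the handler.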
> >
> > UI, API, aggregation, etc would all work with these generic data
> > structures.
> >
> > Honestly I don't think there would be a whole lot of them.
> Likely, just
> > one datatype per system (cinder, nova, quantum, etc).
> >
> > The aggregation system (aka multi-publisher) could listen for
> data types
> > it's interested in for roll-ups.
> >
> > The potential downside is that we could end up with one "monster
> > datatype" which is a most-common-denominator of all the important
> > attributes across all systems (cinder, nova, quantum, etc). I
> think
> > we're going to end up with one of these anyway once we get
> into the
> > multi-publisher/aggregation layers. eg: "Instance" or "Tenant"
> >
> > I think I should do up a little video showing the type of db data
> > structures we've found useful in StackTach. They're small, but
> > non-trivial. It should really illustrate what multi-publisher
> is going
> > to need.
> >
> > > But I think what you may want is to implement a dynamic SQL engine
> > > backend that creates and indexes the columns you want to query on.
> > > That's a solution, but we're trying to be generic with the default
> > > sqlalchemy backend.
> >
> > Wouldn't the end effect be the same (without the large impact
> of an
> > index creation hit on first request)? How would we police the
> growth of
> > db indices?
> >
> > >> I'd be curious to see how the metadata table approach performs
> > when you
> > >> are querying on multiple keys (like Event Name + Cell + Host +
> > Request
> > >> ID, for example) with a large number of rows. Has anyone
> tried this?
> > >
> > > I don't think someone did. This blueprint draft was just
> something we
> > > talked about back then with Nick and we wrote some ideas to
> not forget
> > > it and have some things to discuss.
> > >
> > > The problem is that metadata are EAV, and that plays badly with SQL
> > > (especially with SQL lowered down to basics thanks to ORM
> > > abstraction and SQLAlchemy). It's not clear that splitting the
> > > metadata into another table is going to be more efficient, even if
> > > the data are indexed. It may be faster to use SQL indexes to
> > > retrieve matching events as it is, and do the final metadata
> > > filtering at application level (i.e. in storage.impl_sqlalchemy).
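
To make the two options in the paragraph above concrete, here is a small sketch using the standard-library sqlite3 module. The schema (event, event_metadata, the index names, the sample rows) is invented for the example and is not ceilometer's actual schema. Option A queries a separate EAV metadata table in pure SQL, which needs one join per metadata key; option B uses a SQL index only for the event name and filters the serialized metadata column in the application.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT, metadata TEXT);
    CREATE TABLE event_metadata (event_id INTEGER, key TEXT, value TEXT);
    CREATE INDEX ix_event_name ON event (name);
    CREATE INDEX ix_metadata_kv ON event_metadata (key, value);
""")

events = [
    (1, "compute.instance.create", {"host": "compute-1", "cell": "a"}),
    (2, "compute.instance.create", {"host": "compute-2", "cell": "a"}),
    (3, "compute.instance.delete", {"host": "compute-1", "cell": "b"}),
]
for event_id, name, metadata in events:
    # Store the metadata twice: as a JSON column and as EAV rows.
    conn.execute("INSERT INTO event VALUES (?, ?, ?)",
                 (event_id, name, json.dumps(metadata)))
    conn.executemany("INSERT INTO event_metadata VALUES (?, ?, ?)",
                     [(event_id, k, v) for k, v in metadata.items()])

# Option A: pure SQL against the EAV table, one join per metadata key.
eav_ids = [row[0] for row in conn.execute("""
    SELECT e.id FROM event e
    JOIN event_metadata m1 ON m1.event_id = e.id
         AND m1.key = 'host' AND m1.value = 'compute-1'
    JOIN event_metadata m2 ON m2.event_id = e.id
         AND m2.key = 'cell' AND m2.value = 'a'
    WHERE e.name = 'compute.instance.create'
    ORDER BY e.id
""")]

# Option B: let SQL narrow by the indexed column, then filter the
# metadata column at application level.
wanted = {"host": "compute-1", "cell": "a"}
app_ids = []
for event_id, raw in conn.execute(
        "SELECT id, metadata FROM event WHERE name = ? ORDER BY id",
        ("compute.instance.create",)):
    metadata = json.loads(raw)
    if all(metadata.get(k) == v for k, v in wanted.items()):
        app_ids.append(event_id)
```

Both options select the same event here. Which one is faster on a large table is exactly the open question in this thread, and this sketch does not answer it.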
> >
> > Yep, I agree EAV is bad, that's why I'm proposing a largely
> denormalized
> > table for the raw/underlying data types. Something easily
> queried on,
> > but extensible.
> >
> > >
> > > As you said, that should probably be tested.
> > >
> > > FTR I've created a blueprint on this:
> > >
> > >
> >
> https://blueprints.launchpad.net/ceilometer/+spec/sqlalchemy-metadata-query
> > >
> >
> > Thanks. We (RAX) are likely to be using mongodb as our backend
> storage
> > system as well. Perhaps there's merit in having a discussion about
> > sticking with one or the other (sql vs no-sql)?
> >
> > Having one datatype per collection would certainly make things
> easier on
> > #2 mentioned above (especially around the migration side).
> >
> > Thinking out loud: If we push the storage into the data type
> driver we
> > could likely have different storage systems per data type?
> (not sure if
> > that's a good thing or not)
> >
> >
> > When you say "one datatype per collection" do you mean one type of
> > measurement?
>
> Yes. Sorry if I'm abusing the terminology here (not covered in
> http://docs.openstack.org/developer/ceilometer/glossary.html )
>
> Reading that paragraph again, I think I could have said it better. I was
> trying to say that having a no-sql schema would make things easier all
> around.
>
>
> Yeah, I've found that to be the case, too. We've had some people express
> reluctance to deploy on anything other than MySQL, though, so we're
> trying to support SQL as well.
MongoDB is a real problem in some situations. That's not a technical
judgement on Mongo itself, but the AGPL is untouchable for some folks.
OTOH, if you run into MySQL issues, I might know some people who know
a lot about it. :)
> But to that point, each new data type (a metric? a measure? a counter?)
>
>
> The terminology confusion is definitely an issue, but the fault is ours,
> not yours. Angus wasn't around to help with naming things at that point
> in the project, so we can blame him. :-)
>
> When you query the meter API you get back individual measurements that
> are called "samples" now, so let's use those terms (meter == the name of
> the thing measured and sample == the measurement). As we finish up the
> V2 API I expect that we'll update the glossary.
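
A tiny sketch of the terminology above, assuming a sample is a record that carries the name of its meter plus one measured value. The Sample class and its field names are made up for illustration and are not the V2 API's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    meter: str        # meter == the name of the thing measured
    volume: float     # sample == one measurement of that meter
    resource_id: str


samples = [
    Sample("cpu_util", 12.5, "instance-0001"),
    Sample("cpu_util", 40.0, "instance-0002"),
    Sample("disk.read.bytes", 4096.0, "instance-0001"),
]

# Group the individual samples under the meter they belong to.
by_meter = defaultdict(list)
for sample in samples:
    by_meter[sample.meter].append(sample.volume)
```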
>
>
> would have its own driver associated with it and get stored in mongo
> under a separate collection. Certainly joins would be costly. They
> could go under different keys in a single collection too.
>
>
> What does that buy us? Does it make the indexing more efficient somehow,
> if the records all have more or less the same schema?
>
> Doug
>
>
>
> (it's most likely no one would deploy in that fashion, just thinking
> ahead a little where the shard key would be dependent on what's
> important in the data type)
>
>
> -S
>
>
>
> > Doug
> >
> >
> >
> > -S
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev