[openstack-dev] [nova][ceilometer] model for ceilo/nova interaction going forward
Sandy Walsh
sandy.walsh at RACKSPACE.COM
Thu Nov 22 23:23:32 UTC 2012
Sorry for the top-post, but there are so many threads on this topic it's nearly impossible to keep track. So please, allow me to repeat myself here.
I've outlined a variety of proposals for solving this problem using the existing infrastructure. There's not a large departure needed from where we are today.
In a nutshell, for lifecycle/metering purposes:
1. The existing notifiers are sufficient (they don't have to use AMQP)
2. The notification event should be structured with optional components and versioned accordingly.
3. The ceilometer worker pulls in much of the nova common rpc baggage, which is unnecessary (notifications are not rpc calls). Look at the StackTach worker for an example of something lean and mean.
4. The existing usage notifications for bandwidth and state seem fine (I haven't heard any use-cases they fail to cover). Any exceptions I've heard have been instrumentation or user-space based.
5. There is a hook in the virt layer for getting info out. We could easily extend this to include optional drivers from ceilometer for customer specific needs.
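To illustrate point 2, here is a rough sketch of a notification event with a version field and optional components. The layout is an assumption made for illustration, not the actual nova notification format: build_event, parse_event, the "version" key and the "bandwidth" section are all hypothetical names.

```python
import json

# Hypothetical envelope: three required keys, everything else optional.
REQUIRED_KEYS = ("event_type", "version", "payload")


def build_event(event_type, payload, version="1.0", **optional):
    """Build an event dict, adding optional components only when supplied."""
    event = {"event_type": event_type, "version": version, "payload": payload}
    # Skip optional sections that were passed as None.
    event.update({k: v for k, v in optional.items() if v is not None})
    return event


def parse_event(raw, supported_major=1):
    """Decode a JSON event and reject unknown major versions."""
    event = json.loads(raw)
    missing = [k for k in REQUIRED_KEYS if k not in event]
    if missing:
        raise ValueError("missing keys: %s" % ", ".join(missing))
    major = int(str(event["version"]).split(".")[0])
    if major != supported_major:
        raise ValueError("unsupported version %s" % event["version"])
    return event


if __name__ == "__main__":
    wire = json.dumps(build_event(
        "compute.instance.exists",
        {"instance_id": "abc"},
        bandwidth={"public": {"bw_in": 10, "bw_out": 20}},
    ))
    print(parse_event(wire))
```

A consumer that only cares about lifecycle data can ignore the optional sections, and a bump in the major version gives it a clean point to refuse events it does not understand.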
It just strikes me that this molehill has reached mountain status.
-S
________________________________________
From: Mark McLoughlin [markmc at redhat.com]
Sent: Thursday, November 22, 2012 3:33 PM
To: Eoghan Glynn
Cc: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [nova][ceilometer] model for ceilo/nova interaction going forward
On Thu, 2012-11-22 at 13:13 -0500, Eoghan Glynn wrote:
> > Apparently there was some consensus in the thread around making the
> > nova.virt.driver interface and its implementations public and having
> > ceilometer consume that. I'm not really seeing that consensus reading
> > back.
>
> My read was that the consensus cohered in the follow-up discussion
> during the nova IRC meeting[1].
Ah, ok - we really should get into the habit of posting summaries of
discussion like that to the relevant thread. I'm as guilty of not doing
that as anyone.
> > Honestly, after looking at what parts of Nova code Ceilometer
> > currently uses, that approach looks like an awfully big
> > hammer.
>
> K, let's think about a less weighty hammer.
>
> > AFAICT, Ceilometer currently does some pretty simple and
> > generic querying of libvirt for CPU, disk and NIC information.
>
> Yes, exactly. But we want to be able to do it in the non-libvirt
> cases too. And all without copying in a rake of nova code.
Understood.
> > Even if we stick with this approach of having an agent that talks to
> > libvirt, is it really such a huge deal to have code in both Nova and
> > Ceilometer that does this?
>
> Well the way ceilo does it now (just picking up whatever version
> of nova is available) is turning out to be a real headache keeping
> up with internal mods to nova - the ceilo trunk regularly breaks
> because some change in nova-land has an unforeseen impact on ceilo
> (which is understandable, seeing as this code was never intended to
> be externally consumable).
Understood.
> So to address that instability, are you suggesting we take a big ol'
> copy of some version (say 2012.2) of the nova virt code, prune it down
> to the bits we need, then dump that into the ceilo repo?
>
> Is the virt code sufficiently isolated within nova to do that?
> e.g. say some bit of the nova utils or config code changes, wouldn't
> that potentially have a knock-on impact?
>
> Is it OK that adding a new driver type in nova would require ceilo
> to notice and pick this up & distill out the bits needed?
No, that's not what I'm suggesting.
Take the DiskIOPollster stuff as an example:
    conn = driver.load_compute_driver()
    disks = conn.get_disks(instance_name)
    for disk in disks:
        stats = conn.block_stats(instance_name, disk)
The Nova code ceilometer is re-using is the code to connect to libvirt,
code to call conn.lookupByName() and domain.XMLDesc() to query libvirt
for the XML describing a VM, code to use xpath to extract the disk
descriptions from the XML and then code to call domain.blockStats() to
get stats about the disk.
I really don't see much issue with ceilometer just doing those few
things directly. The only private implementation assumption you'd be
making about the Nova libvirt driver is that the libvirt VM name is the
same as the instance name.
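Those few steps might look roughly like the following. This is a minimal sketch, not ceilometer's code: the function names and the keys of the returned dict are made up for illustration, and conn is assumed to be an already-open libvirt connection (for example from libvirt.openReadOnly()). It uses ElementTree path matching in place of a full xpath library.

```python
import xml.etree.ElementTree as ET


def get_disks(domain):
    """Return the target device names (e.g. 'vda') found in the domain XML."""
    tree = ET.fromstring(domain.XMLDesc(0))
    targets = tree.findall("./devices/disk/target")
    return [t.get("dev") for t in targets if t.get("dev")]


def disk_io_stats(conn, instance_name):
    """Collect block stats per disk, assuming VM name == instance name."""
    domain = conn.lookupByName(instance_name)
    stats = {}
    for dev in get_disks(domain):
        # libvirt returns a 5-tuple of counters for each block device.
        rd_req, rd_bytes, wr_req, wr_bytes, errs = domain.blockStats(dev)
        stats[dev] = {
            "read_requests": rd_req,
            "read_bytes": rd_bytes,
            "write_requests": wr_req,
            "write_bytes": wr_bytes,
            "errors": errs,
        }
    return stats
```

Because the functions only rely on lookupByName(), XMLDesc() and blockStats(), they can be exercised against a stub connection without a hypervisor.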
You'd be right to be concerned that you'd need to implement that same
code for other hypervisors, but here's the thing - the get_disks() and
block_stats() methods in the libvirt virt driver aren't part of the virt
driver abstraction. Other drivers neither implement them, nor do we have
common data structures for the values they'd return. In other words, the
abstraction layer with multiple implementations doesn't yet exist.
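For comparison, an abstraction of that kind would need both a common interface and a common data structure for the values returned. A purely hypothetical sketch follows; DiskStats, Inspector and StaticInspector are illustrative names and none of this exists in Nova.

```python
import abc
from collections import namedtuple

# Hypothetical hypervisor-neutral record for one disk's counters.
DiskStats = namedtuple(
    "DiskStats",
    "device read_requests read_bytes write_requests write_bytes errors",
)


class Inspector(abc.ABC):
    """Hypothetical per-hypervisor interface for polling disk stats."""

    @abc.abstractmethod
    def inspect_disks(self, instance_name):
        """Return a list of DiskStats for the named instance."""


class StaticInspector(Inspector):
    """Toy implementation backed by a dict, standing in for a real driver."""

    def __init__(self, data):
        self._data = data

    def inspect_disks(self, instance_name):
        return list(self._data.get(instance_name, []))
```

Each hypervisor would then supply its own Inspector subclass, and the polling code would depend only on DiskStats.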
> > Put it another way, if you made a standalone generic library for
> > doing just this piece, Nova probably wouldn't bother using it since
> > so little code is involved.
>
> OK, there seemed to be a willingness for nova to use it expressed in
> that IRC discussion, but yeah I take your point that it's a relatively
> small piece.
You know what's weird? The get_disks() and block_stats() methods appear
unused in Nova.
> > I think I've the same instinct as Brian on this - it doesn't seem
> > unreasonable for Nova to publish CPU/disk/NIC samples at the same
> > interval as other notifications.
>
> So IIUC the other notifications produced by nova are not really on
> a fine-grained periodic basis, being either triggered by lifecycle
> events (e.g. an instance is updated in some way) or else are very
> infrequent (e.g. the instance audit logic runs at most hourly).
>
> But in any case, one concern with nova-compute emitting these data
> as either notifications or else making direct calls into a ceilo-
> provided library was that such a loaded service is likely to
> run into timeliness issues.
If the data is purely for metering, the timeliness issue isn't a concern,
right?
> > What might be confusing this whole discussion is the discussion
> > around metering and instrumentation requirements. I think we should
> > keep the two concerns completely separate for now and plough ahead
> > with what we think makes sense for metering.
>
> There's been a fair amount of fuzziness around the distinction
> between instrumentation and monitoring, but this falls into the
> latter camp for me. It's not so much about the internal dynamics
> of nova, as the user-visible behavior (e.g. is my instance running
> hot or am I pushing loads of data through the network, as opposed
> to how long did that API call to update the instance take to service
> end-to-end or how many connections are in that pool).
Metering vs monitoring/instrumentation, not monitoring vs
instrumentation :)
If we were just designing a solution for metering, would we go for the
notifications option?
Cheers,
Mark.
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev