[openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling (was: New DB column or new DB table?)

Day, Phil philip.day at hp.com
Fri Jul 19 12:47:07 UTC 2013

> -----Original Message-----
> From: Sean Dague [mailto:sean at dague.net]
> Sent: 19 July 2013 12:04
> To: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics
> collector for scheduling (was: New DB column or new DB table?)
> On 07/19/2013 06:18 AM, Day, Phil wrote:
> > Ceilometer is a great project for taking metrics available in Nova and other
> systems and making them available for use by Operations, Billing, Monitoring,
> etc - and clearly we should try and avoid having multiple collectors of the same
> data.
> >
> > But making the Nova scheduler dependent on Ceilometer seems to be the
> wrong way round to me - scheduling is such a fundamental operation that I
> want Nova to be self sufficient in this regard.   In particular I don't want the
> availability of my core compute platform to be constrained by the availability
> of my (still evolving) monitoring system.
> >
> > If Ceilometer can be fed from the data used by the Nova scheduler then that's
> a good plus - but not the other way round.
> I assume it would gracefully degrade to the existing static allocators if
> something went wrong. If not, well that would be very bad.
> Ceilometer is an integrated project in Havana. Utilization based scheduling
> would be a new feature. I'm not sure why we think that duplicating the metrics
> collectors in new code would be less buggy than working with Ceilometer. Nova
> depends on external projects all the time.
> If we have a concern about robustness here, we should be working as an overall
> project to address that.
> 	-Sean
Just to be clear, it's about a lot more than just robustness in the code - it's the whole architectural pattern of putting Ceilometer at the centre of Nova scheduling that concerns me.

As I understand it, Ceilometer can collect metrics from more than one copy of Nova - which is good; I want to run multiple independent copies in different regions and I want to have all of my monitoring data going back to one place.  However, that doesn't mean that I now also want all of those independent copies of Nova depending on that central monitoring infrastructure for something as basic as scheduling.  (I don't want to stop anyone who does want that, either - but I don't see why I should be forced down that route.)

The original change that sparked this debate came not from anything to do with utilisation based scheduling, but from the pretty basic and simple desire to add new types of consumable resource counters into the scheduler logic in a more general way than having to make a DB schema change.  This was generally agreed to be a good thing, and it pains me to see that valuable work now blocked on what seems to be turning into a strategic discussion around the role of Ceilometer (is it a monitoring tool or a fundamental metric bus, etc.).

At the point where Ceilometer can be shown to replace the current scheduler resource mgmt code in Nova, then we should be talking about switching to it - but in the meantime, why can't we continue to have incremental improvements in the current Nova code?


