[openstack-dev] [nova][ceilometer] model for ceilo/nova interaction going forward

Eoghan Glynn eglynn at redhat.com
Sat Nov 17 20:16:52 UTC 2012



> >> I would *never* assume anything in user space. I think monitoring
> >> of the users' instances is out of scope for all things OpenStack.
> >> The users might deploy whatever tools they like for checking i/o,
> >> disk, network, etc.
> >
> >That's an interesting point of view, one that I hadn't considered
> >before.
> >
> >I would see AWS CloudWatch as a user-oriented monitoring service,
> >and would have assumed that there is scope for something similar
> >to be part of openstack.
> >
> >When you say its out of scope, do you mean that it shouldn't be
> >something addressed by the IaaS fabric, and should instead be
> >something that users bolt on top by running agents *within* their
> >instances?
> >
> >(as opposed to some piece of the openstack infrastructure doing
> > this monitoring from *outside* the instance)
> >
> >I recall the Heat team ran into some complications on this whole
> >within-versus-without question, as the openstack RBAC mechanisms
> >aren't yet flexible enough to easily grant in-instance agents
> >credentials giving limited roles such as the ability to report
> >metrics to a CW-like metricstore.
> 
> Well, now we're getting to the crux of the matter and one that I
> brought up in the IRC meeting yesterday. My concern is that
> ceilometer is becoming a kitchen sink for this stuff.


OK, I can see how the recent expansion of the ceilometer project's
mandate might be viewed as "mission creep" from one perspective.

However, my own view is that we need to be more than a toolbox of
low-level tools to extract data from the IaaS fabric; ceilometer can
and should include complete services for reasoning over and
leveraging these data (whether that be the ceilometer-api service
exposing rich views over the metering store, or a CloudWatch clone
providing access to aggregated metrics & alarming over same).

That said, I think we can fulfill both roles - i.e. provide a home
for tools that could be used to compose higher-level services *and*
also provide implementations of some of those services.
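As a rough illustration of what "alarming over aggregated metrics" could look like, here is a minimal sketch in Python. The `aggregate` and `evaluate_alarm` helpers, their signatures, and the threshold semantics are my own assumptions for illustration, not ceilometer's or CloudWatch's actual API:

```python
from statistics import mean

def aggregate(samples, period):
    """Group (timestamp, value) samples into fixed-width periods and
    average each bucket; hypothetical helper, not a real ceilometer call."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(ts // period, []).append(value)
    return [mean(buckets[k]) for k in sorted(buckets)]

def evaluate_alarm(samples, threshold, period=60, evaluation_periods=3):
    """Return 'ALARM' if the last N period-averages all exceed threshold,
    roughly the CloudWatch-style evaluation described above."""
    averages = aggregate(samples, period)
    recent = averages[-evaluation_periods:]
    if len(recent) == evaluation_periods and all(a > threshold for a in recent):
        return "ALARM"
    return "OK"

# Example: CPU-utilisation samples as (epoch seconds, percent)
samples = [(0, 40), (30, 50), (60, 85), (90, 95), (120, 90), (150, 92)]
print(evaluate_alarm(samples, threshold=80, evaluation_periods=2))
```

The point of the sketch is only that such a service reasons over data already flowing through the metering store, rather than collecting anything new itself.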

I also like your idea of ceilo being the honest broker that drives
relevant requirements within the core projects.

Cheers,
Eoghan


> Two summits ago the mandate was clear: "There is no billing, we need
> billing, so let's build this ..." (I remember because I suggested
> looking at Yagi/StackTach/Tach back then). This was also the impetus
> for the "integration" proposal, as I saw the scope widening: to have
> ceilometer become a set of low-level tools that can be used to get
> data out of and away from core OpenStack services, and a basis for
> other tools to build upon (Heat, CloudWatch/Synaps, StackTach,
> etc.). Tools of this sort are vitally important to a successful
> OpenStack deployment, but it should be mix-and-match or "your
> mileage may vary".
> 
> I think ceilometer should be a smaller, more tightly focused
> collection of utilities, vs. trying to be all things to all people.
> 
> If a project like Heat runs into problems with something like the
> RBAC mechanism or the polling interval from Compute, it would be the
> Ceilometer team's job to broker a solution with Core and expose that
> solution to everyone.
> 
> The Rackspace Linux/Windows agents for Xen are open sourced:
> https://github.com/rackspace/openstack-guest-agents-windows-xenserver
> https://github.com/rackspace/openstack-guest-agents-unix
> 
> That might be a starting point for an agent-level api that can feed
> into that eco-system? But again, I think it's a separate problem ...
> perhaps even a separate project?
> 
> >> For the deployer of OpenStack, yes, they will want to check if
> >> the services are running hot, falling behind on queued i/o, or
> >> whether the public OpenStack API load balancer is spitting out
> >> 503's. My differentiation between instrumentation and monitoring
> >> in these cases is related to the sample rate and the size of the
> >> payload. 503's, i/o, etc. I would view as instrumentation:
> >> sampled frequently and with little payload. This is the classic
> >> statsd/graphite data, shown in a big graph on the wall of the
> >> Network Operations Center. Monitoring would be larger, slower,
> >> chunkier data for capacity planning, etc. Lifecycle state falls
> >> into the monitoring camp as well: "Are things progressing as
> >> expected? Or will they be a problem down the road?"
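[For concreteness, the instrumentation style described in the quote above (frequent samples, tiny payloads) is exactly what the statsd wire protocol provides; a minimal emitter might look like the following sketch. The `StatsdClient` class and the metric names are illustrative assumptions, not part of any OpenStack service:

```python
import socket

def format_metric(name, value, metric_type="c"):
    """Render a metric in the plain-text statsd wire format,
    e.g. 'nova.api.http_503:1|c'."""
    return f"{name}:{value}|{metric_type}"

class StatsdClient:
    """Fire-and-forget UDP emitter: tiny payloads, cheap enough to call
    at high sample rates from a hot code path."""

    def __init__(self, host="127.0.0.1", port=8125):
        self.addr = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def incr(self, name, count=1):
        # Counter: e.g. one increment per 503 returned by the API LB.
        self.sock.sendto(format_metric(name, count, "c").encode(), self.addr)

    def timing(self, name, ms):
        # Timer: per-request latency in milliseconds.
        self.sock.sendto(format_metric(name, ms, "ms").encode(), self.addr)

client = StatsdClient()
client.incr("nova.api.http_503")       # instrumentation: frequent, tiny
client.timing("nova.api.request", 42)
```

Being UDP and fire-and-forget, the emitter never blocks the instrumented service, which is what makes this style viable at high sample rates.]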
> >
> >OK, so this cloud-operator-oriented view is also crucial, but I'd
> >be leery about making this the entire focus of our efforts (i.e. by
> >assuming that users will sort themselves out with their own
> >monitoring solution).
> 
> Yeah, it's a tricky line to cross ... stepping into user-space. There
> are a raft of 3rd party companies I'm sure would want to have a say
> in how this happens. So many OpenStack startups are centered around
> this very problem.
> 
> >Cheers,
> >Eoghan
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 


