[openstack-dev] [Ceilometer] [TripleO] adding process/service monitoring

Richard Su rwsu at redhat.com
Tue Jan 28 01:59:34 UTC 2014


Hi,

I have been looking into how to add process/service monitoring to
TripleO. I want to be able to detect when an OpenStack-dependent
component deployed on an instance has failed. When a failure occurs,
I want to be notified and eventually see it in Tuskar.

Ceilometer doesn't handle this particular use case today. I have been
doing some research, and there are many options out there that provide
process checks: Nagios, Sensu, Zabbix, and Monit. I am a bit wary of
pulling one of these into TripleO, since each of them brings increased
operational and maintenance costs. Physical device monitoring is also
currently in the works for Ceilometer, which lessens the need for some
of the other abilities that another monitoring tool would provide.

For the particular use case of monitoring processes/services, at a high
level, I am considering writing a simple daemon to perform the checks.
Check results and failures would be written out as messages to the
notification bus. Interested parties like Tuskar or Ceilometer could
subscribe to these messages.
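
To make the idea concrete, here is a minimal sketch of such a daemon.
It is only an illustration under stated assumptions: the event fields,
the event_type strings, and the publish callback are all hypothetical,
and publish stands in for whatever notifier the real bus would use.
The default probe relies on the pgrep command being available.

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, List


def pgrep_probe(name: str) -> bool:
    """Return True if a process with exactly this name is running."""
    result = subprocess.run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_checks(
    services: Iterable[str],
    probe: Callable[[str], bool],
    publish: Callable[[Dict], None],
    host: str = "localhost",
    now: Callable[[], float] = time.time,
) -> List[Dict]:
    """Check each service once and publish one message per check."""
    events = []
    for name in services:
        running = probe(name)
        # Hypothetical message layout; a real one would follow the
        # conventions of the notification bus in use.
        event = {
            "event_type": "service.check.ok" if running else "service.check.failed",
            "host": host,
            "service": name,
            "timestamp": now(),
        }
        publish(event)
        events.append(event)
    return events


def monitor(services, probe=pgrep_probe, publish=print, interval=30.0):
    """Run the checks forever, sleeping between rounds."""
    while True:
        run_checks(services, probe, publish)
        time.sleep(interval)
```

Keeping the probe and the publisher as plain callables means the check
loop can be exercised without a running bus, and the transport can be
swapped later without touching the checks themselves.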

In general does this sound like a reasonable approach?

There is also the question of how to configure or figure out which
processes we are interested in monitoring. I need to do more research
here, but I'm considering either looking at the elements listed by
diskimage-builder or looking at the orc post-configure.d scripts to
find services that are restarted.
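
As a rough sketch of the second option, the scripts could be scanned
for restart commands. This assumes the scripts restart services with
lines of the form "service NAME restart", which may not hold for every
element. The directory is passed in by the caller, and the function
name and pattern are hypothetical.

```python
import re
from pathlib import Path
from typing import Set

# Assumed convention: an uncommented line of the form
# "service NAME restart". Other restart mechanisms are not detected.
RESTART_PATTERN = re.compile(
    r"^[ \t]*service[ \t]+([\w.@-]+)[ \t]+restart\b", re.MULTILINE
)


def find_restarted_services(script_dir: str) -> Set[str]:
    """Collect service names restarted by scripts in script_dir."""
    found: Set[str] = set()
    for path in sorted(Path(script_dir).iterdir()):
        if path.is_file():
            text = path.read_text(errors="replace")
            found.update(RESTART_PATTERN.findall(text))
    return found
```

The resulting set could then be used as the list of services handed to
the monitoring daemon.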

I welcome your feedback and suggestions.

- Richard Su


