[openstack-dev] [Ceilometer] [TripleO] adding process/service monitoring

Ladislav Smola lsmola at redhat.com
Tue Jan 28 17:49:55 UTC 2014


Hello,

excellent, this is exactly what we need in Tuskar. :-)

It might be good to monitor this via SNMPD, since that daemon will
already be running on each node. As far as I can see it should be possible,
though it is not a very popular approach.
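
A rough sketch of what reading process status over SNMP might look like,
assuming snmpd is configured with Net-SNMP "proc" directives so that the
UCD-SNMP-MIB process table (prNames, prErrorFlag) is populated. The sample
text is illustrative snmpwalk-style output, not captured from a real node,
and parse_pr_table is a hypothetical helper:

```python
import re

# Illustrative snmpwalk-style output for the UCD-SNMP-MIB process table.
# This is made-up sample text, not output from a real node.
SAMPLE_WALK = """\
UCD-SNMP-MIB::prNames.1 = STRING: nova-compute
UCD-SNMP-MIB::prNames.2 = STRING: neutron-openvswitch-agent
UCD-SNMP-MIB::prErrorFlag.1 = INTEGER: noError(0)
UCD-SNMP-MIB::prErrorFlag.2 = INTEGER: error(1)
"""

# Captures the column name, the row index and the value of each line.
LINE_RE = re.compile(r"^UCD-SNMP-MIB::(pr\w+)\.(\d+) = \w+: (.*)$")


def parse_pr_table(walk_text):
    """Return {process_name: is_healthy} from snmpwalk-style text."""
    rows = {}
    for line in walk_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        column, index, value = match.groups()
        rows.setdefault(index, {})[column] = value
    status = {}
    for row in rows.values():
        name = row.get("prNames")
        if name is None:
            continue
        # Treat anything other than an explicit noError as unhealthy.
        status[name] = row.get("prErrorFlag", "").startswith("noError")
    return status


failed = sorted(name for name, ok in parse_pr_table(SAMPLE_WALK).items() if not ok)
print(failed)
```

In a real inspector the text would come from an SNMP query against the node
rather than from a constant.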

Then it would be nice to have the data stored in Ceilometer, as
it provides a generic backend for storing samples and querying them
(it would be nice to have a history of those samples). It should be enough
to send it in the correct format to the notification bus, and Ceilometer
will store it.
For now, Tuskar would just grab it from Ceilometer.
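
To make "the correct format" a bit more concrete, here is a minimal sketch of
a message a checker might build before putting it on the bus. The field names
are assumptions loosely modeled on Ceilometer's sample attributes; the meter
name "service.status" and the helper build_service_sample are hypothetical,
and the exact schema Ceilometer accepts would need to be checked. Publishing
to the bus is left out on purpose:

```python
import json
from datetime import datetime, timezone


def build_service_sample(node_id, service, running, now=None):
    """Build a gauge-style sample dict for one service on one node.

    All field names here are illustrative assumptions.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "name": "service.status",       # hypothetical meter name
        "type": "gauge",
        "unit": "status",
        "volume": 1 if running else 0,  # 1 = running, 0 = failed
        "resource_id": node_id,
        "timestamp": now.isoformat(),
        "resource_metadata": {"service": service},
    }


sample = build_service_sample("node-1", "nova-compute", running=False)
print(json.dumps(sample, indent=2, sort_keys=True))
```

Storing one gauge per (node, service) pair would keep the history queryable
per service, which is what Tuskar would want to read back.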

The problem here is that every node can have different services running,
so you would have to write some smart inspector that knows what
is running where. We have been talking about exposing this kind of
information in Glance, so it would return the list of services for an image.
Then you would get the list of nodes for that image and poll them via SNMP.
This could probably be an inspector of the central agent, the same approach
as for getting the baremetal metrics.
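
The flow described above (services per image, nodes per image, then poll each
node) could be sketched roughly like this. The two lookup tables stand in for
Glance and node inventory queries that do not exist in this form today, the
image names and addresses are made up, and poll_node is a hypothetical
callable where the SNMP query would go:

```python
# Stand-ins for data that would come from Glance and from the node inventory.
# Both mappings are hypothetical; no such API is assumed to exist today.
SERVICES_BY_IMAGE = {
    "overcloud-control": ["nova-api", "glance-api"],
    "overcloud-compute": ["nova-compute"],
}
NODES_BY_IMAGE = {
    "overcloud-control": ["10.0.0.10"],
    "overcloud-compute": ["10.0.0.20", "10.0.0.21"],
}


def build_poll_plan(services_by_image, nodes_by_image):
    """Return {node: [services expected on that node]}."""
    plan = {}
    for image, services in services_by_image.items():
        for node in nodes_by_image.get(image, []):
            plan.setdefault(node, []).extend(services)
    return plan


def inspect(plan, poll_node):
    """Yield (node, service, running) using the supplied poll function."""
    for node, services in sorted(plan.items()):
        for service in services:
            yield node, service, poll_node(node, service)


def fake_poll(node, service):
    # Placeholder for an SNMP query; pretends one compute node has failed.
    return not (node == "10.0.0.21" and service == "nova-compute")


plan = build_poll_plan(SERVICES_BY_IMAGE, NODES_BY_IMAGE)
for node, service, running in inspect(plan, fake_poll):
    print(node, service, "up" if running else "DOWN")
```

Keeping the poll function as a parameter means the same inspector loop could
be reused if something other than SNMP ends up doing the actual check.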

Does it sound reasonable? Or do you see some critical flaws in this
approach? :-)

Kind Regards,
Ladislav



On 01/28/2014 02:59 AM, Richard Su wrote:
> Hi,
>
> I have been looking into how to add process/service monitoring to
> tripleo. Here I want to be able to detect when an openstack dependent
> component that is deployed on an instance has failed. And when a failure
> has occurred I want to be notified and eventually see it in Tuskar.
>
> Ceilometer doesn't handle this particular use case today. So I have been
> doing some research and there are many options out there that provide
> process checks: nagios, sensu, zabbix, and monit. I am a bit wary of
> pulling one of these options into tripleo. There are some increased
> operational and maintenance costs when pulling in each of them. And
> physical device monitoring is currently in the works for Ceilometer,
> lessening the need for some of the other abilities that another
> monitoring tool would provide.
>
> For the particular use case of monitoring processes/services, at a high
> level, I am considering writing a simple daemon to perform the check.
> Checks and failures are written out as messages to the notification bus.
> Interested parties like Tuskar or Ceilometer can subscribe to these
> messages.
>
> In general does this sound like a reasonable approach?
>
> There is also the question of how to configure or figure out which
> processes we are interested in monitoring. I need to do more research
> here but I'm considering either looking at the elements listed by
> diskimage-builder or looking at the orc post-configure.d scripts to
> find services that are restarted.
>
> I welcome your feedback and suggestions.
>
> - Richard Su



