[openstack-dev] [openstack][SUSE][Ceilometer] ceilometer service init script can be improved

ZhiQiang Fan aji.zqfan at gmail.com
Wed Mar 5 04:30:52 UTC 2014


Hi SUSE developers,

Thanks for your great work on OpenStack packaging for the SUSE distributions.

I have tested some functionality of ceilometer on SLES 11 SP3 and found two
minor problems. They can cause critical availability problems, but they are
easy to fix. Both are located in the service init scripts.

* {start,check,kill}proc match the program by process basename
  This problem blocks the alarm services, since the names
ceilometer-alarm-evaluator and ceilometer-alarm-notifier have more than 15
identical leading characters. It also blocks the agent services in an
all-in-one scenario (ceilometer-agent-central & ceilometer-agent-compute).
  Dirk Muller has provided a fix which uses the -p option to ensure that
killproc will not affect another process. I verified it on SLES 11 SP3 in
my all-in-one environment and found that it no longer kills other
processes, but it can no longer kill its own process either. This means
that each time I restart ceilometer-alarm-*, I get a new process that does
not replace the old one.
  I have an ugly workaround which simply shortens the ceilometer process
names, and the services still work fine. But this problem needs to be fixed
upstream with a better solution.

* ceilometer-{api,collector} depend on mongodb
  mongodb has full feature support in havana (thanks to the SUSE developers,
who backported metaquery for the sql backend, even though it needs to be
improved too), but I have found that ceilometer-{api,collector} do not
behave normally when the host boots. Both complain that they cannot connect
to the database. The api process quits, while the collector process stays
broken and cannot recover by itself, even once mongodb is available after
boot.
  The cause may be quite simple: even though the two services specify
mongodb as a should-start service, when the host boots mongodb may already
have been started but still be in an unavailable state, which causes the two
services to fail. I have no idea how to solve this problem in a nice way,
but I just sleep 5 seconds before calling startproc for the service's
process, and then everything seems fine in my small environment. This
workaround is ugly too, since it sleeps on every start, not only at host
boot.
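
To illustrate the first problem, here is a minimal Python sketch. It assumes
that the process lookup compares names truncated to 15 characters, which is
the length of the Linux kernel's per-task command name. The helper names
(comm_name, colliding) and the COMM_LEN constant are hypothetical and exist
only for this illustration; this is not the killproc implementation.

```python
# Sketch only: assumes process names are compared after truncation to
# 15 characters, as with the kernel's per-task command name.
COMM_LEN = 15  # assumption made for this illustration


def comm_name(name):
    """Return the name as it looks after truncation to COMM_LEN characters."""
    return name[:COMM_LEN]


def colliding(names):
    """Group service names that become identical after truncation."""
    groups = {}
    for name in names:
        groups.setdefault(comm_name(name), []).append(name)
    # Keep only the truncated names shared by more than one service.
    return {short: full for short, full in groups.items() if len(full) > 1}


SERVICES = [
    "ceilometer-alarm-evaluator",
    "ceilometer-alarm-notifier",
    "ceilometer-agent-central",
    "ceilometer-agent-compute",
    "ceilometer-api",
    "ceilometer-collector",
]

if __name__ == "__main__":
    for short, full in sorted(colliding(SERVICES).items()):
        print(short, "<-", ", ".join(full))
```

Under that assumption the two alarm services collapse to the same truncated
name, and so do the two agent services, while ceilometer-api and
ceilometer-collector stay distinct.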

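For the second problem, one possible alternative to a fixed 5 second sleep
is to poll until the database port accepts connections. The sketch below is
only an illustration under assumptions: the wait_for_port helper is
hypothetical, mongodb is assumed to listen on its default port 27017 on the
local host, and an accepted TCP connection is treated as "available", which
may be too optimistic for a real deployment.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout` seconds
    have passed. Retries every `interval` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect is taken as "the service is reachable".
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An init script could call something like wait_for_port("127.0.0.1", 27017)
before startproc. When mongodb is already reachable the call returns right
away, so a normal restart would not pay the delay that a fixed sleep adds.
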
If you need any details, I can provide more. These two problems need to be
taken seriously and fixed (maybe quickly), since they strongly affect feature
availability and user experience.

Thanks