<div dir="ltr"><div><div><div>I can confirm that after upgrading to ceilometer havana 2013.2.3 on SLES 11 SP3, these problems have been fixed. Specifically:<br><br></div>- restarting a ceilometer service with a long name no longer kills its siblings (processes with similar names), and no longer leaves a duplicate of itself running<br>
</div>- rebooting the host no longer causes ceilometer-api to quit or ceilometer-collector to fail<br><br></div>I will keep watching these problems just in case, but for now everything looks good.<br><div><div><div><br>Thanks for your great work!<br>
</div></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Mar 5, 2014 at 12:30 PM, ZhiQiang Fan <span dir="ltr">&lt;<a href="mailto:aji.zqfan@gmail.com" target="_blank">aji.zqfan@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div><div><div><div><div><div>Hi SUSE developers,<br><br></div>Thanks for your great work on packaging OpenStack for the SUSE distributions.<br>
<br>I have tested some ceilometer functionality on SLES 11 SP3 and found two problems. Both are small and easy to fix, but they can cause critical availability failures. Both are located in the service init scripts.<br>
<br></div>* {start,check,kill}proc match programs by process basename<br></div> This problem breaks the alarm services, because ceilometer-alarm-evaluator and ceilometer-alarm-notifier share an identical prefix of more than 15 characters. It also breaks the agent services in an all-in-one scenario (ceilometer-agent-central &amp; ceilometer-agent-compute).<br>
</div> Dirk Muller has provided a fix that uses the -p option to ensure killproc does not affect other processes. I verified it on SLES 11 SP3 in my all-in-one environment: it no longer kills other processes, but it now fails to kill its own process. As a result, each time I restart a ceilometer-alarm-* service, I get a new process in addition to the old one instead of a replacement.<br>
</div> I have an ugly workaround that simply shortens the ceilometer process names, and the services still work fine with it. But this problem needs a better fix upstream.<br><br></div>* ceilometer-{api,collector} depend on mongodb<br>
</div> mongodb is fully supported in havana (thanks to the SUSE developers, who backported metaquery for the SQL backend, even though that needs improvement too), but I have found that ceilometer-{api,collector} do not behave normally when the host boots. Both complain that they cannot connect to the database: the api process quits, while the collector process stays in a broken state and never recovers, even once mongodb becomes available after boot.<br>
</div> The cause may be quite simple: although both services declare mongodb as a Should-Start service, at boot mongodb may already be started but not yet available, which causes the two services to fail. I have no idea how to solve this nicely, but if I just sleep 5 seconds before calling startproc for the service, everything seems fine in my small environment. This workaround is ugly too, since it sleeps on every start, not only at host boot.<br>
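<div>A minimal sketch of an alternative to the fixed 5-second sleep: poll until the database port accepts TCP connections, then start the service. The helper name wait_for_port and its parameters are hypothetical, and an accepted connection only shows that mongod is listening, not that it can already serve queries.<br></div>
<pre>
```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server is at least listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```
</pre>
<div>An init script could run something like wait_for_port("127.0.0.1", 27017) before startproc; 27017 is MongoDB's default port and is an assumption about this deployment. Unlike the unconditional sleep, this adds almost no delay when mongodb is already accepting connections.<br></div>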
<br></div>If you need any more detail, I can provide it. These two problems should be taken seriously (and ideally fixed quickly), since they strongly affect feature availability and user experience.<br><br></div><div>Thanks<br></div></div>
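<div>To illustrate the first problem in this report: assuming the init helpers compare only the first 15 characters of a process name, as the message describes, the following sketch shows which ceilometer service names become indistinguishable. colliding_names is a hypothetical helper, not part of any init tooling.<br></div>
<pre>
```python
from collections import defaultdict

# Assumption: the matcher sees only the first 15 characters of each name,
# as described in the report.
NAME_LIMIT = 15


def colliding_names(names, limit=NAME_LIMIT):
    """Group names that become identical once truncated to limit characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:limit]].append(name)
    # Keep only truncated names shared by more than one service.
    return {
        prefix: sorted(group)
        for prefix, group in groups.items()
        if len(group) > 1
    }


services = [
    "ceilometer-alarm-evaluator",
    "ceilometer-alarm-notifier",
    "ceilometer-agent-central",
    "ceilometer-agent-compute",
    "ceilometer-api",
    "ceilometer-collector",
]
collisions = colliding_names(services)
```
</pre>
<div>Here collisions maps "ceilometer-alar" to the two alarm services and "ceilometer-agen" to the two agent services, while ceilometer-api and ceilometer-collector remain unique.<br></div>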
</blockquote></div><br><br clear="all"><br>-- <br><div dir="ltr"><div>blog: <a href="http://zqfan.github.com" target="_blank">zqfan.github.com</a><br></div>git: <a href="http://github.com/zqfan" target="_blank">github.com/zqfan</a><br>
</div>
</div>