<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Nov 11, 2014 at 12:13 PM, Angus Salkeld <span dir="ltr">&lt;<a href="mailto:asalkeld@mirantis.com" target="_blank">asalkeld@mirantis.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div><div><div>Hi all<br><br></div>The HA session was really well attended and I'd like to give some feedback from the session.<br><br></div>Firstly there is some really good content here: <a href="https://etherpad.openstack.org/p/kilo-crossproject-ha-integration" target="_blank">https://etherpad.openstack.org/p/kilo-crossproject-ha-integration</a><br></div><div><br></div>1. We SHOULD provide better health checks for OCF resources (<a href="http://linux-ha.org/wiki/OCF_Resource_Agents" target="_blank">http://linux-ha.org/wiki/OCF_Resource_Agents</a>). <br>These should be fast and reliable. We should probably bike shed on some convention like "&lt;project&gt;-manage healthcheck"<br></div><div>and then roll this out for each project.<br></div><div><br></div>2. We should really move <a href="https://github.com/madkiss/openstack-resource-agents" target="_blank">https://github.com/madkiss/openstack-resource-agents</a> to stackforge or openstack if the author is agreeable to it (it's referred to in our official docs).<br><br></div></div></div></div></blockquote><div><br></div><div>I have chatted to the author of this repo and he is happy for it to live under stackforge or openstack, or for each OCF resource to go into its respective project.<br></div><div>Does anyone have any particular preference? I suspect stackforge will be the path of least resistance.<br><br></div><div>-Angus<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>3. 
<span>All services </span><span>SHOULD</span><span> </span><span>support</span><span> Active/Active</span><span> configurations<br></span></div><div><span>    (better scaling and it's always tested)<br></span></div><div><span><br></span></div><span>4. We should be testing HA (there are a number of ideas on the etherpad about this)<br><br></span></div><span>5. Many services </span>do not<span><span> recover in the case of failure mid-task<br></span></span></div><span><span>    This seems like a big problem to me (some leave the DB in a mess). Someone linked to an interesting article (</span></span><br><span>crash-only-software: </span><span><a href="http://lwn.net/Articles/191059/" target="_blank">http://lwn.net/Articles/191059/</a>)</span><span> that suggests that if we do this correctly we should not need the concept of clean shutdown.<br></span><div><span><span>     (<a href="https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L459-L471" target="_blank">https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L459-L471</a>)<br></span></span></div><div><span><span>     I'd be interested in how people think this needs to be approached (just raise bugs for each?).<br></span></span></div><div><span><span><br></span></span></div><div><span><span>Regards<span class="HOEnZb"><font color="#888888"><br></font></span></span></span></div><span class="HOEnZb"><font color="#888888"><div><span><span>Angus<br></span></span></div></font></span></div>

</blockquote></div><br></div></div>