<div dir="ltr">Okay, the coffee kicked in.<div><br></div><div>I can see how my comment could be interpreted that way, so let me take a step back and explain my perspective.</div><div><br></div><div>Amazon was the first to implement a commercial-grade cloud IaaS, and OpenStack was developed as an alternative. If we avoided reinventing the wheel as a rule, OpenStack would never have been written. That's how I see it. Automatic failover is already done by VMware. If avoiding reinvention were our guide to implementing new features, we'd set up a product referral partnership with VMware, tell our users that HA requires VMware, dust off our hands and say job well done. No one here is saying that, but that's the mindset I think I'm hearing. I champion the in-house approach not to develop something that doesn't exist elsewhere, or for the sake of control, but because we don't want to be tied to a single external product for a core feature of OpenStack.</div><div><br></div><div>When ProductA + ProductB = XYZ, it creates a one-way dependency of the kind I have historically tried to avoid, because if ProductA = OpenStack, ProductB is no longer optional.</div><div><br></div><div>Personally, I'm speaking more to how we scope features for OpenStack than to whether we use Pacemaker, Nagios, Nova, Heat or something else.</div><div><br></div><div>Question: is host HA not achievable using the programs we have in place now (with modification, of course)? If not, I'd still champion getting it done within our four walls.</div><div><br></div><div>Just my 10c or so. 
; )</div><div><br></div></div><div class="gmail_extra"><br clear="all"><div><div dir="ltr"><div><font><div style="font-family:arial;font-size:small"><b><i><br>Adam Lawson</i></b></div><div><font><font color="#666666" size="1"><div style="font-family:arial"><br></div><div style="font-family:arial;font-size:small">AQORN, Inc.</div><div style="font-family:arial;font-size:small">427 North Tatnall Street</div><div style="font-family:arial;font-size:small">Ste. 58461</div><div style="font-family:arial;font-size:small">Wilmington, Delaware 19801-2230</div><div style="font-family:arial;font-size:small">Toll-free: (844) 4-AQORN-NOW ext. 101</div><div style="font-family:arial;font-size:small">International: +1 302-387-4660</div></font><font color="#666666" size="1"><div style="font-family:arial;font-size:small">Direct: +1 916-246-2072</div></font></font></div></font></div><div style="font-family:arial;font-size:small"><img src="http://www.aqorn.com/images/logo.png" width="96" height="39"><br></div></div></div>
<br><div class="gmail_quote">On Thu, Oct 16, 2014 at 10:53 AM, Florian Haas <span dir="ltr">&lt;<a href="mailto:florian@hastexo.com" target="_blank">florian@hastexo.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Thu, Oct 16, 2014 at 7:03 PM, Adam Lawson &lt;<a href="mailto:alawson@aqorn.com">alawson@aqorn.com</a>&gt; wrote:<br>
><br>
> Be forewarned; here's my two cents before I've had my morning coffee.<br>
><br>
> It would seem to me that if we are seeking some level of resiliency against host failures (if a host fails, evacuate the instances that were hosted on it to a host that isn't broken), host HA is a good approach. The ultimate goal of course is instance HA, but monitoring individual instances and determining what constitutes "down" seems like a much more complex task than detecting when a compute node is down. I know that requiring agents needs some more brain-cycles, since we can't count on an agent consuming additional memory inside each individual VM.<br>
<br>
</span>What Russell is suggesting, though, is actually a very feasible<br>
approach for compute node HA today and per-instance HA tomorrow.<br>
<span class=""><br>
> Additionally, I'm not really hung up on the 'how', as we all realize there are several ways to skin that cat, so long as that 'how' relies on tools over which we have control and direct influence. The reason is that we may not want to build features as important as this on tools that change outside our control and subsequently shift the foundation of a feature we implemented based on how the product USED to work. Basically, if Pacemaker does what we need then cool, but it seems that a feature like this should be built upon a bedrock of programs over which we have direct influence.<br>
<br>
</span>That almost sounds a bit like "let's always build a better wheel,<br>
because control". I'm not sure if that's indeed the intention, but if<br>
it is then that seems like a bad idea to me.<br>
<span class=""><br>
> This is why Nagios may be able to do it but it's a hack at best. I'm not saying Nagios isn't good or the hack doesn't work, but in the context of an OpenStack solution, we can't require a single external tool for a feature like host or VM HA. Are we suggesting that we tell people who want HA - "go use Nagios"? Call me a purist but if we're going to implement a feature, it should be our community implementing it because we have some of the best minds on staff. ; )<br>
<br>
</span>Anyone who thinks that having a monitoring solution to page people and<br>
then waking up a human to restart the service constitutes HA needs to<br>
be doused in a bucket of ice water. :)<br>
<br>
Cheers,<br>
Florian<br>
<div class="HOEnZb"><div class="h5"><br>
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
</div></div></blockquote></div><br></div>