[openstack-dev] [Nova] Automatic evacuate

Russell Bryant rbryant at redhat.com
Thu Oct 16 17:52:28 UTC 2014


On 10/16/2014 01:03 PM, Adam Lawson wrote:
> Be forewarned; here's my two cents before I've had my morning coffee. 
> 
> It would seem to me that if we were seeking some level of resiliency
> against host failures (if a host fails, evacuate the instances that were
> hosted on it to a host that isn't broken), it would seem that host HA is
> a good approach. The ultimate goal of course is instance HA but the task
> of monitoring individual instances and determining what constitutes
> "down" seems like a much more complex task than detecting when a compute
> node is down. I know that requiring the presence of agents should
> probably need some more brain-cycles since we can't expect additional
> bytes consuming memory on each individual VM.
> 
> Additionally, I'm not really hung up on the 'how' as we all realize
> there are several ways to skin that cat, so long as that 'how' is leveraged
> via tools over which we have control and direct influence. Reason being,
> we may not want to leverage features as important as this on tools that
> change outside our control and subsequently shift the foundation of the
> feature we implemented that was based on how the product USED to work.
> Basically if Pacemaker does what we need then cool but it seems that
> implementing a feature should be built upon a bedrock of programs over
> which we have a direct influence. This is why Nagios may be able to do
> it but it's a hack at best. I'm not saying Nagios isn't good or the
> hack doesn't work but in the context of an OpenStack solution, we can't
> require a single external tool for a feature like host or VM HA. Are we
> suggesting that we tell people who want HA - "go use Nagios"? Call me a
> purist but if we're going to implement a feature, it should be our
> community implementing it because we have some of the best minds on
> staff. ; )

I think you just gave a great example of "NIH".  :-)

I was saying "use Pacemaker", not "use Nagios".  I'm not aware of
fencing integration with Nagios, but it's feasible.  The key point I've
been making is "this is very achievable today as a function of the
infrastructure supporting an OpenStack deployment".  I'd also like to
work on some more detailed examples of doing so.

FWIW, there are already very good relationships between OpenStack
community members and the Pacemaker team.  I'm really not concerned
about that at all.

-- 
Russell Bryant


