[openstack-dev] blueprint proposal nova-compute fencing for HA ?

Leen Besselink ubuntu at consolejunkie.net
Tue Apr 23 14:39:04 UTC 2013


On Tue, Apr 23, 2013 at 10:08:19AM -0400, Russell Bryant wrote:
> On 04/23/2013 03:31 AM, Leen Besselink wrote:
> >> I was only talking about the fencing off a compute node part, since
> >> that's what you started the thread with.  :-)
> > 
> > I know I'm going in circles, just trying to get a feel for the best way to handle it.
> > 
> >>
> >> Presumably you would still use nova APIs that already exist to move the
> >> instances elsewhere.  An 'evacuate' API went in to grizzly for this.
> >>
> >> https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha
> >>
> > 
> > So when any node fails in a Pacemaker cluster, you fence the node, tell OpenStack about the
> > failed node and call evacuate for all the instances. The scheduler will just place them anywhere
> > it pleases.
> > 
> > (there is already a blueprint for evacuate to call the scheduler and even an other for handling a whole node)
> > 
> > So, yeah, maybe that is enough.
> > 
> > I guess I was hoping all machines would be the same. Now I'll need to make clusters. To OpenStack
> > they will still all look the same I guess.
> > 
> > But it will work with existing tested code, that is also important.
> 
> Yeah, it's not really ideal, but it's something that works with existing
> tools.  I thought of a pretty big hole here, though.  We want to
> restrict what a compute node can do as much as possible for security
> reasons.  Clustering them together and allowing them to communicate back
> to the nova API to perform administrative functions (evacuating
> instances) is extremely contrary to that goal.
> 
> In any case, I think the usage of fence-agents is good, but it should be
> something outside of the existing OpenStack components that uses it.
> Compute nodes need to be monitored, but they need to be restricted from
> having any administrative capabilities for security reasons.
> 
> What should perform the monitoring, fencing, evacuating, and what not is
> a bit of a question mark.  I think as a community we are seriously
> lacking good open source cloud infrastructure management tools.
> Companies are developing their own for private use, or as proprietary
> value adds, but we need some open solutions here.
> 

Alex Glikson's suggestion was to use the ZooKeeper code in Nova, which centralizes this.

So security-wise that could be a good start: you don't need anything on the node itself. It is also simpler,
since you don't need to create clusters.

If OpenStack had a central service which handles fencing then that might be enough.

A simple fencing implementation would be that the service would just send an IPMI poweroff request to
the node.

It then sends another IPMI request to check that the current power state is off, marks the node as down and then calls the
scheduler to start the instances somewhere else (evacuate).
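
To make the sequence concrete, here is a minimal, untested-against-real-hardware sketch in Python. It assumes the
central service can reach each node's BMC and shells out to ipmitool ("chassis power off" / "chassis power status").
The function names, the retry parameters and the mark_down / evacuate callbacks are all hypothetical; the callbacks
stand in for whatever Nova calls would mark the host as down and evacuate its instances.

```python
import subprocess
import time


def ipmi_chassis_power(host, user, password, action, run=subprocess.run):
    """Run `ipmitool chassis power <action>` against a node's BMC; return stdout."""
    # Note: passing the password with -P exposes it in the process list;
    # a real service would want something safer. This is only a sketch.
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
           "chassis", "power", action]
    result = run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def fence_and_evacuate(host, user, password, mark_down, evacuate,
                       run=subprocess.run, retries=5, delay=2.0,
                       sleep=time.sleep):
    """Power a node off, confirm it is off, then hand its instances over.

    mark_down and evacuate are hypothetical callbacks supplied by the caller.
    They are only invoked once the BMC reports that the power is off.
    """
    # Step 1: ask the BMC to cut power.
    ipmi_chassis_power(host, user, password, "off", run=run)

    # Step 2: poll the power state until the BMC confirms it is off.
    for _ in range(retries):
        status = ipmi_chassis_power(host, user, password, "status", run=run)
        if status.lower().endswith("is off"):
            break
        sleep(delay)
    else:
        # Never confirmed off: do not evacuate, the instances may still run.
        raise RuntimeError("could not confirm power off for %s" % host)

    # Step 3: only now mark the node as down and restart instances elsewhere.
    mark_down(host)
    evacuate(host)
    return True
```

The important part is the ordering: nothing is evacuated until the power-off has been confirmed, otherwise the same
instance could end up running in two places at once.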

Or am I missing something ?

> -- 
> Russell Bryant
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


