[openstack-dev] [Nova] Automatic evacuate

David Vossel dvossel at redhat.com
Wed Oct 22 20:05:30 UTC 2014



----- Original Message -----
> On 10/21/2014 07:53 PM, David Vossel wrote:
> >
> > ----- Original Message -----
> >>> -----Original Message-----
> >>> From: Russell Bryant [mailto:rbryant at redhat.com]
> >>> Sent: October 21, 2014 15:07
> >>> To: openstack-dev at lists.openstack.org
> >>> Subject: Re: [openstack-dev] [Nova] Automatic evacuate
> >>>
> >>> On 10/21/2014 06:44 AM, Balázs Gibizer wrote:
> >>>> Hi,
> >>>>
> >>>> Sorry for the top posting, but it was hard to fit my complete view
> >>>> inline.
> >>>>
> >>>> I'm also thinking about a possible solution for automatic server
> >>>> evacuation. I see two separate sub-problems here:
> >>>> 1) compute node monitoring and fencing, 2) automatic server evacuation.
> >>>>
> >>>> Compute node monitoring is currently implemented in the servicegroup
> >>>> module of nova. As far as I understand, pacemaker is the proposed
> >>>> solution in this thread to solve both monitoring and fencing, but we
> >>>> tried and found out that pacemaker_remote on baremetal does not work
> >>>> together with fencing (yet), see [1]. So if we need fencing, then
> >>>> either we go for normal pacemaker instead of pacemaker_remote, which
> >>>> doesn't scale, or we configure and call stonith directly when
> >>>> pacemaker detects the compute node failure.
> >>> I didn't get the same conclusion from the link you reference.  It says:
> >>>
> >>> "That is not to say however that fencing of a baremetal node works any
> >>> differently than that of a normal cluster-node. The Pacemaker policy
> >>> engine
> >>> understands how to fence baremetal remote-nodes. As long as a fencing
> >>> device exists, the cluster is capable of ensuring baremetal nodes are
> >>> fenced
> >>> in the exact same way as normal cluster-nodes are fenced."
> >>>
> >>> So, it sounds to me like the core pacemaker cluster can fence the node.
> >>> I CC'd David Vossel, a pacemaker developer, to see if he can help
> >>> clarify.
> >> It seems there is a contradiction between chapters 1.5 and 7.2 in [1],
> >> as 7.2 states:
> >> " There are some complications involved with understanding a bare-metal
> >> node's state that virtual nodes don't have. Once this logic is complete,
> >> pacemaker will be able to integrate bare-metal nodes in the same way
> >> virtual remote-nodes currently are. Some special considerations for
> >> fencing will need to be addressed. "
> >> Let's wait for David's statement on this.
> > Hey, that's me!
> >
> > I can definitely clear all this up.
> >
> > First off, this document is out of sync with the current state upstream.
> > We're already past Pacemaker v1.1.12 upstream. Section 7.2 of the
> > referenced document still describes v1.1.11 features as future work.
> >
> > I'll make it simple. If the document references anything that needs to be
> > done in the future, it's already done. Pacemaker remote is feature
> > complete at this point. I've accomplished everything I originally set out
> > to do. I see one change, though. In 7.1 I talk about wanting pacemaker to
> > be able to manage resources in containers. I mention something about
> > libvirt sandbox. I scrapped whatever I was doing there. Pacemaker now has
> > docker support:
> > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker
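> >
> > To give a flavor of that agent, starting a container under pacemaker's
> > control looks something like this (a minimal sketch; the resource name
> > and image are just illustrations):
> >
> > # pcs resource create web-ctr ocf:heartbeat:docker image=nginx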
> >
> > I've known this document is out of date. It's on my giant list of things
> > to do.
> > Sorry for any confusion.
> >
> > As far as pacemaker remote and fencing goes, remote-nodes are fenced the
> > exact same way as cluster-nodes. The only consideration is that the
> > cluster-nodes (nodes running the full pacemaker+corosync stack) are the
> > only nodes allowed to initiate fencing. All you have to do is make sure
> > the fencing devices you want to use to fence remote-nodes are accessible
> > to the cluster-nodes. From there you are good to go.
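> >
> > For example, a single IPMI fencing device covering a remote compute node
> > could look like this (a rough sketch; the hostname, address, and
> > credentials are all made up):
> >
> > # pcs stonith create fence-compute-1 fence_ipmilan \
> >     pcmk_host_list="compute-1" ipaddr="10.0.0.1" \
> >     login="admin" passwd="secret"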
> >
> > Let me know if there's anything else I can clear up. Pacemaker remote was
> > designed to be the solution for the exact scenario you all are discussing
> > here. Compute nodes and pacemaker remote are made for one another :D
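> >
> > To give you an idea of how little is involved, turning a baremetal
> > compute node into a remote-node is a single resource definition (the
> > hostname here is made up, and pacemaker_remote must already be running
> > on the node):
> >
> > # pcs resource create compute-1 ocf:pacemaker:remote \
> >     server=compute-1.example.com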
> >
> > If anyone is interested in prototyping pacemaker remote for this compute
> > node use case, make sure to include me. I have done quite a bit of
> > research into how to maximize pacemaker's ability to scale horizontally.
> > As part of that research I've made a few changes, directly related to all
> > of this, that are not yet in an official pacemaker release. Come to me
> > for the latest rpms and you'll have a less painful experience setting all
> > this up :)
> >
> > -- Vossel
> >
> >
> Hi Vossel,
> 
> Could you send us a link to the source RPMs, please? We have tested on
> CentOS 7, and it might need a recompile.

Yes, CentOS 7.0 isn't going to have the rpms you need to test this.

There are a couple of things you can do.

1. I put the rhel7-related rpms I test with in this repo:
http://davidvossel.com/repo/os/el7/

*disclaimer* I only maintain this repo for myself. I'm not committed to keeping
it active or up-to-date. It just happens to be updated right now for my own use.

That will give you test rpms for the pacemaker version I'm currently using
plus the latest libqb. If you're going to do any sort of performance testing
you'll need the latest libqb, v0.17.1.
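
If you want to pull from that repo with yum, a repo file along these lines
works (the repo id and file name are whatever you like):

# cat /etc/yum.repos.d/dvossel.repo
[dvossel]
name=dvossel test pacemaker builds
baseurl=http://davidvossel.com/repo/os/el7/
enabled=1
gpgcheck=0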

2. Build the srpm from the latest code on github. Right now master is
relatively stable.

# git clone https://github.com/ClusterLabs/pacemaker.git
# cd pacemaker
# make srpm
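
From there, the usual workflow turns the srpm into binary rpms (the version
glob below is just illustrative):

# yum-builddep pacemaker-*.src.rpm
# rpmbuild --rebuild pacemaker-*.src.rpm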

The future RHEL 7.1 release will have all the relevant pacemaker_remote updates.

-- Vossel

> 
> Thank you!
> 
> Geza
> >> Cheers,
> >> Gibi
> >>
> >>> --
> >>> Russell Bryant
> >>>


