[openstack-dev] [tripleo] Stein blueprint - Plan to remove Keepalived support (replaced by Pacemaker)

Emilien Macchi emilien at redhat.com
Wed Jul 18 20:07:08 UTC 2018


Thanks everyone for this useful feedback (I guess it helps a lot to discuss
before the PTG, so we don't even need to spend too much time on this topic).

1) Everyone agrees that undercloud HA isn't something we are targeting now,
so we won't switch to Pacemaker by default.
2) Pacemaker would still be a good option for multinode/HA standalone
deployments, as it is for the overcloud.
3) Investigate how we could replace keepalived with something that would
handle the VIPs used by HAProxy (see the sketch below for what this
involves).
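
For context, all keepalived really does on the undercloud today is keep the
extra VIPs plumbed on the control plane interface so HAProxy can bind to
them. A rough sketch of what that means (interface name and addresses below
are illustrative only, not necessarily the actual defaults):

    # roughly what keepalived manages today: a VRRP instance holding the VIP
    vrrp_instance undercloud_vip {
        interface br-ctlplane
        state MASTER
        virtual_router_id 51
        priority 101
        virtual_ipaddress {
            192.168.24.2/32    # VIP that HAProxy binds to
        }
    }

    # one possible keepalived-free replacement on a single node:
    # simply plumb the VIP statically (there is no failover to arbitrate)
    ip addr add 192.168.24.2/32 dev br-ctlplane

On a single-node undercloud there is no peer to fail over to, so a
statically configured extra address (e.g. via os-net-config) might be all
we need; that is what the new blueprint is meant to investigate.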

I've abandoned the patches that tested Pacemaker on the undercloud, and
also the patch in tripleoclient for the enable_pacemaker parameter; I don't
think we need it for now, since there is another way to enable Pacemaker
for Standalone (sketched below). I also closed the blueprint:
https://blueprints.launchpad.net/tripleo/+spec/undercloud-pacemaker-default
and created a new one:
https://blueprints.launchpad.net/tripleo/+spec/replace-keepalived-undercloud
Please take a look and let me know what you think.
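
For the Standalone case, something along these lines should do it (a rough
sketch from memory; the exact flags and environment file may differ):

    openstack tripleo deploy --standalone \
      --templates /usr/share/openstack-tripleo-heat-templates \
      ... \
      -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml

so we shouldn't need a dedicated enable_pacemaker flag in tripleoclient.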

This fits well with the Simplicity theme for Stein, and it'll help us
remove services that we don't need anymore.

If you have any feedback on this summary, please go ahead and comment.
Thanks,

On Wed, Jul 18, 2018 at 11:07 AM Dan Prince <dprince at redhat.com> wrote:

> On Tue, 2018-07-17 at 22:00 +0200, Michele Baldessari wrote:
> > Hi Jarda,
> >
> > thanks for these perspectives, this is very valuable!
> >
> > On Tue, Jul 17, 2018 at 06:01:21PM +0200, Jaromir Coufal wrote:
> > > Not rooting for any approach here, I just want to add a few
> > > factors which might play a role when deciding which way to go:
> > >
> > > A) Performance matters; we should be improving the simplicity and
> > > speed of deployments rather than making them heavier. If the
> > > deployment time and resource consumption are not significantly
> > > higher, I think it doesn't cause an issue. But if there is a
> > > significant difference between the PCMK and keepalived
> > > architectures, we would need to review that.
> >
> > +1. If pcmk takes substantially more time, then I agree it's not
> > worth defaulting to it. It's also worth exploring how we could tweak
> > things to make the setup of the cluster a bit faster (on a single
> > node we can lower certain wait operations), but full agreement on
> > this point.
> >
> > > B) Containerization of PCMK plans - eventually we would like to
> > > run the whole undercloud/overcloud on a minimal OS in containers to
> > > keep improving the operations on the nodes (updates/upgrades/etc).
> > > If, because of PCMK, we would be forever stuck on bare metal, it
> > > would be a bit of a pain. As Michele said, maybe we can re-visit
> > > this.
> >
> > So I briefly discussed this in our team, and while it could be
> > re-explored, we need to be very careful about the tradeoffs. This
> > would be another layer which would bring quite a bit of complexity
> > (pcs commands would have to be run inside a container, speed
> > tradeoffs, more limited possibilities when it comes to
> > upgrading/updating, etc.).
> >
> > > C) Unification of undercloud/overcloud is important for us, so +1
> > > to whichever method is being used in both. But from what I know,
> > > the HA folks went with keepalived since it is simpler, so it would
> > > be good to keep in sync (and it's good we have their presence here
> > > actually) :)
> >
> > Right, so to be honest, the choice of keepalived on the undercloud
> > for the VIP predates me and I was not directly involved, so I lack
> > the exact background for that choice (and I could not quickly
> > reconstruct it from git history). But I think it is/was a reasonable
> > choice for what it needs to do, although I probably would have picked
> > just configuring the extra VIPs on the interfaces and had one service
> > less to care about.
> > +1 in general on the unification, with the caveats that have been
> > discussed so far.
>
> I think it was more that we wanted to use HAProxy for SSL termination
> and keepalived is a simple enough way to set this up.
> Instack-Undercloud has used HAProxy/keepalived in this manner for
> years.
>
> I think this came up recently because downstream we did not have a
> keepalived container, so it got a bit of a spotlight as to why we were
> using it. We do have a keepalived RPM and it's worked for years
> already, so as far as single-node/undercloud setups go I think it
> would continue to work fine. Kolla has had and supported the
> keepalived container for a while now as well.
>
> ---
>
> Comments on this thread seem to cover two main themes to me:
> simplification, and the desire to use the same architecture as the
> Overcloud (Pacemaker). And there is some competition between them.
>
> For simplification: if we can eliminate keepalived and still use
> HAProxy (thus keeping the SSL termination features working), then I
> think that would be worth trying. Specifically, can we eliminate
> keepalived without swapping in Pacemaker? Michele: if you have ideas
> here, let's try them!
>
> With regard to Pacemaker, I think we need to make an exception. It
> seems way too heavy for single-node setups and increases the
> complexity there for very little benefit. To me, the shared
> architecture for TripleO is the tools we use to set up services. By
> using t-h-t to drive our setup of the Undercloud and All-In-One
> installers we are already gaining a lot of benefit here. Pacemaker is
> weird in that it kind of augments the architecture a bit (HA
> architecture). But Pacemaker is also a service that gets configured by
> TripleO, so it kind of falls into both categories. Pacemaker gives us
> features we need in the Overcloud at the cost of some extra
> complexity. And on top of all this we are still running the Pacemaker
> processes themselves on bare metal. All this just to say we are
> running the same "architecture" on both the Undercloud and Overcloud?
> I'm not a fan.
>
> Dan
>
>
>
> >
> > > D) Undercloud HA is a nice-to-have which I think we want to get to
> > > one day, but it is not in as high demand as, for example, edge
> > > deployments, BM provisioning with a pure OS, or multiple
> > > environments managed by a single undercloud. So even though
> > > undercloud HA is important, it won't bring operators as many
> > > benefits as the previously mentioned improvements. Let's keep that
> > > in mind when we are considering the amount of work needed for it.
> >
> > +100
> >
> > > E) One of the use cases we want to take into account is expanding
> > > a single-node deployment (all-in-one) to a 3-node HA controller. I
> > > think it is important when evaluating PCMK/keepalived.
> >
> > Right, so to be able to implement this, there is no way around having
> > pacemaker (at least today until we have galera and rabbit).
> > It still does not mean we have to default to it, but if you want to
> > scale beyond one node, then there is no other option atm.
> >
> > > HTH
> >
> > It did, thanks!
> >
> > Michele
> > > — Jarda
> > >
> > > > On Jul 17, 2018, at 05:04, Emilien Macchi <emilien at redhat.com>
> > > > wrote:
> > > >
> > > > Thanks everyone for the feedback, I've made a quick PoC:
> > > > https://review.openstack.org/#/q/topic:bp/undercloud-pacemaker-default
> > > >
> > > > And I'm currently doing local testing. I'll publish results when
> > > > progress is made, but I've made it so we have the choice to
> > > > enable pacemaker (disabled by default); keepalived remains the
> > > > default for now.
> > > >
> > > > On Mon, Jul 16, 2018 at 2:07 PM Michele Baldessari
> > > > <michele at acksyn.org> wrote:
> > > > On Mon, Jul 16, 2018 at 11:48:51AM -0400, Emilien Macchi wrote:
> > > > > On Mon, Jul 16, 2018 at 11:42 AM Dan Prince
> > > > > <dprince at redhat.com> wrote:
> > > > > [...]
> > > > >
> > > > > > The biggest downside IMO is the fact that our Pacemaker
> > > > > > integration is not containerized, nor are there any plans to
> > > > > > finish the containerization of it. Pacemaker currently has to
> > > > > > run on bare metal, and this makes installing it for small
> > > > > > dev/test setups a lot less desirable. It can launch
> > > > > > containers just fine, but the pacemaker installation itself
> > > > > > is what concerns me for the long term.
> > > > > >
> > > > > > Until we have plans for containerizing it, I suppose I would
> > > > > > rather see us keep keepalived as an option for these smaller
> > > > > > setups. We can certainly change our default Undercloud to use
> > > > > > Pacemaker (if we choose to do so). But having keepalived
> > > > > > around for "lightweight" (zero or low footprint) installs
> > > > > > that work is really quite desirable.
> > > > > >
> > > > >
> > > > > That's a good point, and I agree with your proposal.
> > > > > Michele, what's the long term plan regarding containerized
> > > > > pacemaker?
> > > >
> > > > Well, we kind of started evaluating it (there was definitely not
> > > > enough time around pike/queens as we were busy landing the
> > > > bundles code), then due to discussions around k8s it kind of fell
> > > > off our radar. We can at least resume the discussions around it
> > > > and see how much effort it would be. I'll bring it up with my
> > > > team and get back to you.
> > > >
> > > > cheers,
> > > > Michele
> > > > --
> > > > Michele Baldessari            <michele at acksyn.org>
> > > > C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
> > > >
> > > > --
> > > > Emilien Macchi


-- 
Emilien Macchi