[openstack-dev] [tripleo] Stein blueprint - Plan to remove Keepalived support (replaced by Pacemaker)

Michele Baldessari michele at acksyn.org
Wed Jul 18 20:36:23 UTC 2018


On Wed, Jul 18, 2018 at 11:07:04AM -0400, Dan Prince wrote:
> On Tue, 2018-07-17 at 22:00 +0200, Michele Baldessari wrote:
> > Hi Jarda,
> > 
> > thanks for these perspectives, this is very valuable!
> > 
> > On Tue, Jul 17, 2018 at 06:01:21PM +0200, Jaromir Coufal wrote:
> > > Not rooting for any approach here, just want to add a bit of
> > > factors which might play a role when deciding which way to go:
> > > 
> > > A) Performance matters, we should be improving simplicity and speed of
> > > deployments rather than making it heavier. If the deployment time and
> > > resource consumption is not significantly higher, I think it doesn’t
> > > cause an issue. But if there is a significant difference between PCMK
> > > and keepalived architecture, we would need to review that.
> > 
> > +1 Should the pcmk take substantially more time then I agree, not worth
> > defaulting to it. Worth also exploring how we could tweak things
> > to make the setup of the cluster a bit faster (on a single node we can
> > lower certain wait operations) but full agreement on this point.
> > 
> > > B) Containerization of PCMK plans - eventually we would like to run
> > > the whole undercloud/overcloud on minimal OS in containers to keep
> > > improving the operations on the nodes (updates/upgrades/etc). If
> > > because of PCMK we would be forever stuck on BM, it would be a bit of a
> > > pita. As Michele said, maybe we can re-visit this.
> > 
> > So I briefly discussed this in our team, and while it could be
> > re-explored, we need to be very careful about the tradeoffs.
> > This would be another layer which would bring quite a bit of complexity
> > (pcs commands would have to be run inside a container, speed tradeoffs,
> > more limited possibilities when it comes to upgrading/updating, etc.)
> > 
> > > C) Unification of undercloud/overcloud is important for us, so +1 to
> > > whichever method is being used in both. But from what I know, HA folks
> > > went to keepalived since it is simpler so would be good to keep in sync
> > > (and good we have their presence here actually) :)
> > 
> > Right so to be honest, the choice of keepalived on the undercloud for
> > VIP predates me and I was not directly involved, so I lack the exact
> > background for that choice (and I could not quickly reconstruct it from git
> > history). But I think it is/was a reasonable choice for what it needs
> > doing, although I probably would have picked just configuring the extra
> > VIPs on the interfaces and have one service less to care about.
> > +1 in general on the unification, with the caveats that have been
> > discussed so far.
> 
> I think it was more that we wanted to use HAProxy for SSL
> termination and keepalived is a simple enough way to set this up.
> Instack-Undercloud has used HAProxy/keepalived for years in this
> manner.
> 
> I think this came up recently because downstream we did not have a
> keepalived container. So it got a bit of spotlight on it as to why we
> were using it. We do have a keepalived RPM and it's worked as it has for
> years already so as far as single node/undercloud setups go I think it
> would continue to work fine. Kolla has had and supports the keepalived
> container for a while now as well.
> 
> ---
> 
> Comments on this thread seem to cover 2 main themes to me.
> Simplification and the desire to use the same architecture as the
> Overcloud (Pacemaker). And there is some competition between them.
> 
> For simplification: If we can eliminate keepalived and still use
> HAProxy (thus keeping the SSL termination features working) then I
> think that would be worth trying. Specifically can we eliminate
> Keepalived without swapping in Pacemaker? Michele: if you have ideas
> here let's try them!

I don't think it makes a lot of sense to just move to native IPs on
interfaces just to remove keepalived. At least I don't see a good
trade-off. If it has worked so far, I'd say let's just keep it
(unless there are compelling arguments to remove it, of course)

> With regards to Pacemaker I think we need to make an exception. It
> seems way too heavy for single node setups and increases the complexity
> there for very little benefit. 
> To me the shared architecture for
> TripleO is the tools we use to setup services. By using t-h-t to drive
> our setup of the Undercloud and All-In-One installers we are already
> gaining a lot of benefit here. Pacemaker is weird as it kind of
> augments the architecture a bit (HA architecture). But Pacemaker is
> also a service that gets configured by TripleO. So it kind of falls
> into both categories. Pacemaker gives us features we need in the
> Overcloud at the cost of some extra complexity. And in addition to all
> this we are still running the Pacemaker processes themselves on
> baremetal. All this just to say we are running the same "architecture"
> on both the Undercloud and Overcloud? I'm not a fan.

Fully agreed on the extra complexity, I think it is a matter of
trade-offs. The only use case mentioned by Jarda where I don't think we
can realistically get away without pcmk, is E). If we care enough about
that we should allow it to be configured in the undercloud/all-in-one
(maybe not as a default?), if we do not care about that use case (or we
come up with some other clever ideas on how to achieve it), then that
is one item off the list.

Besides E), I think a reasonable use case is to be able to have a small
all-in-one installation that mimics a more "real-world" overcloud.
I think there is a bit of value in that, as long as the code to make it
happen is not horribly huge and complex (and I was under the impression
from Emilien's patchset that this is not the case)

After this discussion, my personal take is that offering pcmk as an
option (disabled by default) is something we should at least consider,
but I also won't be all too sad if we decide not to do it (it is always
something we can easily revisit later after all ;)


> 
> 
> > 
> > > D) Undercloud HA is a nice-to-have which I think we want to get to one
> > > day, but it is not in as big demand as for example edge deployments,
> > > BM provisioning with pure OS, or multiple envs managed by single
> > > undercloud. So even though undercloud HA is important, it won’t bring
> > > operators as many benefits as the previously mentioned improvements.
> > > Let’s keep it in mind when we are considering the amount of work
> > > needed for it.
> > 
> > +100
> > 
> > > E) One of the use-cases we want to take into account is expanding a
> > > single-node deployment (all-in-one) to 3 node HA controller. I think
> > > it is important when evaluating PCMK/keepalived.
> > 
> > Right, so to be able to implement this, there is no way around having
> > pacemaker (at least today until we have galera and rabbit).
> > It still does not mean we have to default to it, but if you want to
> > scale beyond one node, then there is no other option atm.
> > 
> > > HTH
> > 
> > It did, thanks!
> > 
> > Michele
> > > — Jarda
> > > 
> > > > > On Jul 17, 2018, at 05:04, Emilien Macchi <emilien at redhat.com> wrote:
> > > > 
> > > > Thanks everyone for the feedback, I've made a quick PoC:
> > > > > https://review.openstack.org/#/q/topic:bp/undercloud-pacemaker-default
> > > > 
> > > > And I'm currently doing local testing. I'll publish results when
> > > > progress is made, but I've made it so we have the choice to
> > > > enable pacemaker (disabled by default), where keepalived would
> > > > remain the default for now.
> > > > 
> > > > > On Mon, Jul 16, 2018 at 2:07 PM Michele Baldessari <michele at acksyn.org> wrote:
> > > > On Mon, Jul 16, 2018 at 11:48:51AM -0400, Emilien Macchi wrote:
> > > > > > On Mon, Jul 16, 2018 at 11:42 AM Dan Prince <dprince at redhat.com> wrote:
> > > > > [...]
> > > > > 
> > > > > > > The biggest downside IMO is the fact that our Pacemaker integration is
> > > > > > > not containerized. Nor are there any plans to finish the
> > > > > > > containerization of it. Pacemaker has to currently run on baremetal
> > > > > > > and this makes the installation of it for small dev/test setups a lot
> > > > > > > less desirable. It can launch containers just fine but the pacemaker
> > > > > > > installation itself is what concerns me for the long term.
> > > > > > 
> > > > > > > Until we have plans for containerizing it I suppose I would rather see
> > > > > > > us keep keepalived as an option for these smaller setups. We can
> > > > > > > certainly change our default Undercloud to use Pacemaker (if we choose
> > > > > > > to do so). But having keepalived around for "lightweight" (zero or low
> > > > > > > footprint) installs that work is really quite desirable.
> > > > > > 
> > > > > 
> > > > > That's a good point, and I agree with your proposal.
> > > > > > Michele, what's the long term plan regarding containerized pacemaker?
> > > > 
> > > > > Well, we kind of started evaluating it (there was definitely not enough
> > > > > time around pike/queens as we were busy landing the bundles code), then
> > > > > due to discussions around k8s it kind of got off our radar. We can
> > > > > at least resume the discussions around it and see how much effort it
> > > > > would be. I'll bring it up with my team and get back to you.
> > > > 
> > > > cheers,
> > > > Michele
> > > > -- 
> > > > Michele Baldessari            <michele at acksyn.org>
> > > > C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
> > > > 
> > > > > __________________________________________________________________________
> > > > > OpenStack Development Mailing List (not for usage questions)
> > > > > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > > > 
> > > > 
> > > > -- 
> > > > Emilien Macchi
> > > 
> > > 
> > 
> > 
> 

-- 
Michele Baldessari            <michele at acksyn.org>
C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D


