[openstack-dev] [tripleo] Stein blueprint - Plan to remove Keepalived support (replaced by Pacemaker)

Ben Nemec openstack at nemebean.com
Tue Jul 17 21:20:13 UTC 2018



On 07/17/2018 03:00 PM, Michele Baldessari wrote:
> Hi Jarda,
> 
> thanks for these perspectives, this is very valuable!
> 
> On Tue, Jul 17, 2018 at 06:01:21PM +0200, Jaromir Coufal wrote:
>> Not rooting for any approach here, just want to add a few factors which might play a role when deciding which way to go:
>>
>> A) Performance matters; we should be improving the simplicity and speed
>> of deployments rather than making them heavier. If the deployment time
>> and resource consumption are not significantly higher, I think it
>> doesn't cause an issue. But if there is a significant difference between
>> PCMK and the keepalived architecture, we would need to review that.
> 
> +1. Should pcmk take substantially more time, then I agree it is not
> worth defaulting to it. It is also worth exploring how we could tweak
> things to make the setup of the cluster a bit faster (on a single node
> we can lower certain wait operations), but full agreement on this point.
> 
>> B) Containerization of PCMK plans - eventually we would like to run
>> the whole undercloud/overcloud on a minimal OS in containers to keep
>> improving the operations on the nodes (updates/upgrades/etc.). If,
>> because of PCMK, we were forever stuck on bare metal, it would be a bit
>> of a pain. As Michele said, maybe we can re-visit this.
> 
> So I briefly discussed this with our team, and while it could be
> re-explored, we need to be very careful about the tradeoffs.
> This would be another layer which would bring quite a bit of complexity
> (pcs commands would have to be run inside a container, speed tradeoffs,
> more limited possibilities when it comes to upgrading/updating, etc.).
> 
>> C) Unification of undercloud/overcloud is important for us, so +1 to
>> whichever method is being used in both. But from what I know, the HA
>> folks went with keepalived since it is simpler, so it would be good to
>> keep in sync (and it is good we have their presence here, actually) :)
> 
> Right, so to be honest, the choice of keepalived on the undercloud for
> the VIPs predates me and I was not directly involved, so I lack the
> exact background for that choice (and I could not quickly reconstruct it
> from the git history). But I think it is/was a reasonable choice for
> what it needs to do, although I probably would have picked just
> configuring the extra VIPs on the interfaces and having one service less
> to care about.
> +1 in general on the unification, with the caveats that have been
> discussed so far.

The only reason there even are VIPs on the undercloud is that we wanted 
SSL support, and we implemented that through the same haproxy puppet 
manifest as the overcloud, which required VIPs.  Keepalived happened to 
be what that manifest was using to provide VIPs at the time, so that's 
what we ended up with.  There wasn't a conscious decision to use 
keepalived over anything else.
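
For concreteness, a rough sketch of the three approaches being compared 
(the 192.168.24.2 address and the br-ctlplane interface are illustrative 
ctlplane-style defaults, not taken from any particular deployment):

  # keepalived: VRRP instance in /etc/keepalived/keepalived.conf
  vrrp_instance undercloud_vip {
      state MASTER
      interface br-ctlplane
      virtual_router_id 51
      priority 100
      virtual_ipaddress {
          192.168.24.2/32 dev br-ctlplane
      }
  }

  # "one service less": just put the VIP on the interface directly
  ip addr add 192.168.24.2/32 dev br-ctlplane

  # pacemaker: manage the same VIP as a cluster resource
  pcs resource create undercloud-vip ocf:heartbeat:IPaddr2 \
      ip=192.168.24.2 cidr_netmask=32 op monitor interval=10s

On a single node all three end up doing essentially the same thing; the 
differences only start to matter once there is a second node to fail 
over to.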

> 
>> D) Undercloud HA is a nice-to-have which I think we want to get to one
>> day, but it is not in as big demand as, for example, edge deployments,
>> BM provisioning with a pure OS, or multiple environments managed by a
>> single undercloud. So even though undercloud HA is important, it won't
>> bring operators as many benefits as the previously mentioned
>> improvements. Let's keep it in mind when we are considering the amount
>> of work needed for it.
> 
> +100

I'm still of the opinion that undercloud HA shouldn't be a thing.  It 
brings with it a whole host of problems, and I have yet to hear a 
realistic use case that actually requires it.  We were quite careful to 
make sure that the overcloud can continue to run indefinitely while the 
undercloud is down.

*Maybe* sometime in the future, when those other features are 
implemented, it will make more sense, but I don't think it does right now.

> 
>> E) One of the use-cases we want to take into account is expanding a
>> single-node deployment (all-in-one) to a 3-node HA controller. I think
>> it is important when evaluating PCMK/keepalived.
> 
> Right, so to be able to implement this, there is no way around having
> pacemaker (at least today, until we have galera and rabbit).
> It still does not mean we have to default to it, but if you want to
> scale beyond one node, then there is no other option at the moment.
> 
>> HTH
> 
> It did, thanks!
> 
> Michele
>> — Jarda
>>
>>> On Jul 17, 2018, at 05:04, Emilien Macchi <emilien at redhat.com> wrote:
>>>
>>> Thanks everyone for the feedback, I've made a quick PoC:
>>> https://review.openstack.org/#/q/topic:bp/undercloud-pacemaker-default
>>>
>>> And I'm currently doing local testing. I'll publish results as progress is made, but I've made it so we have the choice to enable pacemaker (disabled by default), with keepalived remaining the default for now.
>>>
>>> On Mon, Jul 16, 2018 at 2:07 PM Michele Baldessari <michele at acksyn.org> wrote:
>>> On Mon, Jul 16, 2018 at 11:48:51AM -0400, Emilien Macchi wrote:
>>>> On Mon, Jul 16, 2018 at 11:42 AM Dan Prince <dprince at redhat.com> wrote:
>>>> [...]
>>>>
>>>>> The biggest downside IMO is the fact that our Pacemaker integration is
>>>>> not containerized. Nor are there any plans to finish the
>>>>> containerization of it. Pacemaker currently has to run on bare metal,
>>>>> and this makes installing it for small dev/test setups a lot
>>>>> less desirable. It can launch containers just fine, but the pacemaker
>>>>> installation itself is what concerns me for the long term.
>>>>>
>>>>> Until we have plans for containerizing it, I suppose I would rather
>>>>> see us keep keepalived as an option for these smaller setups. We can
>>>>> certainly change our default undercloud to use Pacemaker (if we choose
>>>>> to do so). But having keepalived around for "lightweight" (zero or low
>>>>> footprint) installs that work is really quite desirable.
>>>>>
>>>>
>>>> That's a good point, and I agree with your proposal.
>>>> Michele, what's the long-term plan regarding containerized pacemaker?
>>>
>>> Well, we kind of started evaluating it (there was definitely not enough
>>> time around pike/queens as we were busy landing the bundles code); then,
>>> due to the discussions around k8s, it kind of fell off our radar. We can
>>> at least resume the discussions around it and see how much effort it
>>> would be. I'll bring it up with my team and get back to you.
>>>
>>> cheers,
>>> Michele
>>> -- 
>>> Michele Baldessari            <michele at acksyn.org>
>>> C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
>>>
>>> -- 
>>> Emilien Macchi
>>
> 


