<div>                What's the recommended method for rebooting controllers? Do we need to use the "remove from cluster" and "add to cluster" procedures, or is there a better way?<br><br>https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html<br>            </div>            <div class="yahoo_quoted" style="margin:10px 0px 0px 0.8ex;border-left:1px solid #ccc;padding-left:1ex;">                        <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">                                <div>                    On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden &lt;ozzzo@yahoo.com&gt; wrote:                </div>                <div><br></div>                <div><br></div>                <div><div id="yiv5003265733"><div><div>                We use keepalived and exabgp to manage failover for haproxy. That works, but it takes a few minutes, and during those few minutes customers experience impact. We tell them not to build or delete VMs during patching, but they still do, and then complain about the failures.<br clear="none"><br clear="none">We're planning to experiment with adding a "manual" haproxy failover to our patching automation, but I'm wondering if there is anything on the controller that needs to be failed over or disabled before rebooting the KVM. 
I looked at the "remove from cluster" and "add to cluster" procedures, but that seems unnecessarily cumbersome for rebooting the KVM.<br clear="none">            </div>            <div id="yiv5003265733yqt93307" class="yiv5003265733yqt2871447346"><div style="margin:10px 0px 0px 0.8ex;border-left:1px solid #ccc;padding-left:1ex;" class="yiv5003265733yahoo_quoted">                        <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">                                <div>                    On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block &lt;eblock@nde.ag&gt; wrote:                </div>                <div><br clear="none"></div>                <div><br clear="none"></div>                <div>Hi Albert,<br clear="none"><br clear="none">how is your haproxy placement controlled, something like pacemaker or  <br clear="none">similar? I would always do a failover when I'm aware of interruptions  <br clear="none">(maintenance window), that should speed things up for clients. We have  <br clear="none">a pacemaker controlled HA control plane, and it takes more time until  <br clear="none">pacemaker realizes that the resource is gone if I just reboot a  <br clear="none">server without failing over. I have no benchmarks though. There's  <br clear="none">always a risk of losing a couple of requests during the failover but  <br clear="none">we haven't had complaints yet, I believe most of the components try to  <br clear="none">resend the lost messages. In one of our customers' clusters with many  <br clear="none">resources (they also use terraform) I haven't seen issues during a  <br clear="none">regular maintenance window. 
When they had a DNS outage a few months  <br clear="none">back it resulted in a mess, manual cleaning was necessary, but the  <br clear="none">regular failovers seem to work just fine.<br clear="none">And I don't see rabbitmq issues either after rebooting a server,  <br clear="none">usually the haproxy (and virtual IP) failover suffices to prevent  <br clear="none">interruptions.<br clear="none"><br clear="none">Regards,<br clear="none">Eugen<br clear="none"><br clear="none">Quoting Satish Patel &lt;<a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:satish.txt@gmail.com" target="_blank" href="mailto:satish.txt@gmail.com">satish.txt@gmail.com</a>&gt;:<br clear="none"><br clear="none">> Are you running your stack on top of the kvm virtual machine? How many<br clear="none">> controller nodes do you have? mostly rabbitMQ causing issues if you restart<br clear="none">> controller nodes.<br clear="none">><br clear="none">> On Thu, May 11, 2023 at 8:34 AM Albert Braden &lt;<a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:ozzzo@yahoo.com" target="_blank" href="mailto:ozzzo@yahoo.com">ozzzo@yahoo.com</a>&gt; wrote:<br clear="none">><br clear="none">>> We have our haproxy and controller nodes on KVM hosts. When those KVM<br clear="none">>> hosts are restarted, customers who are building or deleting VMs see impact.<br clear="none">>> VMs may go into error status, fail to get DNS records, fail to delete, etc.<br clear="none">>> The obvious reason is because traffic that is being routed to the haproxy<br clear="none">>> on the restarting KVM is lost. 
If we manually fail over haproxy before<br clear="none">>> restarting the KVM, will that be sufficient to stop traffic being lost, or<br clear="none">>> do we also need to do something with the controller?<br clear="none">>><br clear="none">>><br clear="none"><br clear="none"><br clear="none"><br clear="none"><br clear="none"></div>            </div>                </div></div></div></div></div>            </div>                </div>