[neutron] Neutron L3 HA Issue: Packet Loss and Route Flapping After Restarting openvswitch-vswitchd on Active Controller
Hi OpenStack Community,

I am experiencing an issue in a 3-controller HA OpenStack environment related to Neutron L3 HA and Open vSwitch. I would appreciate your guidance or any insights.

### Environment

- OpenStack version: 2025.1
- 3 controller nodes with HA setup (kolla-ansible)
- Neutron L3 HA enabled
- OVS + Keepalived (VRRP)
- The router’s VRRP master role is currently on one controller
- All controllers share the same external/provider network setup

### Issue Description

When I restart the `openvswitch-vswitchd` service on the controller node that currently holds the VRRP master role for an HA router, the following problems occur:

1. ICMP traffic (ping) drops for several minutes.
2. The router’s default route begins to bounce between the other two or three controllers.
3. The VRRP master does not stabilize quickly; it keeps flipping between the remaining controllers.
4. As a result, I see packet loss and routing instability for several seconds.

### What I expect

Ideally, when `openvswitch-vswitchd` restarts on the master controller:

- VRRP should quickly fail over to the router instance on one of the standby controllers
- The default route should converge and remain stable
- There should be no (or minimal) packet loss

I should also mention that when I restart `neutron_l3_agent`, or in some cases `openvswitch_db`, the issue goes away.

### Additional Information

If needed, I can share:

- Logs from OVS, neutron-l3-agent

Thanks in advance for your support. Please let me know if more information is needed.

Best regards,
soheil
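To put numbers on "does not stabilize quickly" for a bug report, the flapping could be summarized along the lines below. This is only a minimal sketch: the `Sample` type, the function names, and the 30-second hold window are hypothetical, and it assumes you have already collected timestamped observations of which controller reports the router as active (for example by polling `openstack network agent list --router <router> --long` and reading the HA state column, which is an assumption about your tooling, not something taken from this environment).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sample:
    """One observation of the HA router (hypothetical record layout)."""
    t: float                    # seconds since openvswitch-vswitchd was restarted
    active_host: Optional[str]  # controller reporting the router as active, or None


def count_flaps(samples: List[Sample]) -> int:
    """Count how often the active controller changes between observations.

    Samples with no active controller are skipped, so a gap followed by the
    same controller coming back is not counted as a change.
    """
    flaps = 0
    previous = None
    for s in samples:
        if s.active_host is None:
            continue
        if previous is not None and s.active_host != previous:
            flaps += 1
        previous = s.active_host
    return flaps


def time_to_stable(samples: List[Sample], hold_seconds: float = 30.0) -> Optional[float]:
    """Return the start time of the first period in which one controller
    stays active for at least hold_seconds, or None if that never happens."""
    run_host = None
    run_start = None
    for s in samples:
        if s.active_host is None:
            # No active controller observed: any ongoing stable run is broken.
            run_host, run_start = None, None
            continue
        if s.active_host != run_host:
            # A different controller took over: start a new run here.
            run_host, run_start = s.active_host, s.t
        if s.t - run_start >= hold_seconds:
            return run_start
    return None


# Made-up observations, only to show the shape of the input.
observations = [
    Sample(0, "ctl1"), Sample(5, None), Sample(10, "ctl2"), Sample(15, "ctl3"),
    Sample(20, "ctl2"), Sample(25, "ctl2"), Sample(50, "ctl2"), Sample(60, "ctl2"),
]
flaps = count_flaps(observations)
stable_at = time_to_stable(observations, hold_seconds=30.0)
```

Reporting the number of master changes and the time until one controller holds the role steadily, together with the OVS and neutron-l3-agent logs for the same window, would make the behavior easier to compare across runs.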
Hi community, I’m experiencing the same issue. Is this expected behavior? Could someone please clarify?
Hi Soheil,

I can't say I've seen this exact issue, but based on your description it looks to be something with the OVS flow table. The best thing to do is file a bug [0] and include as much info as possible so someone can take a look.

Thanks,
-Brian

[0] https://bugs.launchpad.net/neutron/+filebug

On 12/6/25 8:49 AM, soheil.bakan@gmail.com wrote:
Participants (3): Brian Haley, hamid.lotfi@gmail.com, soheil.bakan@gmail.com