Neutron + OVN raft cluster

Mohammed Naser mnaser at vexxhost.com
Fri May 6 21:21:42 UTC 2022


Hi Tiago,

Have you seen this?

https://bugs.launchpad.net/nova/+bug/1969592

Mohammed

On Fri, May 6, 2022 at 3:56 PM Tiago Pires <tiagohp at gmail.com> wrote:
>
> Hi all,
>
> I was checking the mailing list history, and this thread about Raft ovsdb clustering caught my attention: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-March/046438.html
> In my setup (OVN 20.03 and OpenStack Ussuri), the ovn-controller is configured with ovn-remote="tcp:10.2X.4X.4:6642,tcp:10.2X.4X.68:6642,tcp:10.2X.4X.132:6642", pointing at the 3 OVN central members, which run in cluster mode.
> Also on the neutron ML2 side:
> [ovn]
> ovn_native_dhcp = True
> ovn_nb_connection = tcp:10.2X.4X.4:6641,tcp:10.2X.4X.68:6641,tcp:10.2X.4X.132:6641
> ovn_sb_connection = tcp:10.2X.4X.4:6642,tcp:10.2X.4X.68:6642,tcp:10.2X.4X.132:6642
>
> We are experiencing an issue with Neutron when the OVN leader decides to take a snapshot and, by design, another member becomes leader (roughly every 8 minutes):
> 2022-05-05T16:57:42.135Z|17401|raft|INFO|Transferring leadership to write a snapshot.
>
> ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
> 4a03
> Name: OVN_Southbound
> Cluster ID: ca74 (ca744caf-40cd-4751-a2f2-86e35ad6541c)
> Server ID: 4a03 (4a0328dc-e9a4-495e-a4f1-0a0340fc6d19)
> Address: tcp:10.2X.4X.132:6644
> Status: cluster member
> Role: leader
> Term: 1912
> Leader: self
> Vote: self
>
> Election timer: 10000
> Log: [497643, 498261]
> Entries not yet committed: 0
> Entries not yet applied: 0
> Connections: ->3d6c ->4ef0 <-3d6c <-4ef0
> Servers:
>     4a03 (4a03 at tcp:10.2X.4X.132:6644) (self) next_index=497874 match_index=498260
>     3d6c (3d6c at tcp:10.2X.4X.68:6644) next_index=498261 match_index=498260
>     4ef0 (4ef0 at tcp:10.2X.4X.4:6644) next_index=498261 match_index=498260
>
> As I understand it, the TCP connections from Neutron (NB) and the ovn-controllers (SB) to OVN central are established only with the leader:
>
> #OVN central leader
> $ netstat -nap | grep 6642| more
>
> tcp        0      0 0.0.0.0:6642            0.0.0.0:*               LISTEN      -
> tcp        0      0 10.2X.4X.132:6642       10.24.40.17:47278       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.24.40.76:36240       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.17:47280       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.6:43102        ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.75:58890       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.6:43108        ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.17:47142       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.71:48808       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.17:47096       ESTABLISHED -
> #OVN follower 2
>
> $ netstat -nap | grep 6642
>
> tcp        0      0 0.0.0.0:6642            0.0.0.0:*               LISTEN      -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.76:57256       ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.134:54026      ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.10:34962       ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.6:49238        ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.135:59972      ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.75:40162       ESTABLISHED -
> tcp        0      0 10.2X.4X.4:39566        10.2X.4X.132:6642       ESTABLISHED -
> #OVN follower 3
>
> $ netstat -nap | grep 6642
>
> tcp        0      0 0.0.0.0:6642            0.0.0.0:*               LISTEN      -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.70:40750       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.11:49718       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:45632       10.2X.4X.132:6642       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.16:44816       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.7:45216        ESTABLISHED
>
> The issue we are experiencing is that neutron-server disconnects when the OVN leader changes (due to a snapshot, roughly every 8 minutes) and then reconnects to the next leader. This breaks the OpenStack API when someone is trying to create a VM at the same time.
> First, is my current configuration correct? Should a leader change break the Neutron side? Or is some configuration missing?
> I was wondering whether it is possible to put a load balancer with a VIP in front of the OVN central members, so that the VIP balances the connections, and then configure only the VIP on the Neutron side and on the ovn-controllers. Does that make sense?
>
> Thank you.
>
> Regards,
>
> Tiago Pires

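For correlating the Neutron disconnects with leadership transfers, here is a minimal sketch (an illustration only, not taken from the bug report) that reads the Role, Term, Leader and Server ID fields from the cluster/status output quoted above. The control socket path and database name are the ones from the quoted command; the function names are hypothetical, and the parser assumes the output keeps the "Key: value" layout shown in the mail.

```python
import re
import subprocess

# Fields of interest, as they appear in the quoted cluster/status output.
FIELD_RE = re.compile(r"^(Role|Term|Leader|Server ID):\s*(.+)$")


def parse_cluster_status(text):
    """Return a dict with the Role, Term, Leader and Server ID fields."""
    fields = {}
    for line in text.splitlines():
        # Tolerate a leading "> " quote marker, as in the mail above.
        line = line.lstrip("> ").strip()
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def read_cluster_status(ctl="/var/run/ovn/ovnsb_db.ctl", db="OVN_Southbound"):
    """Run the ovs-appctl command from the mail and return its raw output."""
    result = subprocess.run(
        ["ovs-appctl", "-t", ctl, "cluster/status", db],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Excerpt of the output quoted above, used here instead of a live cluster.
SAMPLE = """\
Name: OVN_Southbound
Server ID: 4a03 (4a0328dc-e9a4-495e-a4f1-0a0340fc6d19)
Role: leader
Term: 1912
Leader: self
"""

if __name__ == "__main__":
    status = parse_cluster_status(SAMPLE)
    print(status["Role"], status["Term"])
```

Polling read_cluster_status() on each central member and logging whenever Term or Role changes would give timestamps to compare against the neutron-server reconnect messages.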


-- 
Mohammed Naser
VEXXHOST, Inc.


