Neutron + OVN raft cluster
Tiago Pires
tiagohp at gmail.com
Fri May 6 19:50:10 UTC 2022
Hi all,
I was checking the mailing list history, and this thread
https://mail.openvswitch.org/pipermail/ovsdiscuss/2018March/046438.html
about Raft OVSDB clustering caught my attention.
In my setup (OVN 20.03 and OpenStack Ussuri), ovn-controller is configured with
ovn-remote="tcp:10.2X.4X.4:6642,tcp:10.2X.4X.68:6642,tcp:10.2X.4X.132:6642",
which lists the 3 OVN central members running in cluster mode.
Also on the neutron ML2 side:
[ovn]
ovn_native_dhcp = True
ovn_nb_connection = tcp:10.2X.4X.4:6641,tcp:10.2X.4X.68:6641,tcp:10.2X.4X.132:6641
ovn_sb_connection = tcp:10.2X.4X.4:6642,tcp:10.2X.4X.68:6642,tcp:10.2X.4X.132:6642
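To avoid typos when the same member list has to appear in several places, the two connection strings can be generated from one list. This is only a minimal sketch: the helper name `connection_string` is hypothetical, and the addresses are the redacted placeholders used in this mail, not real hosts.

```python
# Hypothetical helper, not part of Neutron or OVN.
# The addresses are the redacted placeholders from this mail.
MEMBERS = ["10.2X.4X.4", "10.2X.4X.68", "10.2X.4X.132"]
NB_PORT = 6641  # Northbound DB port used in this setup
SB_PORT = 6642  # Southbound DB port used in this setup


def connection_string(members, port, scheme="tcp"):
    """Join the cluster members into one comma-separated remote list."""
    return ",".join(f"{scheme}:{host}:{port}" for host in members)


nb = connection_string(MEMBERS, NB_PORT)
sb = connection_string(MEMBERS, SB_PORT)
print("ovn_nb_connection =", nb)
print("ovn_sb_connection =", sb)
```

The same `sb` value is what ovn-remote on the ovn-controller side carries in this setup.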
We are experiencing an issue with Neutron when the OVN leader decides to
take a snapshot and, by design, another member becomes leader (roughly every
8 minutes):
2022-05-05T16:57:42.135Z|17401|raft|INFO|Transferring leadership to write a snapshot.
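To check the "every 8 minutes" estimate, the timestamps of these messages can be compared. The sketch below assumes the usual OVS log layout of `timestamp|sequence|module|level|message` and that the lines have been gathered from all three members, since each member only logs the transfers it starts itself. The function names are hypothetical, and the second sample line is made up purely to exercise the code; only the first comes from this mail.

```python
from datetime import datetime

MARKER = "Transferring leadership to write a snapshot"


def transfer_times(log_lines):
    """Return the timestamps of all leadership-transfer messages."""
    times = []
    for line in log_lines:
        if MARKER not in line:
            continue
        stamp = line.split("|", 1)[0]  # text before the first '|'
        times.append(datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ"))
    return sorted(times)


def intervals_minutes(times):
    """Minutes between consecutive transfers."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


SAMPLE = [
    # From this mail:
    "2022-05-05T16:57:42.135Z|17401|raft|INFO|Transferring leadership to write a snapshot.",
    # Illustrative only, NOT a real log line:
    "2022-05-05T17:05:42.135Z|00001|raft|INFO|Transferring leadership to write a snapshot.",
]

intervals = intervals_minutes(transfer_times(SAMPLE))
print(intervals)
```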
ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
4a03
Name: OVN_Southbound
Cluster ID: ca74 (ca744caf-40cd-4751-a2f2-86e35ad6541c)
Server ID: 4a03 (4a0328dc-e9a4-495e-a4f1-0a0340fc6d19)
Address: tcp:10.2X.4X.132:6644
Status: cluster member
Role: leader
Term: 1912
Leader: self
Vote: self
Election timer: 10000
Log: [497643, 498261]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->3d6c ->4ef0 <-3d6c <-4ef0
Servers:
4a03 (4a03 at tcp:10.2X.4X.132:6644) (self) next_index=497874 match_index=498260
3d6c (3d6c at tcp:10.2X.4X.68:6644) next_index=498261 match_index=498260
4ef0 (4ef0 at tcp:10.2X.4X.4:6644) next_index=498261 match_index=498260
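When this status has to be checked on all three members repeatedly, it may help to parse it instead of reading it by eye. The following is a rough sketch, assuming the output has the "Key: value" shape shown above and that each server entry is on a single line; `parse_cluster_status` is a hypothetical helper, not an OVN API, and the sample is a shortened copy of the output above.

```python
import re


def parse_cluster_status(text):
    """Parse 'Key: value' lines and the Servers section of cluster/status."""
    info = {"servers": {}}
    in_servers = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "Servers:":
            in_servers = True
            continue
        if in_servers:
            m = re.match(r"(\w+) \(\w+ at ([^)\s]+)\)", line)
            if m:
                sid, addr = m.groups()
                idx = {k: int(v) for k, v in
                       re.findall(r"(next_index|match_index)=(\d+)", line)}
                info["servers"][sid] = {
                    "address": addr, "self": "(self)" in line, **idx}
            continue
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (e.g. the bare server id) are skipped
            info[key.strip()] = value.strip()
    return info


SAMPLE = """\
4a03
Name: OVN_Southbound
Address: tcp:10.2X.4X.132:6644
Role: leader
Term: 1912
Leader: self
Servers:
4a03 (4a03 at tcp:10.2X.4X.132:6644) (self) next_index=497874 match_index=498260
3d6c (3d6c at tcp:10.2X.4X.68:6644) next_index=498261 match_index=498260
4ef0 (4ef0 at tcp:10.2X.4X.4:6644) next_index=498261 match_index=498260
"""

status = parse_cluster_status(SAMPLE)
# All members agree on match_index when nobody is lagging behind.
caught_up = len({s["match_index"] for s in status["servers"].values()}) == 1
print(status["Role"], status["Term"], caught_up)
```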
As I understand it, the TCP connections from Neutron (NB) and the
ovn-controllers (SB) to OVN central are established only with the leader:
# OVN central leader
$ netstat -nap | grep 6642 | more
tcp 0 0 0.0.0.0:6642 0.0.0.0:* LISTEN
tcp 0 0 10.2X.4X.132:6642 10.24.40.17:47278 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.24.40.76:36240 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.17:47280 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.6:43102 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.75:58890 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.6:43108 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.17:47142 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.71:48808 ESTABLISHED
tcp 0 0 10.2X.4X.132:6642 10.2X.4X.17:47096 ESTABLISHED
# OVN follower 2
$ netstat -nap | grep 6642
tcp 0 0 0.0.0.0:6642 0.0.0.0:* LISTEN
tcp 0 0 10.2X.4X.4:6642 10.2X.4X.76:57256 ESTABLISHED
tcp 0 0 10.2X.4X.4:6642 10.2X.4X.134:54026 ESTABLISHED
tcp 0 0 10.2X.4X.4:6642 10.2X.4X.10:34962 ESTABLISHED
tcp 0 0 10.2X.4X.4:6642 10.2X.4X.6:49238 ESTABLISHED
tcp 0 0 10.2X.4X.4:6642 10.2X.4X.135:59972 ESTABLISHED
tcp 0 0 10.2X.4X.4:6642 10.2X.4X.75:40162 ESTABLISHED
tcp 0 0 10.2X.4X.4:39566 10.2X.4X.132:6642 ESTABLISHED
# OVN follower 3
$ netstat -nap | grep 6642
tcp 0 0 0.0.0.0:6642 0.0.0.0:* LISTEN
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.70:40750 ESTABLISHED
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.11:49718 ESTABLISHED
tcp 0 0 10.2X.4X.68:45632 10.2X.4X.132:6642 ESTABLISHED
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.16:44816 ESTABLISHED
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.7:45216 ESTABLISHED
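To compare the three listings, the client connections on port 6642 can be tallied per member. This is a small sketch, assuming each netstat entry is on one line with the columns proto, recv-q, send-q, local address, foreign address and state; `count_clients` is a hypothetical helper and the sample is the follower 3 listing above.

```python
from collections import Counter


def count_clients(netstat_text, port=6642):
    """Count ESTABLISHED connections whose local port is `port`."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Columns: proto recv-q send-q local foreign state [pid/program]
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):  # inbound client, not an outbound link
            counts[host] += 1
    return counts


SAMPLE = """\
tcp 0 0 0.0.0.0:6642 0.0.0.0:* LISTEN
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.70:40750 ESTABLISHED
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.11:49718 ESTABLISHED
tcp 0 0 10.2X.4X.68:45632 10.2X.4X.132:6642 ESTABLISHED
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.16:44816 ESTABLISHED
tcp 0 0 10.2X.4X.68:6642 10.2X.4X.7:45216 ESTABLISHED
"""

clients = count_clients(SAMPLE)
print(dict(clients))
```

The connection from 10.2X.4X.68:45632 to 10.2X.4X.132:6642 is left out of the count because port 6642 is on the remote side there.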
The issue we are experiencing is on neutron-server, which disconnects
whenever the OVN leader changes (due to a snapshot, roughly every 8
minutes) and reconnects to the next leader. This breaks the OpenStack API
when someone tries to create a VM at the same time.
First, is my current configuration correct? Should a leader change
break the Neutron side? Or is some configuration missing?
I was wondering whether it is possible to use a load balancer with a VIP that
balances the connections across the OVN central members, so that I would
configure only the VIP on the Neutron side and on the ovn-controllers. Does that
make sense?
Thank you.
Regards,
Tiago Pires