[wallaby][neutron][ovn] Low network performance between instances on different compute nodes using OVN Geneve tunnels

Laurent Dumont laurentfdumont at gmail.com
Tue Jul 6 00:18:33 UTC 2021


I think you should be able to reach line rate for 10Gbit on Geneve/OVN as
well. I don't have a setup to compare against, but you might want to try
forcing the TCP window size with iperf3 (the -w option).
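
As a rough illustration of why the window size matters: to keep a link full,
the TCP window has to cover at least the bandwidth-delay product (link rate
times round-trip time). Below is a minimal Python sketch. The 500 microsecond
round-trip time between the VMs is purely an assumed value (it would need to
be measured, e.g. with ping), and `bdp_bytes` is a hypothetical helper name.

```python
def bdp_bytes(link_bits_per_sec: int, rtt_microseconds: int) -> int:
    """Bandwidth-delay product in bytes (integer arithmetic, rounded down)."""
    return link_bits_per_sec * rtt_microseconds // (8 * 1_000_000)


# Assumed values for illustration only: 10 Gbit/s link, 500 us VM-to-VM RTT.
window = bdp_bytes(10_000_000_000, 500)
print(f"window needed: {window} bytes (~{window / 1024:.0f} KiB)")
```

A value at least this large could then be passed to iperf3 with -w (for
example `iperf3 -c 192.168.100.111 -w 1M`), keeping in mind that the kernel
may cap the socket buffer via net.core.rmem_max and net.core.wmem_max.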

It could be that the packet rate (PPS, packets per second) is the
bottleneck, so the connection never reaches a sufficiently large window.
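
The packet rate can be read off the UDP results quoted below. Here is a small
Python sketch using the unidirectional UDP figures for VMs on different hosts
(2540389 datagrams sent at 2.97 Gbits/sec over 10 seconds, 165868 of 2539198
lost at the receiver). The function name `udp_summary` is hypothetical, and
the bitrate is assumed to be in decimal Gbit/s.

```python
def udp_summary(sent, receiver_total, lost, seconds, sender_gbits_per_sec):
    """Return (datagrams per second, average datagram bytes, loss percent)."""
    pps = sent / seconds
    # Bitrate assumed to be decimal Gbit/s; convert to bytes/s, then per datagram.
    avg_payload = sender_gbits_per_sec * 1e9 / 8 / pps
    loss_pct = 100.0 * lost / receiver_total
    return pps, avg_payload, loss_pct


# Figures copied from the unidirectional UDP test between hosts.
pps, avg_payload, loss_pct = udp_summary(2_540_389, 2_539_198, 165_868, 10.0, 2.97)
print(f"{pps:,.0f} datagrams/s, ~{avg_payload:.0f} bytes each, {loss_pct:.1f}% lost")
```

Taken at face value, that is about 254,000 datagrams per second of roughly
1460 bytes each, so the UDP run does not appear to be using the 8950-byte MTU.
Repeating it with a larger datagram length (iperf3 -l) might help show
whether packet rate, rather than bandwidth, is the limit.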

On Mon, Jul 5, 2021 at 2:19 AM Malik Obaid <malikobaidadil at gmail.com> wrote:

> Hi Laurent,
>
> I am using 32 cores and 32GB RAM on the VM. The compute nodes are
> dual-socket EPYC 7532 bare-metal servers with 1TB RAM running Ubuntu 20.04,
> and the network cards are Broadcom BCM57504 NetXtreme-E 10Gb cards.
>
> Below are the stats for VMs on different hosts.
>
> TCP bidirectional, on geneve network.
>
> [ ID][Role] Interval           Transfer     Bitrate         Retr
> [  5][RX-S]   0.00-10.01  sec  3.44 GBytes  2.95 Gbits/sec          receiver
> [  8][TX-S]   0.00-10.01  sec  3.48 GBytes  2.99 Gbits/sec    0     sender
>
> Unidirectional tcp, on geneve network.
>
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  6.29 GBytes  5.41 Gbits/sec  491     sender
> [  5]   0.00-10.00  sec  6.29 GBytes  5.40 Gbits/sec          receiver
>
> Unidirectional udp, on geneve network.
>
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.00  sec  3.45 GBytes  2.97 Gbits/sec  0.000 ms  0/2540389 (0%)  sender
> [  5]   0.00-10.00  sec  3.23 GBytes  2.77 Gbits/sec  0.009 ms  165868/2539198 (6.5%)  receiver
>
> Below are the stats of bidirectional udp, on geneve network.
>
> [ ID][Role] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5][TX-C]   0.00-10.00  sec  2.00 GBytes  1.72 Gbits/sec  0.000 ms  0/1472357 (0%)  sender
> [  5][TX-C]   0.00-10.01  sec  1.99 GBytes  1.71 Gbits/sec  0.024 ms  7713/1471535 (0.52%)  receiver
> [  7][RX-C]   0.00-10.00  sec  2.00 GBytes  1.72 Gbits/sec  0.000 ms  0/1472450 (0%)  sender
> [  7][RX-C]   0.00-10.01  sec  1.98 GBytes  1.70 Gbits/sec  0.012 ms  17325/1470552 (1.2%)  receiver
>
> ==================================================
>
> Below are the stats of VMs on same host.
>
> tcp unidirectional on geneve network.
>
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  19.5 GBytes  16.7 Gbits/sec    0     sender
> [  5]   0.00-10.00  sec  19.5 GBytes  16.7 Gbits/sec          receiver
>
> tcp bidirectional on geneve network.
>
> [ ID][Role] Interval           Transfer     Bitrate         Retr
> [  5][TX-C]   0.00-10.00  sec  10.7 GBytes  9.21 Gbits/sec    0     sender
> [  5][TX-C]   0.00-10.00  sec  10.7 GBytes  9.21 Gbits/sec          receiver
> [  7][RX-C]   0.00-10.00  sec  9.95 GBytes  8.55 Gbits/sec    0     sender
> [  7][RX-C]   0.00-10.00  sec  9.95 GBytes  8.54 Gbits/sec          receiver
>
> udp unidirectional on geneve network.
>
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.00  sec  2.15 GBytes  1.85 Gbits/sec  0.000 ms  0/1584825 (0%)  sender
> [  5]   0.00-10.00  sec  2.15 GBytes  1.85 Gbits/sec  0.015 ms  0/1584825 (0%)  receiver
>
> udp bidirectional on geneve network.
>
> [ ID][Role] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5][TX-C]   0.00-10.00  sec  2.17 GBytes  1.87 Gbits/sec  0.000 ms  0/1597563 (0%)  sender
> [  5][TX-C]   0.00-10.00  sec  1.37 GBytes  1.17 Gbits/sec  0.006 ms  590524/1595459 (37%)  receiver
> [  7][RX-C]   0.00-10.00  sec  1.37 GBytes  1.17 Gbits/sec  0.000 ms  0/1005024 (0%)  sender
> [  7][RX-C]   0.00-10.00  sec  1.37 GBytes  1.17 Gbits/sec  0.012 ms  0/1004983 (0%)  receiver
>
> However the performance of network on OVN VLAN provider network is
> ~9.8Gbps bidirectional with VMs on different hosts.
>
> Below are the details of ovs-vsctl show command on compute node.
>
>     Bridge br-int
>         fail_mode: secure
>         datapath_type: system
>         Port tap131797c7-06
>             Interface tap131797c7-06
>         Port ovn-094381-0
>             Interface ovn-094381-0
>                 type: geneve
>                 options: {csum="true", key=flow, remote_ip="172.16.40.2"}
>                 bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
>         Port patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d
>             Interface patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d
>                 type: patch
>                 options: {peer=patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int}
>         Port br-int
>             Interface br-int
>                 type: internal
>         Port tap18ca5a79-10
>             Interface tap18ca5a79-10
>     Bridge br-vlan
>         Port bond0
>             Interface bond0
>         Port patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int
>             Interface patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int
>                 type: patch
>                 options: {peer=patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d}
>         Port br-vlan
>             Interface br-vlan
>                 type: internal
>     ovs_version: "2.15.0"
>
> I have already tuned the BIOS for maximum performance. Is any tuning
> required at the OVS or OS level? I strongly believe that throughput on the
> geneve network should reach 10Gbps without DPDK.
>
> Regards,
> Malik Obaid
>
> On Sun, Jul 4, 2021 at 9:50 PM Laurent Dumont <laurentfdumont at gmail.com>
> wrote:
>
>> Nothing super specific I can think of, but:
>>
>>    - Can you try running the same tests with two instances on the same
>>    compute?
>>    - How many cores are inside the sender/receiver VM?
>>    - Can you test in UDP mode?
>>
>>
>>
>> On Sun, Jul 4, 2021 at 8:27 AM Malik Obaid <malikobaidadil at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I am using the OpenStack Wallaby release with OVN on Ubuntu 20.04.
>>>
>>> My environment consists of 2 compute nodes and 1 controller node.
>>> ovs_version: "2.15.0"
>>> Ubuntu Kernel Version: 5.4.0-77-generic
>>>
>>>
>>> I am observing that network performance between instances on different
>>> compute nodes is slow. The network uses geneve tunnels. The environment
>>> uses 10Gbps network interface cards. However, iperf between instances on
>>> different compute nodes only reaches speeds between a few hundred Mbit/s
>>> and a few Gbit/s. Both instances are in the same tenant network.
>>>
>>> Note: iperf between the two compute nodes (hypervisors), across the
>>> geneve tunnel endpoints, reaches the full 10 Gbps.
>>>
>>> Below are the results of iperf commands.
>>>
>>> *iperf server:*
>>>
>>> 2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc fq_codel state UP group default qlen 1000
>>>     link/ether fa:16:3e:4b:1d:29 brd ff:ff:ff:ff:ff:ff
>>>     inet 192.168.100.111/24 brd 192.168.100.255 scope global dynamic ens3
>>>        valid_lft 42694sec preferred_lft 42694sec
>>>     inet6 fe80::f816:3eff:fe4b:1d29/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>> root@vm-01:~# iperf3 -s
>>> Server listening on 5201
>>>
>>> Accepted connection from 192.168.100.69, port 45542
>>> [  5] local 192.168.100.111 port 5201 connected to 192.168.100.69 port 45544
>>> [  8] local 192.168.100.111 port 5201 connected to 192.168.100.69 port 45546
>>> [ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
>>> [  5][RX-S]   0.00-1.00   sec   692 MBytes  5.81 Gbits/sec
>>> [  8][TX-S]   0.00-1.00   sec   730 MBytes  6.12 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   1.00-2.00   sec   598 MBytes  5.01 Gbits/sec
>>> [  8][TX-S]   1.00-2.00   sec   879 MBytes  7.37 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   2.00-3.00   sec   793 MBytes  6.65 Gbits/sec
>>> [  8][TX-S]   2.00-3.00   sec   756 MBytes  6.34 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   3.00-4.00   sec   653 MBytes  5.48 Gbits/sec
>>> [  8][TX-S]   3.00-4.00   sec   871 MBytes  7.31 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   4.00-5.00   sec   597 MBytes  5.01 Gbits/sec
>>> [  8][TX-S]   4.00-5.00   sec   858 MBytes  7.20 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   5.00-6.00   sec   734 MBytes  6.16 Gbits/sec
>>> [  8][TX-S]   5.00-6.00   sec   818 MBytes  6.86 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   6.00-7.00   sec   724 MBytes  6.06 Gbits/sec
>>> [  8][TX-S]   6.00-7.00   sec   789 MBytes  6.60 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   7.00-8.00   sec   735 MBytes  6.18 Gbits/sec
>>> [  8][TX-S]   7.00-8.00   sec   835 MBytes  7.02 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   8.00-9.00   sec   789 MBytes  6.62 Gbits/sec
>>> [  8][TX-S]   8.00-9.00   sec   845 MBytes  7.09 Gbits/sec    0   3.14 MBytes
>>> [  5][RX-S]   9.00-10.00  sec   599 MBytes  5.02 Gbits/sec
>>> [  8][TX-S]   9.00-10.00  sec   806 MBytes  6.76 Gbits/sec    0   3.14 MBytes
>>>
>>> [ ID][Role] Interval           Transfer     Bitrate         Retr
>>> [  5][RX-S]   0.00-10.00  sec  6.75 GBytes  5.80 Gbits/sec          receiver
>>> [  8][TX-S]   0.00-10.00  sec  7.99 GBytes  6.87 Gbits/sec    0     sender
>>>
>>> Server listening on 5201
>>>
>>> *Client side:*
>>>
>>> root@vm-03:~# iperf3 -c 192.168.100.111 --bidir
>>> Connecting to host 192.168.100.111, port 5201
>>> [  5] local 192.168.100.69 port 45544 connected to 192.168.100.111 port 5201
>>> [  7] local 192.168.100.69 port 45546 connected to 192.168.100.111 port 5201
>>> [ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
>>> [  5][TX-C]   0.00-1.00   sec   700 MBytes  5.87 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   0.00-1.00   sec   722 MBytes  6.06 Gbits/sec
>>> [  5][TX-C]   1.00-2.00   sec   594 MBytes  4.98 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   1.00-2.00   sec   883 MBytes  7.41 Gbits/sec
>>> [  5][TX-C]   2.00-3.00   sec   796 MBytes  6.67 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   2.00-3.00   sec   752 MBytes  6.31 Gbits/sec
>>> [  5][TX-C]   3.00-4.00   sec   654 MBytes  5.49 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   3.00-4.00   sec   876 MBytes  7.35 Gbits/sec
>>> [  5][TX-C]   4.00-5.00   sec   598 MBytes  5.01 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   4.00-5.00   sec   853 MBytes  7.16 Gbits/sec
>>> [  5][TX-C]   5.00-6.00   sec   734 MBytes  6.15 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   5.00-6.00   sec   818 MBytes  6.86 Gbits/sec
>>> [  5][TX-C]   6.00-7.00   sec   726 MBytes  6.09 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   6.00-7.00   sec   793 MBytes  6.65 Gbits/sec
>>> [  5][TX-C]   7.00-8.00   sec   734 MBytes  6.15 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   7.00-8.00   sec   831 MBytes  6.97 Gbits/sec
>>> [  5][TX-C]   8.00-9.00   sec   788 MBytes  6.61 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   8.00-9.00   sec   845 MBytes  7.09 Gbits/sec
>>> [  5][TX-C]   9.00-10.00  sec   600 MBytes  5.03 Gbits/sec    0   3.13 MBytes
>>> [  7][RX-C]   9.00-10.00  sec   805 MBytes  6.76 Gbits/sec
>>>
>>> [ ID][Role] Interval           Transfer     Bitrate         Retr
>>> [  5][TX-C]   0.00-10.00  sec  6.76 GBytes  5.81 Gbits/sec    0     sender
>>> [  5][TX-C]   0.00-10.00  sec  6.75 GBytes  5.80 Gbits/sec          receiver
>>> [  7][RX-C]   0.00-10.00  sec  7.99 GBytes  6.87 Gbits/sec    0     sender
>>> [  7][RX-C]   0.00-10.00  sec  7.99 GBytes  6.86 Gbits/sec          receiver
>>>
>>> iperf Done.
>>>
>>>
>>> ---------------------------------------------------------------------------------------------------------
>>>
>>> *ovs-vsctl show on compute node1:*
>>>
>>> root@kvm01-a1-khi01:~# ovs-vsctl show
>>> 88e6b984-44dc-4f74-8a9a-891742dbbdbd
>>>     Bridge br-eth1
>>>         Port ens224
>>>             Interface ens224
>>>         Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int
>>>             Interface patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int
>>>                 type: patch
>>>                 options: {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9}
>>>         Port br-eth1
>>>             Interface br-eth1
>>>                 type: internal
>>>     Bridge br-int
>>>         fail_mode: secure
>>>         datapath_type: system
>>>         Port tapde98b2d4-a0
>>>             Interface tapde98b2d4-a0
>>>         Port ovn-f51ef9-0
>>>             Interface ovn-f51ef9-0
>>>                 type: vxlan
>>>                 options: {csum="true", key=flow, remote_ip="172.16.30.3"}
>>>                 bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
>>>         Port tap348fc6dc-3a
>>>             Interface tap348fc6dc-3a
>>>         Port br-int
>>>             Interface br-int
>>>                 type: internal
>>>         Port tap6d4d8e02-c0
>>>             Interface tap6d4d8e02-c0
>>>                 error: "could not open network device tap6d4d8e02-c0 (No such device)"
>>>         Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9
>>>             Interface patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9
>>>                 type: patch
>>>                 options: {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int}
>>>         Port tap247fe5b2-ff
>>>             Interface tap247fe5b2-ff
>>>
>>>
>>> ------------------------------------------------------------------------------------------------------
>>>
>>> *ovs-vsctl show on compute node2:*
>>>
>>> root@kvm03-a1-khi01:~# ovs-vsctl show
>>> 24ce6475-89bb-4df5-a5ff-4ce58f2c2f68
>>>     Bridge br-eth1
>>>         Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int
>>>             Interface patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int
>>>                 type: patch
>>>                 options: {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9}
>>>         Port br-eth1
>>>             Interface br-eth1
>>>                 type: internal
>>>         Port ens224
>>>             Interface ens224
>>>     Bridge br-int
>>>         fail_mode: secure
>>>         datapath_type: system
>>>         Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9
>>>             Interface patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9
>>>                 type: patch
>>>                 options: {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int}
>>>         Port tap2b0bbf7b-59
>>>             Interface tap2b0bbf7b-59
>>>         Port ovn-650be8-0
>>>             Interface ovn-650be8-0
>>>                 type: vxlan
>>>                 options: {csum="true", key=flow, remote_ip="172.16.30.1"}
>>>                 bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
>>>         Port tap867d2174-83
>>>             Interface tap867d2174-83
>>>         Port tapde98b2d4-a0
>>>             Interface tapde98b2d4-a0
>>>         Port br-int
>>>             Interface br-int
>>>                 type: internal
>>>
>>>
>>> --------------------------------------------------------------------------------------------------------
>>>
>>> I would really appreciate any input in this regard.
>>>
>>> Thank you.
>>>
>>> Regards,
>>> Malik Obaid
>>>
>>>
>>>