[openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps)

Ha Noi hanoi952022 at gmail.com
Sun Sep 10 08:23:09 UTC 2023


Thanks Smooney,

I'm testing performance between two VMs using DPDK, and the high latency
no longer appears, but packets are still being lost.

I will tune my system to get the highest throughput.

Thank you, guys.

On Fri, Sep 8, 2023 at 9:20 PM <smooney at redhat.com> wrote:

> On Thu, 2023-09-07 at 22:05 -0400, Satish Patel wrote:
> > Do one thing: use a test-pmd based benchmark and see, because the test-pmd
> > application is DPDK aware. With test-pmd you will have 1000% better
> > performance :)
>
> actually, test-pmd is not DPDK aware.
> it's a dpdk application, so it is faster because it removes the overhead of
> kernel networking in the guest, not because it has any dpdk awareness.
> testpmd cannot tell that ovs-dpdk is in use.
> from a guest perspective you cannot tell whether you are using ovs-dpdk or
> kernel ovs, as there is no visible difference in the virtio-net-pci device
> which is presented to the guest kernel by qemu.
>
> iperf3 with a single core can't actually saturate a virtio-net interface
> when it's backed by vhost-user/dpdk or something like a macvtap sriov port.
> you can reach line rate with larger packet sizes or multiple cores, but if
> you want to test small-packet io then testpmd, dpdk pktgen or tgen are
> better tools in that regard. they can easily saturate a link into the tens
> of gigabits per second using 64-byte packets.
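>
> for reference, a rough sketch of that kind of guest-side testpmd run (the
> PCI address and core list are illustrative, not taken from this thread, and
> the right binding driver depends on the guest kernel):
>
>   # in the guest: bind the virtio NIC to a userspace driver
>   # (e.g. uio_pci_generic, or vfio-pci in no-IOMMU mode)
>   dpdk-devbind.py --bind=uio_pci_generic 0000:00:03.0
>
>   # forward packets in the guest with testpmd instead of the kernel stack
>   dpdk-testpmd -l 0-1 -n 4 -- --forward-mode=macswap --auto-start --stats-period 1
>
> the periodic stats output then gives you rx/tx pps directly, which is easier
> to reason about than iperf3 throughput numbers for small packets.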
>
> >
> > On Thu, Sep 7, 2023 at 9:59 PM Ha Noi <hanoi952022 at gmail.com> wrote:
> >
> > > I ran the performance test using iperf3, but the performance did not
> > > increase as theory would predict. I don't know which configuration is incorrect.
> > >
> > > On Fri, Sep 8, 2023 at 8:57 AM Satish Patel <satish.txt at gmail.com> wrote:
> > >
> > > > I would say let's run your same benchmark with OVS-DPDK and tell me if
> > > > you see better performance. I doubt you will see a significant performance
> > > > boost, but let's see. Please prove me wrong :)
> > > >
> > > > On Thu, Sep 7, 2023 at 9:45 PM Ha Noi <hanoi952022 at gmail.com> wrote:
> > > >
> > > > > Hi Satish,
> > > > >
> > > > > Actually, the guest interface is not using a tap device anymore.
> > > > >
> > > > >     <interface type='vhostuser'>
> > > > >       <mac address='fa:16:3e:76:77:dd'/>
> > > > >       <source type='unix' path='/var/run/openvswitch/vhu3766ee8a-86'
> > > > >               mode='server'/>
> > > > >       <target dev='vhu3766ee8a-86'/>
> > > > >       <model type='virtio'/>
> > > > >       <alias name='net0'/>
> > > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
> > > > > function='0x0'/>
> > > > >     </interface>
> > > > >
> > > > > Does it totally bypass the kernel stack?
> yep, dpdk is userspace networking and it gets its performance boost from
> that, so the data is "transported" by doing a direct mmap of the virtio
> ring buffers between the DPDK poll mode driver and the qemu process.
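>
> for illustration only (this is what neutron/os-vif set up for you, not
> something you normally run by hand; the port name and socket path simply
> mirror the XML above), the host-side wiring for that vhost-user port is
> roughly:
>
>   # ovs-dpdk side: a vhost-user client port on the integration bridge
>   # that connects to the socket qemu created in server mode
>   ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
>   ovs-vsctl add-port br-int vhu3766ee8a-86 -- set Interface vhu3766ee8a-86 \
>       type=dpdkvhostuserclient \
>       options:vhost-server-path=/var/run/openvswitch/vhu3766ee8a-86
>
> so there is no tap device and no kernel datapath on the host for this vif.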
>
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Sep 8, 2023 at 5:02 AM Satish Patel <satish.txt at gmail.com>
> > > > > wrote:
> > > > >
> > > > > > I did test OVS-DPDK, and it helps offload packet processing on the
> > > > > > compute nodes. But what about the VMs? They will still use a tap
> > > > > > interface to attach from compute to VM, and the bottleneck will be
> > > > > > in the VM. I strongly believe that we have to run a DPDK-based
> > > > > > guest to bypass the kernel stack.
> > > > > >
> > > > > > I would love to hear from other people if I am missing something here.
> > > > > >
> > > > > > On Thu, Sep 7, 2023 at 5:27 PM Ha Noi <hanoi952022 at gmail.com> wrote:
> > > > > >
> > > > > > > Oh. I heard from someone on Reddit that OVS-DPDK is transparent
> > > > > > > to the user?
> > > > > > >
> > > > > > > So that's not correct?
> > > > > > >
> > > > > > > On Thu, 7 Sep 2023 at 22:13 Satish Patel <satish.txt at gmail.com> wrote:
> > > > > > >
> > > > > > > > Because DPDK requires DPDK support inside the guest VM. It's
> > > > > > > > not suitable for general-purpose workloads. You need your guest
> > > > > > > > VM network stack to support DPDK to get 100% of the throughput.
> > > > > > > >
> > > > > > > > On Thu, Sep 7, 2023 at 8:06 AM Ha Noi <hanoi952022 at gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi Satish,
> > > > > > > > >
> > > > > > > > > Why don't you use DPDK?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > On Thu, 7 Sep 2023 at 19:03 Satish Patel <satish.txt at gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I totally agree with Sean on all his points, but trust me,
> > > > > > > > > > I have tried everything possible to tune the OS, network
> > > > > > > > > > stack, multi-queue, NUMA, CPU pinning, you name it, and I
> > > > > > > > > > didn't get any significant improvement. You may gain 2 to 5%
> > > > > > > > > > with all those tweaks. I am running the entire workload on
> > > > > > > > > > SR-IOV and life is happy, except there is no LACP bonding.
> > > > > > > > > >
> > > > > > > > > > I am very interested in this project:
> > > > > > > > > > https://docs.openvswitch.org/en/latest/intro/install/afxdp/
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 7, 2023 at 6:07 AM Ha Noi <hanoi952022 at gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Dear Smoney,
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Sep 7, 2023 at 12:41 AM <smooney at redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote:
> > > > > > > > > > > > > Damn! We have noticed the same issue around 40k to 55k
> > > > > > > > > > > > > PPS. Trust me, nothing is wrong in your config. This is
> > > > > > > > > > > > > just a limitation of the software stack and the kernel
> > > > > > > > > > > > > itself.
> > > > > > > > > > > > it's partly determined by your cpu frequency.
> > > > > > > > > > > > kernel ovs of yesteryear could handle about 1 Mpps total
> > > > > > > > > > > > on a ~4 GHz cpu, with per-port throughput being lower
> > > > > > > > > > > > depending on what qos/firewall rules were applied.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > My CPU frequency is 3 GHz, on a 2nd-generation Intel Gold
> > > > > > > > > > > CPU. I think the problem is tuning inside the compute node,
> > > > > > > > > > > but I cannot find any guide or best practices for it.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > moving from the iptables firewall to the ovs firewall can
> > > > > > > > > > > > help to some degree, but you are partly trading connection
> > > > > > > > > > > > setup time for steady-state throughput, with the overhead
> > > > > > > > > > > > of the connection tracker in ovs.
> > > > > > > > > > > >
> > > > > > > > > > > > using stateless security groups can help.
> > > > > > > > > > > >
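> > > > > > > > > > > > as a rough illustration (the security group name is made
> > > > > > > > > > > > up, and stateless security groups need a release newer
> > > > > > > > > > > > than Train):
> > > > > > > > > > > >
> > > > > > > > > > > >   # neutron openvswitch_agent.ini: use the native ovs
> > > > > > > > > > > >   # firewall driver instead of iptables_hybrid
> > > > > > > > > > > >   [securitygroup]
> > > > > > > > > > > >   firewall_driver = openvswitch
> > > > > > > > > > > >
> > > > > > > > > > > >   # on newer releases, create a stateless security group
> > > > > > > > > > > >   # to skip conntrack entirely for its ports
> > > > > > > > > > > >   openstack security group create --stateless sg-high-pps
> > > > > > > > > > > >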
> > > > > > > > > > > > we also recently fixed a regression caused by changes in
> > > > > > > > > > > > newer versions of ovs.
> > > > > > > > > > > > this was notable in going from rhel 8 to rhel 9, where it
> > > > > > > > > > > > literally reduced small-packet performance to 1/10th and
> > > > > > > > > > > > jumbo frames to about 1/2.
> > > > > > > > > > > > on master we have a config option that will set the
> > > > > > > > > > > > default qos on a port to linux-noop:
> > > > > > > > > > > >
> > > > > > > > > > > > https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125
> > > > > > > > > > > >
> > > > > > > > > > > > the backports are proposed upstream:
> > > > > > > > > > > >
> > > > > > > > > > > > https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43
> > > > > > > > > > > >
> > > > > > > > > > > > and we have backported this downstream to address that
> > > > > > > > > > > > performance regression.
> > > > > > > > > > > > the upstream backport is semi-stalled just because we
> > > > > > > > > > > > wanted to discuss whether we should make it opt-in by
> > > > > > > > > > > > default upstream while backporting, but it might be
> > > > > > > > > > > > helpful for you if this is related to your current issues.
> > > > > > > > > > > >
> > > > > > > > > > > > 40-55 kpps is kind of low for kernel ovs, but if you have
> > > > > > > > > > > > a low clock-rate cpu, hybrid_plug, and incorrect qos, then
> > > > > > > > > > > > i could see you hitting such a bottleneck.
> > > > > > > > > > > >
> > > > > > > > > > > > one workaround, by the way, without the os-vif fix
> > > > > > > > > > > > backported, is to set /proc/sys/net/core/default_qdisc so
> > > > > > > > > > > > that it does not apply any qos, or applies a low-overhead
> > > > > > > > > > > > qos type, i.e.
> > > > > > > > > > > >
> > > > > > > > > > > >   sudo sysctl -w net.core.default_qdisc=pfifo_fast
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > that may or may not help, but i would make sure that you
> > > > > > > > > > > > are not using something like fq_codel or cake for
> > > > > > > > > > > > net.core.default_qdisc, and if you are, try changing it to
> > > > > > > > > > > > pfifo_fast and see if that helps.
> > > > > > > > > > > >
> > > > > > > > > > > > there isn't much you can do about the cpu clock rate, but
> > > > > > > > > > > > ^ is something you can try for free.
> > > > > > > > > > > > note it won't actually take effect on an existing vm if
> > > > > > > > > > > > you just change the default, but you can use tc to also
> > > > > > > > > > > > change the qdisc for testing. hard rebooting the vm should
> > > > > > > > > > > > also make the default take effect.
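> > > > > > > > > > > >
> > > > > > > > > > > > a minimal sketch of that tc step for an existing vm (the
> > > > > > > > > > > > tap device name is illustrative; find the real one with
> > > > > > > > > > > > virsh domiflist or ovs-vsctl show):
> > > > > > > > > > > >
> > > > > > > > > > > >   # swap the qdisc on the vm's backing device in place
> > > > > > > > > > > >   tc qdisc replace dev tap3766ee8a-86 root pfifo_fast
> > > > > > > > > > > >   # confirm what is now attached
> > > > > > > > > > > >   tc qdisc show dev tap3766ee8a-86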
> > > > > > > > > > > >
> > > > > > > > > > > > the only other advice i can give, assuming kernel ovs is
> > > > > > > > > > > > the only option you have, is to look at
> > > > > > > > > > > >
> > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size
> > > > > > > > > > > >
> > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size
> > > > > > > > > > > >
> > > > > > > > > > > > and
> > > > > > > > > > > >
> > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled
> > > > > > > > > > > >
> > > > > > > > > > > > if the bottleneck is actually in qemu or the guest kernel
> > > > > > > > > > > > rather than ovs, adjusting the rx/tx queue size and using
> > > > > > > > > > > > multi-queue can help. it will have no effect if ovs is
> > > > > > > > > > > > the bottleneck.
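> > > > > > > > > > > >
> > > > > > > > > > > > for reference, a sketch of those settings (the flavor name
> > > > > > > > > > > > is illustrative):
> > > > > > > > > > > >
> > > > > > > > > > > >   # nova.conf on the compute node
> > > > > > > > > > > >   [libvirt]
> > > > > > > > > > > >   rx_queue_size = 1024
> > > > > > > > > > > >   tx_queue_size = 1024
> > > > > > > > > > > >
> > > > > > > > > > > >   # enable virtio multi-queue through the flavor, then size
> > > > > > > > > > > >   # the queues inside the guest with ethtool -L
> > > > > > > > > > > >   openstack flavor set --property hw:vif_multiqueue_enabled=true m1.highpps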
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > I have set this option to 1024 and enabled multiqueue as
> > > > > > > > > > > well, but it did not help.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Sep 6, 2023 at 9:21 AM Ha Noi <hanoi952022 at gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Satish,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Actually, our customer gets this issue when the tx/rx
> > > > > > > > > > > > > > is only above 40k pps. So what is the throughput
> > > > > > > > > > > > > > threshold for OVS?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks and regards
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, 6 Sep 2023 at 20:19 Satish Patel <satish.txt at gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This is normal, because OVS or Linux Bridge wires up
> > > > > > > > > > > > > > > VMs using a TAP interface which runs in kernel space;
> > > > > > > > > > > > > > > that drives higher interrupt rates and keeps the
> > > > > > > > > > > > > > > kernel busy handling packets. Standard OVS/Linux
> > > > > > > > > > > > > > > Bridge is not meant for higher PPS.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If you want to handle higher PPS then look at a DPDK
> > > > > > > > > > > > > > > or SR-IOV deployment. (We are running everything on
> > > > > > > > > > > > > > > SR-IOV because of our high PPS requirement.)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Sep 5, 2023 at 11:11 AM Ha Noi <hanoi952022 at gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'm using OpenStack Train with Open vSwitch as the
> > > > > > > > > > > > > > > > ML2 driver and GRE as the tunnel type. I tested our
> > > > > > > > > > > > > > > > network performance between two VMs and saw packet
> > > > > > > > > > > > > > > > loss as below.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > VM1: IP: 10.20.1.206
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > VM2: IP: 10.20.1.154
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > VM3: IP: 10.20.1.72
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Using iperf3 to test performance between VM1 and VM2:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Run iperf3 client and server on both VMs.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Using VM3 to ping VM1, packets are lost and the
> > > > > > > > > > > > > > > > latency is quite high.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ping -i 0.1 10.20.1.206
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data.
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms
> > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms
> > > > > > > > > > > > > > > > ^C
> > > > > > > > > > > > > > > > --- 10.20.1.206 ping statistics ---
> > > > > > > > > > > > > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms
> > > > > > > > > > > > > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Does anyone else get this issue?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Please help me. Thanks.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
>
>

