<div dir="ltr">Thanks Smooney,<div><br></div><div>I'm trying to test performance between 2 VMs using DPDK. And the high latency does not appear any more. But the packet is still lost. </div><div><br></div><div>I will tune my system to get the highest throughput.</div><div><br></div><div>thanks you guys</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 8, 2023 at 9:20 PM <<a href="mailto:smooney@redhat.com">smooney@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thu, 2023-09-07 at 22:05 -0400, Satish Patel wrote:<br>
> Do one thing: use a test-pmd based benchmark and see, because the test-pmd<br>
> application is DPDK aware. With test-pmd you will have 1000% better<br>
> performance :)<br>
<br>
Actually, test-pmd is not DPDK aware.<br>
It's a DPDK application, so it is faster because it removes the overhead of kernel networking in the guest,<br>
not because it has any DPDK awareness. testpmd cannot tell that OVS-DPDK is in use.<br>
From a guest perspective you cannot tell whether you are using OVS-DPDK or kernel OVS, as there is no visible difference<br>
in the virtio-net-pci device which is presented to the guest kernel by qemu.<br>
<br>
iperf3 with a single core can't actually saturate a virtio-net interface when it's backed<br>
by vhost-user/DPDK or something like a macvtap SR-IOV port.<br>
You can reach line rate with larger packet sizes or multiple cores, but<br>
if you want to test small-packet I/O then testpmd, DPDK pktgen or tgen<br>
are better tools in that regard. They can easily saturate a link into the tens of gigabits per second using<br>
64-byte packets.<br>
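<br>
For reference, a minimal testpmd sketch inside the guest could look like the following. This assumes DPDK 20.11+, hugepages configured in the guest, and a spare virtio port bound to vfio-pci; the PCI address and core list are illustrative, not taken from this thread:<br>
<br>
# transmit 64-byte packets on one port; run --forward-mode=rxonly on the peer VM<br>
dpdk-testpmd -l 0-1 -n 4 -a 0000:00:04.0 -- --forward-mode=txonly --txpkts=64 --stats-period 1<br>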
<br>
> <br>
> On Thu, Sep 7, 2023 at 9:59 PM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> <br>
> > I ran the performance test using iperf3, but the performance did not<br>
> > increase as theory would suggest. I don't know which configuration is incorrect.<br>
> > <br>
> > On Fri, Sep 8, 2023 at 8:57 AM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>
> > <br>
> > > I would say let's run your same benchmark with OVS-DPDK and tell me if<br>
> > > you see better performance. I doubt you will see a significant performance<br>
> > > boost, but let's see. Please prove me wrong :)<br>
> > > <br>
> > > On Thu, Sep 7, 2023 at 9:45 PM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> > > <br>
> > > > Hi Satish,<br>
> > > > <br>
> > > > Actually, the guest interface is not using a tap anymore.<br>
> > > > <br>
> > > > <interface type='vhostuser'><br>
> > > > <mac address='fa:16:3e:76:77:dd'/><br>
> > > > <source type='unix' path='/var/run/openvswitch/vhu3766ee8a-86'<br>
> > > > mode='server'/><br>
> > > > <target dev='vhu3766ee8a-86'/><br>
> > > > <model type='virtio'/><br>
> > > > <alias name='net0'/><br>
> > > > <address type='pci' domain='0x0000' bus='0x00' slot='0x03'<br>
> > > > function='0x0'/><br>
> > > > </interface><br>
> > > > <br>
> > > > Does it totally bypass the kernel stack?<br>
Yep, DPDK is userspace networking and it gets its performance boost from that,<br>
so the data is "transported" by doing a direct mmap of the virtio ring buffers<br>
between the DPDK poll mode driver and the qemu process.<br>
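<br>
Just for context, on the OVS-DPDK side that libvirt definition corresponds to a vhost-user port on the integration bridge. A rough sketch of what neutron/os-vif sets up for you (not something you run by hand; if I read the XML above right, mode='server' means qemu owns the socket and OVS attaches as the client):<br>
<br>
ovs-vsctl add-port br-int vhu3766ee8a-86 -- \<br>
    set Interface vhu3766ee8a-86 type=dpdkvhostuserclient \<br>
    options:vhost-server-path=/var/run/openvswitch/vhu3766ee8a-86<br>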
<br>
> > > > <br>
> > > > <br>
> > > > <br>
> > > > <br>
> > > > On Fri, Sep 8, 2023 at 5:02 AM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>><br>
> > > > wrote:<br>
> > > > <br>
> > > > > I did test OVS-DPDK and it helps offload packet processing on the compute<br>
> > > > > nodes, but what about the VMs? They will still use a tap interface to attach from<br>
> > > > > the compute node to the VM, and the bottleneck will be in the VM. I strongly believe that we have<br>
> > > > > to run a DPDK-based guest to bypass the kernel stack.<br>
> > > > > <br>
> > > > > I would love to hear from other people if I am missing something here.<br>
> > > > > <br>
> > > > > On Thu, Sep 7, 2023 at 5:27 PM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> > > > > <br>
> > > > > > Oh, I heard from someone on Reddit that OVS-DPDK is<br>
> > > > > > transparent to the user?<br>
> > > > > > <br>
> > > > > > So that's not correct?<br>
> > > > > > <br>
> > > > > > On Thu, 7 Sep 2023 at 22:13 Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>
> > > > > > <br>
> > > > > > > Because DPDK requires DPDK support inside the guest VM. It's not<br>
> > > > > > > suitable for general-purpose workloads. You need your guest VM network to<br>
> > > > > > > support DPDK to get 100% throughput.<br>
> > > > > > > <br>
> > > > > > > On Thu, Sep 7, 2023 at 8:06 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> > > > > > > <br>
> > > > > > > > Hi Satish,<br>
> > > > > > > > <br>
> > > > > > > > Why don't you use DPDK?<br>
> > > > > > > > <br>
> > > > > > > > Thanks<br>
> > > > > > > > <br>
> > > > > > > > On Thu, 7 Sep 2023 at 19:03 Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>><br>
> > > > > > > > wrote:<br>
> > > > > > > > <br>
> > > > > > > > > I totally agree with Sean on all his points, but trust me, I have<br>
> > > > > > > > > tried everything possible to tune the OS, network stack, multi-queue, NUMA, CPU<br>
> > > > > > > > > pinning, you name it... but I didn't get any significant improvement. You may<br>
> > > > > > > > > see a 2 to 5% gain with all those tweaks. I am running the entire workload on<br>
> > > > > > > > > SR-IOV and life is happy, except for the lack of LACP bonding.<br>
> > > > > > > > > <br>
> > > > > > > > > I am very interested in this project:<br>
> > > > > > > > > <a href="https://docs.openvswitch.org/en/latest/intro/install/afxdp/" rel="noreferrer" target="_blank">https://docs.openvswitch.org/en/latest/intro/install/afxdp/</a><br>
> > > > > > > > > <br>
> > > > > > > > > On Thu, Sep 7, 2023 at 6:07 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>><br>
> > > > > > > > > wrote:<br>
> > > > > > > > > <br>
> > > > > > > > > > Dear Smooney,<br>
> > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > On Thu, Sep 7, 2023 at 12:41 AM <<a href="mailto:smooney@redhat.com" target="_blank">smooney@redhat.com</a>> wrote:<br>
> > > > > > > > > > <br>
> > > > > > > > > > > On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote:<br>
> > > > > > > > > > > > Damn! We have noticed the same issue around 40k to 55k PPS. Trust me,<br>
> > > > > > > > > > > > nothing is wrong with your config. This is just a limitation of the<br>
> > > > > > > > > > > > software stack and the kernel itself.<br>
> > > > > > > > > > > It's partly determined by your CPU frequency.<br>
> > > > > > > > > > > Kernel OVS of yesteryear could handle about 1 Mpps total on a ~4 GHz<br>
> > > > > > > > > > > CPU, with per-port throughput being lower depending on what qos/firewall<br>
> > > > > > > > > > > rules were applied.<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > My CPU frequency is 3 GHz, on a 2nd generation Intel Xeon Gold CPU.<br>
> > > > > > > > > > I think the problem is tuning inside the compute node, but I cannot find<br>
> > > > > > > > > > any guide or best practices for it.<br>
> > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > > Moving from the iptables firewall to the OVS firewall can help to some<br>
> > > > > > > > > > > degree,<br>
> > > > > > > > > > > but you are partly trading connection setup time for steady-state<br>
> > > > > > > > > > > throughput<br>
> > > > > > > > > > > due to the overhead of the connection tracker in OVS.<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > Using stateless security groups can also help.<br>
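> > > > > > > > > > > <br>
> > > > > > > > > > > A rough sketch of those two knobs (the file path is illustrative, and the<br>
> > > > > > > > > > > stateless option needs a much newer Neutron and client than Train):<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > # /etc/neutron/plugins/ml2/openvswitch_agent.ini on each compute node<br>
> > > > > > > > > > > [securitygroup]<br>
> > > > > > > > > > > firewall_driver = openvswitch<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > # then restart neutron-openvswitch-agent; and, if your release supports it:<br>
> > > > > > > > > > > openstack security group create --stateless sg-stateless<br>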
> > > > > > > > > > > <br>
> > > > > > > > > > > We also recently fixed a regression caused by changes in newer versions of OVS.<br>
> > > > > > > > > > > This was notable when going from RHEL 8 to RHEL 9, where it literally reduced<br>
> > > > > > > > > > > small-packet performance to 1/10th and jumbo-frame performance to about 1/2.<br>
> > > > > > > > > > > On master we have a config option that will set the default qos<br>
> > > > > > > > > > > on a port to linux-noop:<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <a href="https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125" rel="noreferrer" target="_blank">https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125</a><br>
> > > > > > > > > > > <br>
> > > > > > > > > > > The backports are proposed upstream:<br>
> > > > > > > > > > > <a href="https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43" rel="noreferrer" target="_blank">https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43</a><br>
> > > > > > > > > > > and we have backported this downstream to address that performance<br>
> > > > > > > > > > > regression.<br>
> > > > > > > > > > > The upstream backport is semi-stalled just because we wanted to<br>
> > > > > > > > > > > discuss whether we should make it opt-in<br>
> > > > > > > > > > > by default upstream while backporting, but it might be helpful for you if this is related to your current<br>
> > > > > > > > > > > issues.<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > 40-55 kpps is kind of low for kernel OVS, but if you have a low<br>
> > > > > > > > > > > clock-rate CPU, hybrid_plug + incorrect qos,<br>
> > > > > > > > > > > then I could see you hitting such a bottleneck.<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > One workaround, by the way, if you do not have the os-vif patch backported,<br>
> > > > > > > > > > > is to set /proc/sys/net/core/default_qdisc to not apply any qos, or to apply<br>
> > > > > > > > > > > a low-overhead qos type, i.e.:<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > sudo sysctl -w net.core.default_qdisc=pfifo_fast<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > > That may or may not help, but I would ensure that you are not<br>
> > > > > > > > > > > using something like fq_codel or cake<br>
> > > > > > > > > > > for net.core.default_qdisc, and if you are, try changing it to<br>
> > > > > > > > > > > pfifo_fast and see if that helps.<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > There isn't much you can do about the CPU clock rate, but ^ is<br>
> > > > > > > > > > > something you can try for free.<br>
> > > > > > > > > > > Note it won't actually take effect on an existing VM if you just<br>
> > > > > > > > > > > change the default, but you can use<br>
> > > > > > > > > > > tc to also change the qdisc for testing. Hard rebooting the VM<br>
> > > > > > > > > > > should also make the default take effect.<br>
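> > > > > > > > > > > <br>
> > > > > > > > > > > For example, something along these lines on the compute node (the tap name<br>
> > > > > > > > > > > here is made up; find the real one with virsh domiflist or ip link):<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > tc qdisc show dev tap1234abcd-56<br>
> > > > > > > > > > > sudo tc qdisc replace dev tap1234abcd-56 root pfifo_fast<br>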
> > > > > > > > > > > <br>
> > > > > > > > > > > The only other advice I can give, assuming kernel OVS is the only<br>
> > > > > > > > > > > option you have, is<br>
> > > > > > > > > > > to look at<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <a href="https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size</a><br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <a href="https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size</a><br>
> > > > > > > > > > > and<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <a href="https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled</a><br>
> > > > > > > > > > > <br>
> > > > > > > > > > > If the bottleneck is actually in qemu or the guest kernel rather<br>
> > > > > > > > > > > than OVS, adjusting the rx/tx queue size and<br>
> > > > > > > > > > > using multiqueue can help. It will have no effect if OVS is the<br>
> > > > > > > > > > > bottleneck.<br>
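> > > > > > > > > > > <br>
> > > > > > > > > > > Roughly, that ends up looking like this (the values are just examples and<br>
> > > > > > > > > > > the flavor name is made up):<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > # nova.conf on the compute nodes<br>
> > > > > > > > > > > [libvirt]<br>
> > > > > > > > > > > rx_queue_size = 1024<br>
> > > > > > > > > > > tx_queue_size = 1024<br>
> > > > > > > > > > > <br>
> > > > > > > > > > > # and enable multiqueue via the flavor extra spec (or image property)<br>
> > > > > > > > > > > openstack flavor set --property hw:vif_multiqueue_enabled=true my-flavor<br>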
> > > > > > > > > > > <br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <br>
> > > > > > > > > > I have set this option to 1024 and enabled multiqueue as well, but<br>
> > > > > > > > > > it did not help.<br>
> > > > > > > > > > <br>
> > > > > > > > > > <br>
> > > > > > > > > > > > <br>
> > > > > > > > > > > > On Wed, Sep 6, 2023 at 9:21 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>><br>
> > > > > > > > > > > wrote:<br>
> > > > > > > > > > > > <br>
> > > > > > > > > > > > > Hi Satish,<br>
> > > > > > > > > > > > > <br>
> > > > > > > > > > > > > Actually, our customer hits this issue when the tx/rx is above<br>
> > > > > > > > > > > > > only 40k pps.<br>
> > > > > > > > > > > > > So what is the throughput threshold for OVS?<br>
> > > > > > > > > > > > > <br>
> > > > > > > > > > > > > <br>
> > > > > > > > > > > > > Thanks and regards<br>
> > > > > > > > > > > > > <br>
> > > > > > > > > > > > > On Wed, 6 Sep 2023 at 20:19 Satish Patel <<br>
> > > > > > > > > > > <a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>
> > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > Hi,<br>
> > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > This is normal because OVS or LinuxBridge wires up VMs using a TAP interface,<br>
> > > > > > > > > > > > > > which runs in kernel space; that drives higher interrupt load and keeps<br>
> > > > > > > > > > > > > > the kernel very busy handling packets. Standard OVS/LinuxBridge<br>
> > > > > > > > > > > > > > is not meant for high PPS.<br>
> > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > If you want to handle higher PPS then look at a DPDK or<br>
> > > > > > > > > > > > > > SR-IOV deployment.<br>
> > > > > > > > > > > > > > (We are running everything on SR-IOV because of our high PPS<br>
> > > > > > > > > > > > > > requirement.)<br>
> > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > On Tue, Sep 5, 2023 at 11:11 AM Ha Noi <<br>
> > > > > > > > > > > <a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > Hi everyone,<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > I'm using OpenStack Train with Open vSwitch as the ML2 driver and GRE as the<br>
> > > > > > > > > > > > > > > tunnel type. I tested our network performance between two VMs and observed<br>
> > > > > > > > > > > > > > > packet loss as shown below.<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > VM1: IP: 10.20.1.206<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > VM2: IP: 10.20.1.154<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > VM3: IP: 10.20.1.72<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > Using iperf3 to testing performance between VM1 and VM2.<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > Run iperf3 client and server on both VMs.<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c<br>
> > > > > > > > > > > 10.20.1.206<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c<br>
> > > > > > > > > > > 10.20.1.154<br>
> > > > > > > > > > > > > > > <<a href="https://10.20.1.154/24" rel="noreferrer" target="_blank">https://10.20.1.154/24</a>><br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > Using VM3 to ping VM1, packets are lost and the latency is<br>
> > > > > > > > > > > > > > > quite high.<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > ping -i 0.1 10.20.1.206<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data.<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=1 ttl=64 time=7.70 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=2 ttl=64 time=6.90 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=3 ttl=64 time=7.71 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=4 ttl=64 time=7.98 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=6 ttl=64 time=8.58 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=7 ttl=64 time=8.34 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=8 ttl=64 time=8.09 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=10 ttl=64 time=4.57<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=11 ttl=64 time=8.74<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=12 ttl=64 time=9.37<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=14 ttl=64 time=9.59<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=15 ttl=64 time=7.97<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=16 ttl=64 time=8.72<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 64 bytes from <a href="http://10.20.1.206" rel="noreferrer" target="_blank">10.20.1.206</a>: icmp_seq=17 ttl=64 time=9.23<br>
> > > > > > > > > > > ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > ^C<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > --- 10.20.1.206 ping statistics ---<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > 34 packets transmitted, 28 received, 17.6471% packet<br>
> > > > > > > > > > > loss, time 3328ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > Does anyone else hit this issue?<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > > Please help me. Thanks<br>
> > > > > > > > > > > > > > > <br>
> > > > > > > > > > > > > > <br>
> > > > > > > > > > > <br>
> > > > > > > > > > > <br>
<br>
</blockquote></div>