<div dir="ltr">I would say let's run your same benchmark with OVS-DPDK and tell me if you see better performance. I doubt you will see a significant performance boost, but let's see. Please prove me wrong :) </div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 7, 2023 at 9:45 PM Ha Noi <<a href="mailto:hanoi952022@gmail.com">hanoi952022@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Satish,<div><br></div><div>Actually, the guest interface is not using tap anymore.</div><div><br></div><div> <interface type='vhostuser'><br> <mac address='fa:16:3e:76:77:dd'/><br> <source type='unix' path='/var/run/openvswitch/vhu3766ee8a-86' mode='server'/><br> <target dev='vhu3766ee8a-86'/><br> <model type='virtio'/><br> <alias name='net0'/><br> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/><br> </interface><br></div><div><br></div><div>Does it totally bypass the kernel stack?</div><div><br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 8, 2023 at 5:02 AM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I did test OVS-DPDK, and it helps offload packet processing on the compute nodes. But what about the VMs? They will still use a tap interface to attach to the compute node, and the bottleneck will be in the VM. I strongly believe that we have to run a DPDK-based guest to bypass the kernel stack. <div><br></div><div>I would love to hear from other people if I am missing something here. 
</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 7, 2023 at 5:27 PM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Oh. I heard from someone on Reddit that OVS-DPDK is transparent to the user?</div><div dir="auto"><br></div><div dir="auto">So that's not correct?</div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 7 Sep 2023 at 22:13 Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Because DPDK requires DPDK support inside the guest VM. It's not suitable for general-purpose workloads. You need your guest VM's network stack to support DPDK to get 100% throughput. 
</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 7, 2023 at 8:06 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Hi Satish,</div><div dir="auto"><br></div><div dir="auto">Why don't you use DPDK?</div><div dir="auto"><br></div><div dir="auto">Thanks </div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 7 Sep 2023 at 19:03 Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I totally agree with Sean on all his points, but trust me, I have tried everything possible to tune the OS, network stack, multi-queue, NUMA, CPU pinning, you name it, and I didn't get any significant improvement. You may gain 2 to 5% with all those tweaks. I am running the entire workload on SR-IOV and life is happy, except there is no LACP bonding. 
<div><br></div><div>I am very interested in this project <a href="https://docs.openvswitch.org/en/latest/intro/install/afxdp/" target="_blank">https://docs.openvswitch.org/en/latest/intro/install/afxdp/</a> </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 7, 2023 at 6:07 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">Dear Smoney, <div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 7, 2023 at 12:41 AM <<a href="mailto:smooney@redhat.com" target="_blank">smooney@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote:<br>
> Damn! We have noticed the same issue around 40k to 55k PPS. Trust me<br>
> nothing is wrong in your config. This is just a limitation of the software<br>
> stack and kernel itself.<br>
it's partly determined by your cpu frequency.<br>
kernel ovs of yesteryear could handle about 1 Mpps total on a ~4 GHz<br>
cpu, with per-port throughput being lower depending on what qos/firewall<br>
rules were applied.<br>
<br></blockquote><div><br></div><div><br></div><div>My CPU frequency is 3 GHz, and I'm using a 2nd-generation Intel Gold CPU. I think the problem is in the tuning inside the compute node, but I cannot find any guide or best practices for it.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
moving from the iptables firewall to the ovs firewall can help to some degree,<br>
but you're partly trading connection setup time for steady-state throughput<br>
with the overhead of the connection tracker in ovs.<br>
<br>
using stateless security groups can help<br>
<br>
we also recently fixed a regression caused by changes in newer versions of ovs.<br>
this was notable in going from rhel 8 to rhel 9, where it literally reduced<br>
small packet performance to 1/10th and jumbo frames to about 1/2.<br>
on master we have a config option that will set the default qos on a port to linux-noop<br>
<a href="https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125" rel="noreferrer" target="_blank">https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125</a><br>
<br>
the backports are proposed upstream <a href="https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43" rel="noreferrer" target="_blank">https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43</a><br>
and we have backported this downstream to address that performance regression.<br>
the upstream backport is semi-stalled just because we wanted to discuss if we should make it opt in<br>
by default upstream while backporting, but it might be helpful for you if this is related to your current<br>
issues.<br>
<br>
40-55 kpps is kind of low for kernel ovs, but if you have a low clock rate cpu, hybrid_plug + incorrect qos<br>
then i could see you hitting such a bottleneck.<br>
<br>
one workaround by the way without the os-vif workaround backported is to set <br>
/proc/sys/net/core/default_qdisc to not apply any qos or a low overhead qos type<br>
i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast<br>
<br></blockquote><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
that may or may not help, but i would ensure that you are not using something like fq_codel or cake<br>
for net.core.default_qdisc, and if you are, try changing it to pfifo_fast and see if that helps.<br>
<br>
there isn't much you can do about the cpu clock rate, but ^ is something you can try for free.<br>
note it won't actually take effect on an existing vm if you just change the default, but you can use<br>
tc to also change the qdisc for testing. hard rebooting the vm should also make the default take effect.<br>
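<br>
A minimal sketch of the check described above, in Python: it reads net.core.default_qdisc from /proc and flags the two qdiscs named in this thread. The helper names and the HEAVY_QDISCS set are assumptions made for illustration; they are not part of os-vif, nova or ovs.<br>
<pre>
```python
from pathlib import Path

# Assumption: only the two qdiscs named in the thread are flagged.
HEAVY_QDISCS = {"fq_codel", "cake"}
DEFAULT_QDISC_PATH = Path("/proc/sys/net/core/default_qdisc")


def read_default_qdisc(path: Path = DEFAULT_QDISC_PATH):
    """Return the host's default qdisc name, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def advice(qdisc):
    """Map a qdisc name to a short hint (hypothetical helper)."""
    if qdisc is None:
        return "could not read net.core.default_qdisc"
    if qdisc in HEAVY_QDISCS:
        return f"{qdisc}: try pfifo_fast and compare"
    return f"{qdisc}: not one of the qdiscs flagged in the thread"


if __name__ == "__main__":
    print(advice(read_default_qdisc()))
```
</pre>
This only reports the host default; as noted above, an existing vm keeps its current qdisc until it is changed with tc or the vm is hard rebooted.<br>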
<br>
the only other advice i can give assuming kernel ovs is the only option you have is<br>
<br>
to look at<br>
<a href="https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size</a><br>
<a href="https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size</a><br>
and<br>
<a href="https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled</a><br>
<br>
if the bottleneck is actually in qemu or the guest kernel rather than ovs, adjusting the rx/tx queue size and<br>
using multi queue can help. it will have no effect if ovs is the bottleneck.<br>
<br>
<br></blockquote><div><br></div><div>I have set this option to 1024 and enabled multiqueue as well, but it did not help.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> <br>
> On Wed, Sep 6, 2023 at 9:21 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> <br>
> > Hi Satish,<br>
> > <br>
> > Actually, our customer gets this issue when the tx/rx is only above 40k pps.<br>
> > So what is the throughput threshold for OVS?<br>
> > <br>
> > <br>
> > Thanks and regards<br>
> > <br>
> > On Wed, 6 Sep 2023 at 20:19 Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>
> > <br>
> > > Hi,<br>
> > > <br>
> > > This is normal because OVS or LinuxBridge wires up VMs using a TAP interface,<br>
> > > which runs in kernel space. That drives a higher interrupt rate and keeps<br>
> > > the kernel busy handling packets. Standard OVS/LinuxBridge<br>
> > > are not meant for higher PPS.<br>
> > > <br>
> > > If you want to handle higher PPS then look for DPDK or SRIOV deployment.<br>
> > > ( We are running everything in SRIOV because of high PPS requirement)<br>
> > > <br>
> > > On Tue, Sep 5, 2023 at 11:11 AM Ha Noi <<a href="mailto:hanoi952022@gmail.com" target="_blank">hanoi952022@gmail.com</a>> wrote:<br>
> > > <br>
> > > > Hi everyone,<br>
> > > > <br>
> > > > I'm using OpenStack Train with Open vSwitch as the ML2 driver and GRE as the<br>
> > > > tunnel type. I tested our network performance between two VMs and saw<br>
> > > > packet loss as below.<br>
> > > > <br>
> > > > VM1: IP: 10.20.1.206<br>
> > > > <br>
> > > > VM2: IP: 10.20.1.154<br>
> > > > <br>
> > > > VM3: IP: 10.20.1.72<br>
> > > > <br>
> > > > <br>
> > > > Using iperf3 to test performance between VM1 and VM2.<br>
> > > > <br>
> > > > Run iperf3 client and server on both VMs.<br>
> > > > <br>
> > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206<br>
> > > > <br>
> > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154<br>
> > > > <br>
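<br>
For reference, the packet rate these iperf3 commands try to generate can be estimated from their flags. This is a rough sketch, assuming -b applies to each of the -P streams separately, that the M suffix means 10^6 bits per second, and that each datagram carries exactly -l bytes of payload; the function name is hypothetical, and the rate actually achieved may be lower.<br>
<pre>
```python
def offered_pps(bitrate_bps: float, payload_bytes: int, streams: int) -> float:
    """Packets per second iperf3 tries to send, one datagram per payload."""
    per_stream = bitrate_bps / (payload_bytes * 8)
    return per_stream * streams


if __name__ == "__main__":
    # Values from the commands in the thread: -b 130M -l 442 -P 6
    pps = offered_pps(130e6, 442, 6)
    print(f"offered load per direction: {pps:,.0f} pps")
```
</pre>
Under those assumptions each sender targets roughly 220k pps, so it is worth comparing that figure with the rate the receiver actually reports.<br>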
> > > > <br>
> > > > Pinging VM1 from VM3 then shows packet loss and quite high<br>
> > > > latency.<br>
> > > > <br>
> > > > <br>
> > > > ping -i 0.1 10.20.1.206<br>
> > > > <br>
> > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data.<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms<br>
> > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms<br>
> > > > ^C<br>
> > > > --- 10.20.1.206 ping statistics ---<br>
> > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms<br>
> > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms<br>
> > > > <br>
> > > > <br>
> > > > <br>
> > > > Has anyone else run into this issue?<br>
> > > > <br>
> > > > Please help me. Thanks<br>
> > > > <br>
> > > <br>
<br>
</blockquote></div></div>
</blockquote></div>
</blockquote></div></div>
</blockquote></div>
</blockquote></div></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>