Because DPDK requires DPDK support inside the guest VM, it's not suitable for general-purpose workloads. You need your guest VM network to support DPDK to get 100% throughput.
On Thu, Sep 7, 2023 at 8:06 AM Ha Noi <hanoi952022@gmail.com> wrote:
Hi Satish,
Why don't you use DPDK?
Thanks
On Thu, 7 Sep 2023 at 19:03 Satish Patel <satish.txt@gmail.com> wrote:
I totally agree with Sean on all his points, but trust me, I have tried everything possible: tuning the OS, the network stack, multi-queue, NUMA, CPU pinning, you name it. I didn't get any significant improvement. You may gain 2 to 5% with all those tweaks. I am running the entire workload on SR-IOV and life is happy, except there is no LACP bonding.
I am very interested in this project: https://docs.openvswitch.org/en/latest/intro/install/afxdp/
On Thu, Sep 7, 2023 at 6:07 AM Ha Noi <hanoi952022@gmail.com> wrote:
Dear Smoney,
On Thu, Sep 7, 2023 at 12:41 AM <smooney@redhat.com> wrote:
On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote:
Damn! We have noticed the same issue around 40k to 55k PPS. Trust me, nothing is wrong in your config. This is just a limitation of the software stack and the kernel itself.
It's partly determined by your CPU frequency. Kernel OVS of yesteryear could handle about 1 Mpps total on a ~4 GHz CPU, with per-port throughput being lower depending on what QoS/firewall rules were applied.
My CPU frequency is 3 GHz, and I am using a 2nd-generation Intel Gold CPU. I think the problem is in the tuning of the compute node itself. But I cannot find any guide or best practices for it.
Moving from the iptables firewall to the OVS firewall can help to some degree, but you're partly trading connection setup time for steady-state throughput, with the overhead of the connection tracker in OVS.
Using stateless security groups can help.
We also recently fixed a regression caused by changes in newer versions of OVS. This was notable in going from RHEL 8 to RHEL 9, where it literally reduced small-packet performance to 1/10th and jumbo frames to about 1/2. On master we have a config option that will set the default QoS on a port to linux-noop:
https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L12...
The backports are proposed upstream https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 and we have backported this downstream to address that performance regression. The upstream backport is semi-stalled just because we wanted to discuss whether we should make it opt in by default upstream while backporting, but it might be helpful for you if this is related to your current issues.
40-55 kpps is kind of low for kernel OVS, but if you have a low clock rate CPU, hybrid_plug + incorrect QoS, then I could see you hitting such a bottleneck.
One workaround, by the way, without the os-vif workaround backported, is to set /proc/sys/net/core/default_qdisc to not apply any QoS, or to a low-overhead QoS type, i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast
That may or may not help, but I would ensure that you are not using something like fq_codel or cake for net.core.default_qdisc, and if you are, try changing it to pfifo_fast and see if that helps.
There isn't much you can do about the CPU clock rate, but ^ is something you can try for free. Note it won't actually take effect on an existing VM if you just change the default, but you can also use tc to change the qdisc for testing. Hard rebooting the VM should also make the default take effect.
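For anyone who wants to script the check described above, here is a minimal Python sketch that reads the current default qdisc on a compute node and flags the two qdiscs named in this thread. The function names and the `COSTLY_QDISCS` set are hypothetical and illustrative only; the procfs path is the one mentioned above.

```python
from pathlib import Path

# Qdiscs the thread singles out as worth avoiding for this test.
# Assumption: this set is illustrative, not an exhaustive list.
COSTLY_QDISCS = {"fq_codel", "cake"}

# procfs file backing the net.core.default_qdisc sysctl.
DEFAULT_QDISC_PATH = "/proc/sys/net/core/default_qdisc"


def read_default_qdisc(path=DEFAULT_QDISC_PATH):
    """Return the qdisc name currently configured as the system default."""
    return Path(path).read_text().strip()


def needs_change(qdisc):
    """True if the qdisc is one of those flagged in the thread."""
    return qdisc in COSTLY_QDISCS


if __name__ == "__main__":
    if Path(DEFAULT_QDISC_PATH).exists():
        current = read_default_qdisc()
        verdict = "consider pfifo_fast" if needs_change(current) else "not one of the flagged qdiscs"
        print(f"net.core.default_qdisc = {current}: {verdict}")
    else:
        print("default_qdisc not found (not a Linux host?)")
```

This only reports the default; as noted above, existing VM ports keep their current qdisc until it is changed with tc or the VM is hard rebooted.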
The only other advice I can give, assuming kernel OVS is the only option you have, is to look at
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_...
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_... and
https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif...
If the bottleneck is actually in QEMU or the guest kernel rather than OVS, adjusting the rx/tx queue size and using multi-queue can help. It will have no effect if OVS is the bottleneck.
I have set this option to 1024 and enabled multiqueue as well. But it did not help.
On Wed, Sep 6, 2023 at 9:21 AM Ha Noi <hanoi952022@gmail.com> wrote:
Hi Satish,
Actually, our customer gets this issue when the tx/rx is above only 40k pps.
So what is the threshold of this throughput for OvS?
Thanks and regards
On Wed, 6 Sep 2023 at 20:19 Satish Patel <satish.txt@gmail.com> wrote:
> Hi,
>
> This is normal because OVS or LinuxBridge wire up VMs using TAP interface
> which runs on kernel space and that drives higher interrupt and that makes
> the kernel so busy working on handling packets. Standard OVS/LinuxBridge
> are not meant for higher PPS.
>
> If you want to handle higher PPS then look for DPDK or SRIOV deployment.
> ( We are running everything in SRIOV because of high PPS requirement)
>
> On Tue, Sep 5, 2023 at 11:11 AM Ha Noi <hanoi952022@gmail.com> wrote:
>
> > Hi everyone,
> >
> > I'm using Openstack Train and Openvswitch for ML2 driver and GRE for
> > tunnel type. I tested our network performance between two VMs and suffer
> > packet loss as below.
> >
> > VM1: IP: 10.20.1.206
> > VM2: IP: 10.20.1.154
> > VM3: IP: 10.20.1.72
> >
> > Using iperf3 to testing performance between VM1 and VM2.
> >
> > Run iperf3 client and server on both VMs.
> >
> > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206
> >
> > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154
> >
> > Using VM3 ping into VM1, then the packet is lost and the latency is
> > quite high.
> >
> > ping -i 0.1 10.20.1.206
> >
> > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data.
> > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms
> > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms
> > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms
> > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms
> > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms
> > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms
> > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms
> > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms
> > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms
> > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms
> > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms
> > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms
> > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms
> > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms
> > ^C
> > --- 10.20.1.206 ping statistics ---
> > 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms
> > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms
> >
> > Does any one get this issue ?
> >
> > Please help me. Thanks
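To relate the iperf3 parameters in the original report to the PPS figures discussed in this thread, the offered load can be worked out roughly as below. This is a sketch under stated assumptions: that `-b 130M` means 130 * 10^6 bit/s applied per stream when `-P` is used (as the iperf3 man page describes), and that `-l 442` is the UDP payload size with headers not counted. The helper names are hypothetical.

```python
def udp_pps(bitrate_bps, payload_bytes, streams=1):
    """Packets per second offered by UDP streams sending at a target bitrate.

    Assumes the bitrate is per stream and counts payload bytes only.
    """
    return streams * bitrate_bps / (payload_bytes * 8)


def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    return 100.0 * (sent - received) / sent


# Values taken from the iperf3 command line: -b 130M -l 442 -P 6
per_stream = udp_pps(130e6, 442)
per_direction = udp_pps(130e6, 442, streams=6)
print(round(per_stream))      # 36765
print(round(per_direction))   # 220588

# Values taken from the ping summary: 34 transmitted, 28 received
print(round(loss_percent(34, 28), 4))  # 17.6471
```

Under those assumptions each VM is asked to send roughly 220 kpps, which is the target rate rather than what the path actually delivered; the loss figure reproduces the 17.6471% printed by ping.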