OVS-DPDK poor performance with Intel 82599

Satish Patel satish.txt at gmail.com
Fri Nov 27 23:19:47 UTC 2020


Sean,

Here is the full list of requested output:
http://paste.openstack.org/show/800515/

In the above output I am noticing ovs_tx_failure_drops=38000, and at the
same time Trex is also showing the same number of drops in its
results. I will try to dig into the ARP flooding; you are saying the
flooding will happen inside the OVS switch, right? Is there any command
or any specific thing I should look for?

On Fri, Nov 27, 2020 at 8:49 AM Sean Mooney <smooney at redhat.com> wrote:
>
> On Thu, 2020-11-26 at 22:10 -0500, Satish Patel wrote:
> > Sean,
> >
> > Let me say "Happy Thanksgiving to you and your family". Thank you for
> > taking the time to reply; for the last 2 days I was trying to find you on IRC
> > to discuss this issue. Let me explain what I have done so far.
> >
> > * First I did load testing on my bare-metal compute node to see how
> > far my Trex setup can go, and I found it hit 2 million packets per second. (Not
> > sure if this is a good result or not, but it proves that I can hit at
> > least 1 million pps.)
> For 64-byte packets on that NIC it should be hitting about 11mpps on one core.
> That said, I have not validated that in a year or two, but it could easily saturate 10G line rate with 64B
> packets with 1 core in the past.
> >
> > * Then I created an SR-IOV VM on that compute node (8 vCPU / 8 GB mem)
> > and re-ran Trex; my max result was 323 kpps without dropping
> > packets. (I found the Intel 82599 NIC VF only supports 2 rx/tx queues, and
> > that could be the bottleneck.)
> A VF can fully saturate the NIC and hit 14.4 mpps if your CPU clock rate is fast enough,
>
> i.e. >3.2-3.5GHz. On a 2.5GHz part you probably won't hit that with 1 core, but you should get >10mpps.
>
> >
> > * Finally I decided to build a DPDK VM on it and see how Trex behaved, and
> > I found it hit a max of ~400 kpps with 4 PMD cores. (A little better
> > than SR-IOV, because now I have 4 rx/tx queues thanks to the 4 PMD cores.)
>
> Yeah, those numbers are far too low for a correctly compiled and functioning Trex binary.
>
> >
> > For the Trex load test I statically assigned ARP entries, because using
> > static ARP is part of the Trex process.
> >
> That won't work properly. If you do that, the OVS bridge will not have its MAC learning
> table populated, so it will flood packets.
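> For example, you can check whether the bridges have actually learned the generator and VM MACs
> (fdb/show is a standard ovs-appctl command; the bridge names here are taken from your dpif/show output below):
>
> sudo ovs-appctl fdb/show br-vlan
> sudo ovs-appctl fdb/show br-int
>
> If the source/destination MACs of your test traffic are missing from those tables, the NORMAL action
> will flood every packet out all ports on the bridge.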
> > You are saying it should hit
> > 11 million pps, but the question is what tools you are using to hit that
> > number; I didn't see anyone using Trex for DPDK testing, most people are
> > using testpmd.
> Trex is a traffic generator originally created by Cisco, I think.
> It is often used in combination with testpmd. testpmd was designed to test the poll
> mode drivers, as the name implies, but it is also used as a replacement for a device/application
> under test, to measure the low-level performance in l2/l3 forwarding modes or basic MAC swap mode.
>
>
> >
> > What kind of VM (vCPU/memory) are people using to reach 11 million
> > pps?
> >
>
> 2-4 vCPUs with 2G of RAM.
> If DPDK is compiled properly and functioning, you don't need a lot of cores, although you will need
> to use CPU pinning and hugepages for the VM, and within the VM you will also need hugepages if you are using DPDK there too.
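> For context, on the OpenStack side that normally comes down to flavor extra specs along these lines
> (hw:cpu_policy and hw:mem_page_size are the standard nova extra specs; the flavor name is just an example):
>
> openstack flavor set dpdk.test --property hw:cpu_policy=dedicated --property hw:mem_page_size=large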
>
> >  I am sticking to 8 vCPUs because the majority of my servers use an 8-core VM
> > size, so I am trying to get the most performance out of it.
> >
> > If you have your load-test scenario or tools available, then
> > please share some information so I can try to mimic that in my
> > environment. Thank you for the reply.
>
> I think you need to start with getting Trex to actually hit 10G line rate with small packets.
> As I said, you should not need more than about 2 cores to do that, and 1-2G of hugepages.
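> As a rough sketch, a baseline run on bare metal would look something like this (the profile name is just
> a placeholder for whatever 64B UDP profile you are using, and the -m multiplier is an example;
> -f/-c/-m/-d are the standard t-rex-64 options):
>
> ./t-rex-64 -f cap2/udp_64B.yaml -c 2 -m 10 -d 60
>
> If that doesn't get close to line rate on its own, the problem is in the Trex build or host tuning,
> not in OVS-DPDK.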
>
> Once you have that working you can move on to the rest, but you need to ensure proper MAC learning happens and ARPs are sent and replied
> to before starting the traffic generator, so that flooding does not happen.
> Can you also provide the output of
>
> sudo ovs-vsctl list Open_vSwitch .
> and the output of
> sudo ovs-vsctl show, sudo ovs-vsctl list bridge, sudo ovs-vsctl list port, and sudo ovs-vsctl list interface?
>
> I just want to confirm that you have properly configured OVS to use DPDK.
>
> I don't work with DPDK that often any more, but I generally used testpmd in the guest with an Ixia hardware traffic generator
> to do performance measurements. I have used Trex and it can hit line rate, so I'm not sure why you are seeing such low performance.
>
> >
> > ~S
> >
> >
> > On Thu, Nov 26, 2020 at 8:14 PM Sean Mooney <smooney at redhat.com> wrote:
> > >
> > > On Thu, 2020-11-26 at 16:56 -0500, Satish Patel wrote:
> > > > Folks,
> > > >
> > > > I am playing with DPDK on my OpenStack deployment with the NIC model 82599 and seeing
> > > > poor performance. I may be wrong with my numbers, so I want to see what
> > > > the community thinks about these results.
> > > >
> > > > Compute node hardware:
> > > >
> > > > CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
> > > > Memory: 64G
> > > > NIC: Intel 82599 (dual 10G port)
> > > >
> > > > [root at compute-lxb-3 ~]# ovs-vswitchd --version
> > > > ovs-vswitchd (Open vSwitch) 2.13.2
> > > > DPDK 19.11.3
> > > >
> > > > VM dpdk (DUT):
> > > > 8vCPU / 8GB memory
> > > >
> > > > I have configured my compute node with all the best practices available on
> > > > the internet to get more performance out of it.
> > > >
> > > > 1. Used isolcpus to isolate CPUs
> > > > 2. 4 dedicated cores for PMD (see the command sketch below)
> > > > 3. echo isolated_cores=1,9,25,33 >> /etc/tuned/cpu-partitioning-variables.conf
> > > > 4. Huge pages
> > > > 5. CPU pinning for VM
> > > > 6. Increased rx queues ( ovs-vsctl set interface dpdk-1 options:n_rxq=4 )
> > > > 7. VM virtio_ring = 1024
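> > > > For reference, the OVS-DPDK settings that go with items 2, 4, and 6 would look roughly like the
> > > > following (the pmd-cpu-mask value assumes the 4 PMD cores are 1,9,25,33 from item 3, and the
> > > > socket-mem value is just an example for a single NUMA node):
> > > >
> > > > ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
> > > > ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="4096,0"
> > > > ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x202000202
> > > > ovs-vsctl set Interface dpdk-1 options:n_rxq=4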
> > > >
> > > > After doing all of the above I am getting the following result using the Trex
> > > > packet generator with a 64B UDP stream (Total-PPS       :     391.93
> > > > Kpps). Do you think that is an acceptable result, or should it be higher
> > > > on this NIC model?
> > > That is one of Intel's oldest-generation 10G NICs that is supported by DPDK,
> > >
> > > but it should still get to about 11 million packets per second with 1-2 cores.
> > >
> > > My guess would be that the VM or traffic generator is not sending and receiving MAC learning
> > > frames like ARP properly, and as a result the packets are flooding, which will severely
> > > reduce performance.
> > > >
> > > > On the internet folks say it should be a million packets per second, so
> > > > I am not sure how those people reached that number, or whether I am missing
> > > > something in my load-test profile.
> > >
> > > Even kernel OVS will break a million packets per second, so 400Kpps is far too low.
> > > Something is misconfigured, but I'm not sure what specifically from what you have shared.
> > > As I said, my best guess would be that the packets are flooding because the VM is not
> > > responding to ARP, and so the NORMAL action does not learn the MAC address.
> > >
> > > You could rule that out by adding hardcoded rules, but you could also check the flow tables to confirm.
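> > > A rough sketch of what I mean (dpctl/dump-flows and ovs-ofctl dump-flows are standard commands;
> > > the bridge names are from your dpif/show output):
> > >
> > > sudo ovs-appctl dpctl/dump-flows
> > > sudo ovs-ofctl dump-flows br-vlan
> > > sudo ovs-ofctl dump-flows br-int
> > >
> > > Flooded traffic shows up as datapath flows whose actions list several output ports instead of a single one.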
> > > >
> > > > Note: I am using 8 vCPU cores on the VM; do you think adding more cores will
> > > > help, or should I add more PMD threads?
> > > >
> > > > Cpu Utilization : 2.2  %  1.8 Gb/core
> > > >  Platform_factor : 1.0
> > > >  Total-Tx        :     200.67 Mbps
> > > >  Total-Rx        :     200.67 Mbps
> > > >  Total-PPS       :     391.93 Kpps
> > > >  Total-CPS       :     391.89 Kcps
> > > >
> > > >  Expected-PPS    :     700.00 Kpps
> > > >  Expected-CPS    :     700.00 Kcps
> > > >  Expected-BPS    :     358.40 Mbps
> > > >
> > > >
> > > > This is all of my configuration:
> > > >
> > > > grub.conf:
> > > > GRUB_CMDLINE_LINUX="vmalloc=384M crashkernel=auto
> > > > rd.lvm.lv=rootvg01/lv01 console=ttyS1,118200 rhgb quiet intel_iommu=on
> > > > iommu=pt spectre_v2=off nopti pti=off nospec_store_bypass_disable
> > > > spec_store_bypass_disable=off l1tf=off default_hugepagesz=1GB
> > > > hugepagesz=1G hugepages=60 transparent_hugepage=never selinux=0
> > > > isolcpus=2,3,4,5,6,7,10,11,12,13,14,15,26,27,28,29,30,31,34,35,36,37,38,39"
> > > >
> > > >
> > > > [root at compute-lxb-3 ~]# ovs-appctl dpif/show
> > > > netdev at ovs-netdev: hit:605860720 missed:2129
> > > >   br-int:
> > > >     br-int 65534/3: (tap)
> > > >     int-br-vlan 1/none: (patch: peer=phy-br-vlan)
> > > >     patch-tun 2/none: (patch: peer=patch-int)
> > > >     vhu1d64ea7d-d9 5/6: (dpdkvhostuserclient: configured_rx_queues=8,
> > > > configured_tx_queues=8, mtu=1500, requested_rx_queues=8,
> > > > requested_tx_queues=8)
> > > >     vhu9c32faf6-ac 6/7: (dpdkvhostuserclient: configured_rx_queues=8,
> > > > configured_tx_queues=8, mtu=1500, requested_rx_queues=8,
> > > > requested_tx_queues=8)
> > > >   br-tun:
> > > >     br-tun 65534/4: (tap)
> > > >     patch-int 1/none: (patch: peer=patch-tun)
> > > >     vxlan-0a410071 2/5: (vxlan: egress_pkt_mark=0, key=flow,
> > > > local_ip=10.65.0.114, remote_ip=10.65.0.113)
> > > >   br-vlan:
> > > >     br-vlan 65534/1: (tap)
> > > >     dpdk-1 2/2: (dpdk: configured_rx_queues=4,
> > > > configured_rxq_descriptors=2048, configured_tx_queues=5,
> > > > configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500,
> > > > requested_rx_queues=4, requested_rxq_descriptors=2048,
> > > > requested_tx_queues=5, requested_txq_descriptors=2048,
> > > > rx_csum_offload=true, tx_tso_offload=false)
> > > >     phy-br-vlan 1/none: (patch: peer=int-br-vlan)
> > > >
> > > >
> > > > [root at compute-lxb-3 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
> > > > pmd thread numa_id 0 core_id 1:
> > > >   isolated : false
> > > >   port: dpdk-1            queue-id:  0 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  3 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  4 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  3 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  4 (enabled)   pmd usage:  0 %
> > > > pmd thread numa_id 0 core_id 9:
> > > >   isolated : false
> > > >   port: dpdk-1            queue-id:  1 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  2 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  5 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  2 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  5 (enabled)   pmd usage:  0 %
> > > > pmd thread numa_id 0 core_id 25:
> > > >   isolated : false
> > > >   port: dpdk-1            queue-id:  3 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  0 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  7 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  0 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  7 (enabled)   pmd usage:  0 %
> > > > pmd thread numa_id 0 core_id 33:
> > > >   isolated : false
> > > >   port: dpdk-1            queue-id:  2 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  1 (enabled)   pmd usage:  0 %
> > > >   port: vhu1d64ea7d-d9    queue-id:  6 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  1 (enabled)   pmd usage:  0 %
> > > >   port: vhu9c32faf6-ac    queue-id:  6 (enabled)   pmd usage:  0 %
> > > >
> > >
> > >
> > >
> >
>
>


