OVS-DPDK poor performance with Intel 82599

Sean Mooney smooney at redhat.com
Fri Nov 27 13:49:01 UTC 2020


On Thu, 2020-11-26 at 22:10 -0500, Satish Patel wrote:
> Sean,
> 
> Let me say "Happy Thanksgiving to you and your family". Thank you for
> taking the time to reply; for the last 2 days I was trying to find you on IRC
> to discuss this issue. Let me explain what I have done so far.
> 
> * First I did load-testing on my bare metal compute node to see how
> far my Trex can go, and I found it hit 2 million packets per second. (Not
> sure if this is a good result or not, but it proves that I can hit at
> least 1 million pps.)
For 64-byte packets on that NIC it should be hitting about 11 Mpps on one core.
That said, I have not validated that in a year or two, but in the past it could easily
saturate 10G line rate with 64B packets using one core.
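For reference, the theoretical 10GbE line rate for 64-byte frames works out as follows (a quick back-of-the-envelope sketch, nothing here is specific to your setup):

```shell
# 10GbE line rate for 64-byte frames.  Each frame occupies an extra
# 20 bytes on the wire: 7B preamble + 1B start-of-frame delimiter
# + 12B inter-frame gap, so a 64B frame takes 84B of line time.
frame_bytes=64
wire_overhead=20
link_bps=10000000000
pps=$(( link_bps / ( (frame_bytes + wire_overhead) * 8 ) ))
echo "${pps} pps"    # ~14.88 Mpps, the usual 10G small-packet figure
```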
> 
> * Then I created an SR-IOV VM on that compute node (8 vCPU / 8 GB mem)
> and I re-ran Trex, and my max result was 323 kpps without dropping
> packets. (I found the Intel 82599 NIC VF only supports 2 rx/tx queues, and
> that could be the bottleneck.)
A VF can fully saturate the NIC and hit ~14.88 Mpps (10G line rate for 64-byte
frames) if your CPU clock rate is fast enough, i.e. >3.2-3.5 GHz. On a 2.5 GHz
CPU you probably won't hit that with one core, but you should get >10 Mpps.

> 
> * Finally I decided to build a DPDK VM on it and see how Trex behaved,
> and I found it hit a max of ~400 kpps with 4 PMD cores (a little better
> than SR-IOV, because now I have 4 rx/tx queues thanks to the 4 PMD cores).

Yeah, these numbers are far too low for a correctly compiled and functioning TRex binary.

> 
> For the Trex load-test I statically assigned ARP entries, because it's
> part of the Trex process to use static ARP.
> 
That won't work properly: if you do that, the OVS bridge will not have its MAC learning
table populated, so it will flood packets.
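One way to check whether flooding is actually happening (a sketch; the bridge name br-vlan is taken from your `ovs-appctl dpif/show` output, adjust as needed):

```shell
# If MAC learning failed, the datapath flows for unicast traffic will
# show multiple output ports in their actions, e.g. "actions:2,3,5"
# instead of a single learned port.
sudo ovs-appctl dpctl/dump-flows | grep 'actions:'

# The MAC learning (FDB) table of the bridge should contain the
# source and destination MACs of your test traffic:
sudo ovs-appctl fdb/show br-vlan
```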
> You are saying it should hit
> 11 million pps, but the question is what tools you are using to hit that
> number. I didn't see anyone using Trex for DPDK testing; most people are
> using testpmd.
TRex is a traffic generator originally created by Cisco, I think.
It is often used in combination with testpmd. testpmd was designed to test the poll
mode drivers, as the name implies, but it is also used as a replacement for a device/application
under test, to measure low-level performance in L2/L3 forwarding modes or basic MAC swap mode.
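For illustration, a minimal testpmd invocation in MAC-swap mode might look like this (a sketch assuming DPDK 19.11, where the binary is named `testpmd`, and assuming the guest's virtio ports are already bound to a DPDK-compatible driver; the core and memory numbers are placeholders):

```shell
# Run testpmd as the device under test: in macswap mode it swaps the
# source and destination MACs of every received packet and sends it
# back out, which also keeps the switch's MAC learning tables populated.
testpmd -l 0-1 -n 4 --socket-mem 1024 -- \
    --forward-mode=macswap \
    --auto-start
```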


> 
> What kind of VM (vCPU/memory) are people using to reach 11 million
> pps?
> 

2-4 vCPUs with 2G of RAM.
If DPDK is compiled properly and functioning, you don't need a lot of cores, although you will need
to use CPU pinning and hugepages for the VM, and within the VM you will also need hugepages if you are using DPDK there too.
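To verify hugepages on both the host and inside the guest, something like:

```shell
# Confirm hugepages are allocated and still free; a DPDK application
# that cannot reserve its hugepages will fail at EAL init or fall back
# to a badly performing configuration.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo
```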

>  I am sticking to 8 vCPUs because the majority of my servers use an 8-core VM
> size, so I'm trying to get the most performance out of it.
> 
> If you have your load-test scenario or tools available, then
> please share some information so I can try to mimic that in my
> environment. Thank you for the reply.

I think you need to start by getting TRex to actually hit 10G line rate with small packets.
As I said, you should not need more than about 2 cores and 1-2 G of hugepages to do that.

Once you have that working you can move on to the rest, but you need to ensure proper MAC learning happens, and that ARPs are sent and replied
to before starting the traffic generator, so that flooding does not happen.
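A simple way to prime MAC learning before starting the run (the address below is a placeholder for your traffic profile's endpoint):

```shell
# Ping each traffic endpoint from the other side before starting TRex;
# the ARP exchange lets every bridge on the path learn the MACs, so the
# unicast test traffic is forwarded to a single port instead of flooded.
ping -c 3 10.0.0.2    # placeholder: one of your stream's destination IPs
```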
Can you also provide the output of

sudo ovs-vsctl list Open_vSwitch .

and the output of

sudo ovs-vsctl show
sudo ovs-vsctl list bridge
sudo ovs-vsctl list port
sudo ovs-vsctl list interface

I just want to confirm that you have properly configured OVS-DPDK to use DPDK.
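In particular, these should confirm that DPDK is actually initialised (a quick sketch, not specific to your deployment):

```shell
# dpdk_initialized is true only when ovs-vswitchd successfully set up
# DPDK; a DPDK-enabled build's version string also reports the linked
# DPDK release.
sudo ovs-vsctl get Open_vSwitch . dpdk_initialized
sudo ovs-vsctl get Open_vSwitch . other_config:dpdk-init
ovs-vswitchd --version
```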

I don't work with DPDK that often any more, but I generally used testpmd in the guest with an Ixia hardware traffic generator
to do performance measurements. I have used TRex and it can hit line rate, so I'm not sure why you are seeing such low performance.

> 
> ~S
> 
> 
> On Thu, Nov 26, 2020 at 8:14 PM Sean Mooney <smooney at redhat.com> wrote:
> > 
> > On Thu, 2020-11-26 at 16:56 -0500, Satish Patel wrote:
> > > Folks,
> > > 
> > > I am playing with DPDK on my OpenStack with NIC model 82599 and seeing
> > > poor performance. I may be wrong with my numbers, so I want to see what
> > > the community thinks about these results.
> > > 
> > > Compute node hardware:
> > > 
> > > CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
> > > Memory: 64G
> > > NIC: Intel 82599 (dual 10G port)
> > > 
> > > [root at compute-lxb-3 ~]# ovs-vswitchd --version
> > > ovs-vswitchd (Open vSwitch) 2.13.2
> > > DPDK 19.11.3
> > > 
> > > VM dpdk (DUT):
> > > 8vCPU / 8GB memory
> > > 
> > > I have configured my compute node with all the best practices available on
> > > the internet to get more performance out.
> > > 
> > > 1. Used isolcpus to isolate CPUs
> > > 2. 4 dedicated core for PMD
> > > 3. echo isolated_cores=1,9,25,33 >> /etc/tuned/cpu-partitioning-variables.conf
> > > 4. Huge pages
> > > 5. CPU pinning for VM
> > > 6. increase  ( ovs-vsctl set interface dpdk-1 options:n_rxq=4 )
> > > 7. VM virtio_ring = 1024
> > > 
> > > After doing all above I am getting the following result using the Trex
> > > packet generator using 64B UDP stream (Total-PPS       :     391.93
> > > Kpps)  Do you think it's an acceptable result or should it be higher
> > > on these NIC models?
> > That is one of Intel's oldest generation 10G NICs supported by DPDK,
> > 
> > but it should still get to about 11 million packets per second with 1-2 cores.
> > 
> > My guess would be that the VM or traffic generator is not sending and receiving MAC learning
> > frames like ARP properly, and as a result the packets are flooding, which will severely
> > reduce performance.
> > > 
> > > On the internet folks say it should be a million packets per second, so
> > > I am not sure how those people got there, or whether I am missing
> > > something in my load test profile.
> > 
> > Even kernel OVS will break a million packets per second, so 400 Kpps is far too low.
> > There is something misconfigured, but I'm not sure what specifically from what you have shared.
> > As I said, my best guess would be that the packets are flooding because the VM is not
> > responding to ARP, and the normal action is then not to learn the MAC address.
> > 
> > You could rule that out by adding hardcoded rules, but you could also check the flow tables to confirm.
> > > 
> > > Note: I am using 8 vCPU cores on the VM; do you think adding more cores will
> > > help? Or should I add more PMDs?
> > > 
> > > Cpu Utilization : 2.2  %  1.8 Gb/core
> > >  Platform_factor : 1.0
> > >  Total-Tx        :     200.67 Mbps
> > >  Total-Rx        :     200.67 Mbps
> > >  Total-PPS       :     391.93 Kpps
> > >  Total-CPS       :     391.89 Kcps
> > > 
> > >  Expected-PPS    :     700.00 Kpps
> > >  Expected-CPS    :     700.00 Kcps
> > >  Expected-BPS    :     358.40 Mbps
> > > 
> > > 
> > > This is my all configuration:
> > > 
> > > grub.conf:
> > > GRUB_CMDLINE_LINUX="vmalloc=384M crashkernel=auto
> > > rd.lvm.lv=rootvg01/lv01 console=ttyS1,118200 rhgb quiet intel_iommu=on
> > > iommu=pt spectre_v2=off nopti pti=off nospec_store_bypass_disable
> > > spec_store_bypass_disable=off l1tf=off default_hugepagesz=1GB
> > > hugepagesz=1G hugepages=60 transparent_hugepage=never selinux=0
> > > isolcpus=2,3,4,5,6,7,10,11,12,13,14,15,26,27,28,29,30,31,34,35,36,37,38,39"
> > > 
> > > 
> > > [root at compute-lxb-3 ~]# ovs-appctl dpif/show
> > > netdev at ovs-netdev: hit:605860720 missed:2129
> > >   br-int:
> > >     br-int 65534/3: (tap)
> > >     int-br-vlan 1/none: (patch: peer=phy-br-vlan)
> > >     patch-tun 2/none: (patch: peer=patch-int)
> > >     vhu1d64ea7d-d9 5/6: (dpdkvhostuserclient: configured_rx_queues=8,
> > > configured_tx_queues=8, mtu=1500, requested_rx_queues=8,
> > > requested_tx_queues=8)
> > >     vhu9c32faf6-ac 6/7: (dpdkvhostuserclient: configured_rx_queues=8,
> > > configured_tx_queues=8, mtu=1500, requested_rx_queues=8,
> > > requested_tx_queues=8)
> > >   br-tun:
> > >     br-tun 65534/4: (tap)
> > >     patch-int 1/none: (patch: peer=patch-tun)
> > >     vxlan-0a410071 2/5: (vxlan: egress_pkt_mark=0, key=flow,
> > > local_ip=10.65.0.114, remote_ip=10.65.0.113)
> > >   br-vlan:
> > >     br-vlan 65534/1: (tap)
> > >     dpdk-1 2/2: (dpdk: configured_rx_queues=4,
> > > configured_rxq_descriptors=2048, configured_tx_queues=5,
> > > configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500,
> > > requested_rx_queues=4, requested_rxq_descriptors=2048,
> > > requested_tx_queues=5, requested_txq_descriptors=2048,
> > > rx_csum_offload=true, tx_tso_offload=false)
> > >     phy-br-vlan 1/none: (patch: peer=int-br-vlan)
> > > 
> > > 
> > > [root at compute-lxb-3 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
> > > pmd thread numa_id 0 core_id 1:
> > >   isolated : false
> > >   port: dpdk-1            queue-id:  0 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  3 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  4 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  3 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  4 (enabled)   pmd usage:  0 %
> > > pmd thread numa_id 0 core_id 9:
> > >   isolated : false
> > >   port: dpdk-1            queue-id:  1 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  2 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  5 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  2 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  5 (enabled)   pmd usage:  0 %
> > > pmd thread numa_id 0 core_id 25:
> > >   isolated : false
> > >   port: dpdk-1            queue-id:  3 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  0 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  7 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  0 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  7 (enabled)   pmd usage:  0 %
> > > pmd thread numa_id 0 core_id 33:
> > >   isolated : false
> > >   port: dpdk-1            queue-id:  2 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  1 (enabled)   pmd usage:  0 %
> > >   port: vhu1d64ea7d-d9    queue-id:  6 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  1 (enabled)   pmd usage:  0 %
> > >   port: vhu9c32faf6-ac    queue-id:  6 (enabled)   pmd usage:  0 %
> > > 
> > 
> > 
> > 
> 
