[neutron] How to ease congestion at neutron server nodes?

Sean Mooney smooney at redhat.com
Fri Jan 18 21:59:32 UTC 2019


On Fri, 2019-01-18 at 13:16 -0500, shubjero wrote:
> Hi Simon and Sean,
> 
> What amount of throughput are you able to get through your network nodes?

That will vary wildly depending on your NICs and the networking solution you deployed.

OVS? With or without DPDK? With or without an SDN controller? With or without hardware offload?
VPP? Calico? Linux bridge?

Maybe some operators can share their experience.


>  At what point do you bottleneck north-south traffic and where is the bottleneck?
I'll assume you are using kernel OVS.
With kernel OVS your bottleneck will likely be OVS itself and possibly the kernel routing stack.
Kernel OVS can only handle about 1.8 Mpps in L2 phy-to-phy switching,
maybe a bit more depending on your CPU frequency and kernel version.
I have not been following this metric for OVS that closely, but it's somewhere in that neighbourhood.

That is enough to switch about 1.2 Gbps of 64-byte packets, but it can saturate a 10G link at ~512-byte packets.
Unless you are using jumbo frames or some NIC offloads you cannot, to my knowledge, saturate a 40G link with kernel OVS.
Note that when I say NIC offloads I am not referring to hardware-offloaded OVS; I am referring to GSO, LRO and the
other offloads you enable via ethtool.
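
If you want to check which of those offloads are currently enabled, a minimal sketch along these lines works;
it just shells out to "ethtool -k", so ethtool has to be installed, and "eth0" is a placeholder for your
uplink interface:

    #!/usr/bin/env python3
    # Minimal sketch: list the offload settings relevant to large-packet
    # handling for one interface by shelling out to "ethtool -k".
    # "eth0" is a placeholder; substitute your bond/uplink interface.
    import subprocess

    def offload_state(dev="eth0"):
        out = subprocess.run(["ethtool", "-k", dev],
                             capture_output=True, text=True, check=True).stdout
        wanted = ("generic-segmentation-offload",
                  "generic-receive-offload",
                  "large-receive-offload",
                  "tcp-segmentation-offload")
        return {line.split(":")[0].strip(): line.split(":", 1)[1].strip()
                for line in out.splitlines()
                if line.strip().startswith(wanted)}

    if __name__ == "__main__":
        for feature, state in offload_state().items():
            print(f"{feature}: {state}")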

Kernel OVS can switch in excess of 10 Gbps of throughput quite easily with standard-MTU packets.
The 1500-MTU packet rate is calculated as (10*10^9) bits/sec / (1538 bytes * 8) = 812,744 pps.
For 40 Gbps that rises to roughly 3.25 Mpps (4x the 10G rate), which is more than kernel OVS can forward at
standard server CPU frequencies in the 2.4-3.6 GHz range.

If you are using 9K jumbo frames the packet rate drops to roughly 553,000 pps (40*10^9 / (9038 bytes * 8)),
which is well within kernel OVS's ability to forward. Packet classification and header extraction are much more
costly than copying the packet payload, so pps matters more than the size of the packet, but the two are related.
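
For reference, here is the same back-of-the-envelope maths as a small Python snippet. It assumes plain untagged
Ethernet, i.e. 38 bytes of per-frame overhead (14 header + 4 FCS + 8 preamble/SFD + 12 inter-frame gap) on top
of the MTU:

    #!/usr/bin/env python3
    # Back-of-the-envelope packet rates used above.
    # Wire size = MTU + 14 (header) + 4 (FCS) + 8 (preamble/SFD) + 12 (IFG) = MTU + 38 bytes.
    ETH_OVERHEAD = 38

    def line_rate_pps(link_bps, mtu):
        """Packets per second needed to saturate link_bps at a given MTU."""
        return link_bps / ((mtu + ETH_OVERHEAD) * 8)

    for link in (10e9, 40e9):
        for mtu in (1500, 9000):
            print(f"{link / 1e9:>3.0f}G @ {mtu:>5} MTU: {line_rate_pps(link, mtu):>12,.0f} pps")

That prints 812,744 and 3,250,975 pps for 1500 MTU at 10G/40G, and roughly 138,000 and 553,000 pps for 9000 MTU;
compare those against the ~1.8 Mpps figure above to judge whether kernel OVS is the limit for your traffic profile.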

So depending on your traffic profile, kernel OVS may or may not be a bottleneck.

The next bottleneck is the kernel routing speed.
The Linux kernel is pretty good at routing, but in the Neutron case it not only has to do routing but also NAT.
The iptables SNAT and DNAT actions are likely to be the next bottleneck after OVS.
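
If you suspect the NAT step, one cheap thing to keep an eye on is the connection tracking table, since every
SNATed/DNATed flow goes through conntrack. A minimal sketch, assuming the nf_conntrack module is loaded so the
/proc files below exist:

    #!/usr/bin/env python3
    # Minimal sketch: report how full the connection tracking table is,
    # which every iptables SNAT/DNAT'd flow passes through.
    def read_int(path):
        with open(path) as f:
            return int(f.read().strip())

    count = read_int("/proc/sys/net/netfilter/nf_conntrack_count")
    limit = read_int("/proc/sys/net/netfilter/nf_conntrack_max")
    print(f"conntrack entries: {count}/{limit} ({100 * count / limit:.1f}% full)")

If the count sits near the limit, new flows start getting dropped regardless of how much raw routing throughput
is left.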

> Can you elaborate more on the multi-queue configuration?
I believe Simon was referring to enabling multiple rx/tx queues on the NIC attached to OVS, so that the NIC's
receive side scaling (RSS) feature can be used to hash packets into a set of hardware receive queues, which can
then be processed by OVS using multiple kernel threads across several CPUs.
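
To see how many rx/tx queues the kernel currently exposes for an interface, something like this works as a
minimal sketch ("eth0" is a placeholder; "ethtool -l <dev>" gives the authoritative channel counts from the driver):

    #!/usr/bin/env python3
    # Minimal sketch: count the rx/tx queues the kernel exposes for a NIC
    # by listing /sys/class/net/<dev>/queues. "eth0" is a placeholder.
    import os

    def queue_counts(dev="eth0"):
        entries = os.listdir(f"/sys/class/net/{dev}/queues")
        rx = sum(1 for e in entries if e.startswith("rx-"))
        tx = sum(1 for e in entries if e.startswith("tx-"))
        return rx, tx

    if __name__ == "__main__":
        rx, tx = queue_counts()
        print(f"rx queues: {rx}, tx queues: {tx}")

With RSS and multiple queues, each receive queue gets its own interrupt vector, so receive processing can be
spread over several cores instead of landing on one.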
> 
> We have 3 controllers where the neutron server/api/L3 agents run and the active L3 agent for a particular neutron
> router's will drive the controllers CPU interrupts through the roof (400k) at which point it begins to cause 
One way to scale interrupt handling is irqbalance, assuming your NIC supports interrupt steering, which most if
not all 10G NICs do. (See the /proc/interrupts sketch after the quoted text below.)
> instability (failovers) amongst all L3 agents on that controller node well before we come close to saturating the
> 40Gbps (4x10Gbps lacp bond) of available bandwidth to it.
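
On the interrupt point above: to see where a NIC's interrupts are actually landing (and whether RSS/irqbalance
is spreading them), here is a rough sketch that sums the per-CPU counters in /proc/interrupts for IRQs whose
name contains the interface name ("eth0" is a placeholder):

    #!/usr/bin/env python3
    # Rough sketch: sum per-CPU interrupt counts for IRQs belonging to a NIC
    # by parsing /proc/interrupts. "eth0" is a placeholder interface name.
    import sys

    def nic_irq_totals(dev="eth0"):
        totals = {}
        with open("/proc/interrupts") as f:
            cpus = len(f.readline().split())          # header row: CPU0 CPU1 ...
            for line in f:
                fields = line.split()
                if not fields or not fields[0].endswith(":"):
                    continue
                name = fields[-1]                     # device name, e.g. eth0-TxRx-3
                if dev not in name:
                    continue
                counts = [int(c) for c in fields[1:1 + cpus] if c.isdigit()]
                totals[name] = sum(counts)
        return totals

    if __name__ == "__main__":
        dev = sys.argv[1] if len(sys.argv) > 1 else "eth0"
        for name, total in sorted(nic_irq_totals(dev).items()):
            print(f"{name}: {total}")

Keeping the per-CPU breakdown instead of the sum will show immediately whether one core is absorbing all of
those interrupts.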



