[openstack-dev] [neutron] Neutron router and nf_conntrack performance problems

Stuart Fox stuart at demonware.net
Sat Aug 16 16:12:48 UTC 2014


Hey neutron dev!

I'm having a serious problem with my neutron router getting spin locked in
nf_conntrack_tuple_taken.
Has anybody else experienced it?
"perf top" shows nf_conntrack_tuple_taken at 75%.
As the incoming request rate goes up, nf_conntrack_tuple_taken runs very
hot on CPU0, causing ksoftirqd/0 to run at 100%. At that point internal
pings on the GRE network go sky high and it's game over. Pinging from a VM
to the subnet default gateway on the neutron router goes from 0.2ms to 11s!
Pinging from the same VM to another VM in the same subnet stays constant at
0.2ms.
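
For anyone trying to confirm the same CPU0 hot spot, here is a minimal
sketch of checking how network receive softirqs are spread across CPUs. It
assumes a Linux host where /proc/softirqs is readable; the function names
(parse_softirqs, cpu_share, report) are made up for illustration and are not
part of neutron or any kernel tooling. The counters are cumulative since
boot, so a real check would compare two samples taken under load.

```python
def parse_softirqs(text):
    """Parse the text of /proc/softirqs into {softirq_name: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        name, _, rest = ln.partition(":")
        table[name.strip()] = [int(x) for x in rest.split()][:n_cpus]
    return table


def cpu_share(counts):
    """Return each CPU's fraction of the total count (zeros if total is 0)."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def report(path="/proc/softirqs", softirq="NET_RX"):
    """Print the per-CPU share of one softirq type on a Linux host."""
    with open(path) as f:
        table = parse_softirqs(f.read())
    for cpu, share in enumerate(cpu_share(table[softirq])):
        print(f"CPU{cpu}: {share:.1%} of {softirq}")
```

Calling report() on the router while traffic is flowing would show whether
nearly all NET_RX work lands on CPU0, which would be consistent with
ksoftirqd/0 being the only thread pegged at 100%.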

This very much indicates to me that the neutron router is having serious
problems.
No other part of the system seems to be under pressure.

IPv6 is disabled, and nf_conntrack_max/nf_conntrack_hash are set to 256k.
We've tried the default 3.13 and the utopic 3.16 kernel (3.16 has lots of
work on removing spinlocks around nf_conntrack). 3.16 survives a little
longer but still gets into the same state.
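
As a rough sanity check on the conntrack sizing mentioned above, here is a
minimal sketch that reads the live table counters and reports how full the
table is and the average number of entries per hash bucket. It assumes the
standard sysctl files under /proc/sys/net/netfilter (nf_conntrack_count,
nf_conntrack_max, nf_conntrack_buckets) are present; the helper names are
hypothetical. An average says nothing about a single overloaded bucket, so
treat this as a first look only.

```python
from pathlib import Path

# Assumed location of the conntrack sysctls on a Linux host.
PROC = Path("/proc/sys/net/netfilter")


def read_int(name, base=PROC):
    """Read one integer-valued sysctl file."""
    return int((base / name).read_text().strip())


def conntrack_stats(count, maximum, buckets):
    """Return table fill ratio and average entries per hash bucket."""
    return {
        "fill": count / maximum,
        "avg_chain": count / buckets,
    }


def report(base=PROC):
    """Print current conntrack utilisation for this host."""
    stats = conntrack_stats(
        read_int("nf_conntrack_count", base),
        read_int("nf_conntrack_max", base),
        read_int("nf_conntrack_buckets", base),
    )
    print(f"table fill: {stats['fill']:.1%}")
    print(f"average entries per bucket: {stats['avg_chain']:.2f}")
```

With both the maximum and the bucket count at 256k (taken here to mean
262144), the average chain length can never exceed 1, so if the lookup is
still this hot the cause is more likely many connections landing in the
same bucket or lock contention than an undersized table.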

Neutron router
1 x Ubuntu 14.04/Icehouse 2014.1.1 on an IBM x3550 with 4 10G Intel NICs.
eth0 - Mgt
eth1 - GRE
eth2 - Public
eth3 - unused

Compute/controller nodes
43 x Ubuntu 14.04/Icehouse 2014.1.1 on IBM x240 Flex blades with 4 Emulex NICs.
eth0 - Mgt
eth2 - GRE

Any help very much appreciated!
Replacing the L2/L3 functions with hardware is very much an option if that's a
better solution.
I'm running out of time before my client decides to stay on AWS.



BR,
Stuart