[openstack-dev] [Neutron] OVS flow modification performance
IWAMOTO Toshihiro
iwamoto at valinux.co.jp
Mon Jan 18 03:12:28 UTC 2016
I'm sending out this mail to share these findings and to discuss possible
improvements with those interested in Neutron OVS performance.
TL;DR: The native of_interface code, which was merged recently and isn't
the default, seems to consume less CPU time but gives mixed results. I'm
looking into this for improvement.
* Introduction
With an ML2+OVS Neutron configuration, OpenFlow rule modification
happens often and is a somewhat heavy operation, as it involves an
exec() of the ovs-ofctl command.
The native of_interface driver doesn't use the ovs-ofctl command and
should have less performance impact on the system. This document
tries to confirm this hypothesis.
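As a rough illustration of the cost involved, the sketch below times 100
flow additions done the way the ovs-ofctl driver does them, i.e. one exec()
per modification. It assumes a host with OVS installed, a br-int bridge, and
ovs-ofctl in PATH; it is not the agent's actual code:

    import subprocess
    import time

    def add_flow_ofctl(bridge, flow):
        # One fork/exec of ovs-ofctl per flow modification, which is what
        # the ovs-ofctl of_interface driver effectively pays for.
        subprocess.check_call(['ovs-ofctl', 'add-flow', bridge, flow])

    start = time.time()
    for port in range(10, 110):
        add_flow_ofctl('br-int', 'priority=1,in_port=%d,actions=normal' % port)
    print('100 flow mods via ovs-ofctl: %.2fs' % (time.time() - start))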
* Method
In order to focus on openflow rule operation time and avoid noise from
other operations (VM boot-up, etc.), neutron-openvswitch-agent was
restarted and the time it took to reconfigure the flows was measured.
1. Use devstack to start a test environment. As debug logs generate a
considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
2. Apply https://review.openstack.org/#/c/267905/ to enable
measurement of flow reconfiguration times.
3. Boot 80 m1.nano instances. In my setup, this generates 404 br-int
flows. If you have >16G RAM, more could be booted.
4. Stop neutron-openvswitch-agent and restart it with the --run-once arg.
Use time, oprofile, and Python's cProfile (via the --profile arg) to
collect data.
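For step 4, the restart can be timed with a small wrapper along these lines
(a minimal sketch; the agent executable name and the config file paths are
assumptions that depend on the deployment):

    import subprocess
    import time

    cmd = ['neutron-openvswitch-agent',
           '--config-file', '/etc/neutron/neutron.conf',
           '--config-file', '/etc/neutron/plugins/ml2/ml2_conf.ini',
           '--run-once']

    start = time.time()
    subprocess.check_call(cmd)  # returns once the agent finishes its single run
    print('agent --run-once wall time: %.1fs' % (time.time() - start))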
* Results
Execution time (averages of 3 runs):
  native      real 28.3s   user 2.9s   sys 0.4s
  ovs-ofctl   real 25.7s   user 2.2s   sys 0.3s
ovs-ofctl runs faster and seems to use less CPU, but the numbers above
don't include the CPU time consumed by the ovs-ofctl processes themselves.
Oprofile data collected by running "operf -s -t" contain that missing
information.
With of_interface=native config, "opreport tgid:<pid of ovs agent>" shows:
  samples|      %|
------------------
    87408 100.000 python2.7
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
              69160 79.1232 python2.7
               8416  9.6284 vmlinux-3.13.0-24-generic
and "opreport --merge tgid" doesn't show ovs-ofctl.
With of_interface=ovs-ofctl, "opreport tgid:<pid of ovs agent>" shows:
  samples|      %|
------------------
    62771 100.000 python2.7
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
              49418 78.7274 python2.7
               6483 10.3280 vmlinux-3.13.0-24-generic
and "opreport --merge tgid" shows CPU consumption by ovs-ofctl
    35774  3.5979 ovs-ofctl
           CPU_CLK_UNHALT...|
             samples|      %|
           ------------------
               28219 78.8813 vmlinux-3.13.0-24-generic
                3487  9.7473 ld-2.19.so
                2301  6.4320 ovs-ofctl
Comparing 87408 samples (native, the python process only) with
62771 + 35774 = 98545 samples (ovs-ofctl, the agent plus the ovs-ofctl
processes), the native of_interface uses about 0.4s less CPU time overall.
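A quick arithmetic check of that comparison, using the sample counts from
the opreport output above (the conversion to seconds depends on the operf
sampling interval and is not reproduced here):

    native = 87408             # agent process only, native of_interface
    ofctl = 62771 + 35774      # agent plus ovs-ofctl, ovs-ofctl driver
    print(ofctl - native)      # 11137 fewer samples with the native driver
    print(100.0 * (ofctl - native) / ofctl)  # roughly 11% fewer CPU samples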
* Conclusion and future steps
The native of_interface uses slightly less CPU time but takes longer to
complete a flow reconfiguration after an agent restart.
As an OVS agent accounts for only 1/10th of total CPU usage during a
flow reconfiguration (data not shown), there may be other areas for
improvement.
The cProfile Python module gives more fine-grained data, but no
apparent performance bottleneck was found. The data show more
eventlet context switches with the native of_interface, which is due
to how the native of_interface is written. I'm looking into improving
its CPU usage and latency.