[openstack-dev] [Neutron] OVS flow modification performance
IWAMOTO Toshihiro
iwamoto at valinux.co.jp
Thu Apr 7 07:33:02 UTC 2016
At Mon, 18 Jan 2016 12:12:28 +0900,
IWAMOTO Toshihiro wrote:
>
> I'm sending out this mail to share the findings and discuss possible
> improvements with those interested in Neutron OVS performance.
>
> TL;DR: The native of_interface code, which has been merged recently
> and isn't the default, seems to consume less CPU time but gives mixed
> results. I'm looking into this for improvement.
I went on to look at implementation details of eventlet etc., but the
cause turned out to be fairly simple: the OVS agent in
of_interface=native mode waits for an OpenFlow connection from
ovs-vswitchd, which can take up to 5 seconds.
Please look at the attached graph.
The x-axis is time since the agent restart; the y-axis is the number
of ports processed (in treat_devices and bind_devices). Each port is
counted twice; the first slope is treat_devices and the second is
bind_devices. The native of_interface needs somewhat more time at
start-up, but its bind_devices is about 2x faster.
The data was collected with 160 VMs and the devstack default settings.
> * Introduction
>
> With an ML2+OVS Neutron configuration, OpenFlow rule modification
> happens often and is a somewhat heavy operation, as it involves an
> exec() of the ovs-ofctl command.
>
> The native of_interface driver doesn't use the ovs-ofctl command and
> should have less performance impact on the system. This document
> tries to confirm this hypothesis.
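>
> To make the difference concrete, here is a minimal sketch (not
> Neutron's actual driver code) of what the ovs-ofctl of_interface
> effectively does for each flow modification; the per-call fork/exec
> of the CLI is the overhead the native driver avoids by keeping a
> persistent OpenFlow connection to ovs-vswitchd:
>
>     import subprocess
>
>     def add_flow_with_ovs_ofctl(bridge, flow_spec):
>         # Each call spawns a short-lived ovs-ofctl process, e.g.
>         # add_flow_with_ovs_ofctl("br-int",
>         #                         "table=0,priority=1,actions=normal")
>         subprocess.check_call(["ovs-ofctl", "add-flow", bridge, flow_spec])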
>
>
> * Method
>
> In order to focus on OpenFlow rule operation time and avoid noise from
> other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> restarted and the time it took to reconfigure the flows was measured.
>
> 1. Use devstack to start a test environment. As debug logs generate a
> considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> 2. Apply https://review.openstack.org/#/c/267905/ to enable
> measurement of flow reconfiguration times (a rough illustration of
> the idea follows this list).
> 3. Boot 80 m1.nano instances. In my setup, this generates 404 br-int
> flows. If you have >16G RAM, more could be booted.
> 4. Stop neutron-openvswitch-agent and restart it with the --run-once arg.
> Use time, oprofile, and Python's cProfile (use the --profile arg) to
> collect data.
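>
> The measurement patch above is, in spirit, just wall-clock timing of
> the interesting agent phases. A rough, self-contained illustration of
> such instrumentation (not the actual patch; the names are mine):
>
>     import functools
>     import time
>
>     def timed(func):
>         # Log the wall-clock duration of each call to func.
>         @functools.wraps(func)
>         def wrapper(*args, **kwargs):
>             start = time.time()
>             try:
>                 return func(*args, **kwargs)
>             finally:
>                 print("%s took %.3fs" % (func.__name__,
>                                          time.time() - start))
>         return wrapper
>
>     # e.g. agent.treat_devices = timed(agent.treat_devices)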
>
> * Results
>
> Execution time (averages of 3 runs):
>
>              elapsed  user  sys
>   native     28.3s    2.9s  0.4s
>   ovs-ofctl  25.7s    2.2s  0.3s
>
> ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
> include the execution time of the ovs-ofctl processes themselves.
With 160 VMs and debug=false for the OVS agent and the neutron-server,
Execution time (averages and SDs of 10 runs):

             elapsed     user        sys
  native     56.4+-3.4s  8.7+-0.1s   0.82+-0.04s
  ovs-ofctl  55.9+-1.0s  6.9+-0.08s  0.67+-0.05s
To exclude the OpenFlow connection waits, the times between the log
outputs "Loaded agent extensions" and "Configuration for devices up
completed" were also compared:
native 48.2+-0.49s
ovs-ofctl 53.2+-0.99s
The native of_interface is the clear winner.
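The intervals were taken from the agent log timestamps. A small helper
that computes such an interval (illustrative; it assumes the default
oslo.log timestamp format at the start of each line):

    import datetime
    import re

    # Leading timestamp of a default-format oslo.log line, e.g.
    # "2016-04-07 07:33:02.123 12345 INFO ... Loaded agent extensions"
    TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")

    def parse_ts(line):
        return datetime.datetime.strptime(TS_RE.match(line).group(1),
                                          "%Y-%m-%d %H:%M:%S.%f")

    def interval(log_lines, start_msg, end_msg):
        start = end = None
        for line in log_lines:
            if start is None and start_msg in line:
                start = parse_ts(line)
            elif end is None and end_msg in line:
                end = parse_ts(line)
        return (end - start).total_seconds()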
> Oprofile data collected by running "operf -s -t" contain the
> following information.
>
> With of_interface=native config, "opreport tgid:<pid of ovs agent>" shows:
>
> samples| %|
> ------------------
> 87408 100.000 python2.7
> CPU_CLK_UNHALT...|
> samples| %|
> ------------------
> 69160 79.1232 python2.7
> 8416 9.6284 vmlinux-3.13.0-24-generic
>
> and "opreport --merge tgid" doesn't show ovs-ofctl.
>
> With of_interface=ovs-ofctl, "opreport tgid:<pid of ovs agent>" shows:
>
> samples| %|
> ------------------
> 62771 100.000 python2.7
> CPU_CLK_UNHALT...|
> samples| %|
> ------------------
> 49418 78.7274 python2.7
> 6483 10.3280 vmlinux-3.13.0-24-generic
>
> and "opreport --merge tgid" shows CPU consumption by ovs-ofctl
>
> 35774 3.5979 ovs-ofctl
> CPU_CLK_UNHALT...|
> samples| %|
> ------------------
> 28219 78.8813 vmlinux-3.13.0-24-generic
> 3487 9.7473 ld-2.19.so
> 2301 6.4320 ovs-ofctl
>
> Comparing 87408 samples (native python) with 62771+35774 = 98545
> samples (ovs-ofctl mode, agent plus ovs-ofctl processes), the native
> of_interface uses about 0.4s less CPU time overall.
>
> * Conclusion and future steps
>
> The native of_interface uses slightly less CPU time but takes longer
> to complete a flow reconfiguration after an agent restart.
>
> As an OVS agent accounts for only 1/10th of total CPU usage during a
> flow reconfiguration (data not shown), there may be other areas for
> improvement.
>
> The cProfile Python module gives more fine-grained data, but no
> apparent performance bottleneck was found. The data show more
> eventlet context switches with the native of_interface, which is due
> to how the native of_interface is written. I'm looking into improving
> its CPU usage and latency.
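>
> For reference, a cProfile dump can be inspected with the standard
> pstats module (the dump file name below is illustrative, not
> necessarily what the --profile arg produces):
>
>     import pstats
>
>     # Show the 20 most expensive call sites by cumulative time.
>     stats = pstats.Stats("ovs-agent.prof")
>     stats.sort_stats("cumulative").print_stats(20)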
-------------- next part --------------
A non-text attachment was scrubbed...
Name: of_int-comparison.pdf
Type: application/pdf
Size: 33838 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20160407/7bbae1d8/attachment.pdf>