[openstack-dev] [Neutron] OVS flow modification performance
IWAMOTO Toshihiro
iwamoto at valinux.co.jp
Mon Jan 18 03:12:28 UTC 2016
I'm sending out this mail to share these findings and to discuss possible
improvements with those interested in Neutron OVS performance.
TL;DR: The native of_interface code, which was merged recently and isn't
the default, seems to consume less CPU time but gives mixed results. I'm
looking into this for improvement.
* Introduction
With an ML2+OVS Neutron configuration, OpenFlow rule modification
happens often and is a somewhat heavy operation, as it involves an
exec() of the ovs-ofctl command.
The native of_interface driver doesn't use the ovs-ofctl command and
should have less performance impact on the system. This document
tries to confirm this hypothesis.
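As a rough illustration of the cost involved, the sketch below times 100
flow additions done the way the ovs-ofctl driver does them, i.e. one exec()
per modification. It assumes a host with OVS installed, a br-int bridge, and
ovs-ofctl in PATH; it is not the agent's actual code:

    import subprocess
    import time

    def add_flow_ofctl(bridge, flow):
        # One fork/exec of ovs-ofctl per flow modification, which is what
        # the ovs-ofctl of_interface driver effectively pays for.
        subprocess.check_call(['ovs-ofctl', 'add-flow', bridge, flow])

    start = time.time()
    for port in range(10, 110):
        add_flow_ofctl('br-int', 'priority=1,in_port=%d,actions=normal' % port)
    print('100 flow mods via ovs-ofctl: %.2fs' % (time.time() - start))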
* Method
In order to focus on openflow rule operation time and avoid noise from
other operations (VM boot-up, etc.), neutron-openvswitch-agent was
restarted and the time it took to reconfigure the flows was measured.
1. Use devstack to start a test environment. As debug logs generate a
considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
2. Apply https://review.openstack.org/#/c/267905/ to enable
measurement of flow reconfiguration times.
3. Boot 80 m1.nano instances. In my setup, this generates 404 br-int
flows. If you have >16G RAM, more could be booted.
4. Stop neutron-openvswitch-agent and restart it with the --run-once arg.
Use time, oprofile, and Python's cProfile (via the --profile arg) to
collect data.
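For step 4, the restart can be timed with a small wrapper along these lines
(a minimal sketch; the agent executable name and the config file paths are
assumptions that depend on the deployment):

    import subprocess
    import time

    cmd = ['neutron-openvswitch-agent',
           '--config-file', '/etc/neutron/neutron.conf',
           '--config-file', '/etc/neutron/plugins/ml2/ml2_conf.ini',
           '--run-once']

    start = time.time()
    subprocess.check_call(cmd)  # returns once the agent finishes its single run
    print('agent --run-once wall time: %.1fs' % (time.time() - start))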
* Results
Execution time (averages of 3 runs):
  native      real 28.3s   user 2.9s   sys 0.4s
  ovs-ofctl   real 25.7s   user 2.2s   sys 0.3s
ovs-ofctl runs faster and seems to use less CPU, but the numbers above
don't include the CPU time consumed by the ovs-ofctl processes themselves.
Oprofile data collected by running "operf -s -t" contain that missing
information.
With of_interface=native config, "opreport tgid:<pid of ovs agent>" shows:
  samples|      %|
------------------
    87408 100.000 python2.7
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
              69160 79.1232 python2.7
               8416  9.6284 vmlinux-3.13.0-24-generic
and "opreport --merge tgid" doesn't show ovs-ofctl.
With of_interface=ovs-ofctl, "opreport tgid:<pid of ovs agent>" shows:
  samples|      %|
------------------
    62771 100.000 python2.7
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
              49418 78.7274 python2.7
               6483 10.3280 vmlinux-3.13.0-24-generic
and "opreport --merge tgid" shows CPU consumption by ovs-ofctl
    35774  3.5979 ovs-ofctl
           CPU_CLK_UNHALT...|
             samples|      %|
           ------------------
               28219 78.8813 vmlinux-3.13.0-24-generic
                3487  9.7473 ld-2.19.so
                2301  6.4320 ovs-ofctl
Comparing 87408 samples (native, the python process only) with
62771 + 35774 = 98545 samples (ovs-ofctl, the agent plus the ovs-ofctl
processes), the native of_interface uses about 0.4s less CPU time overall.
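A quick arithmetic check of that comparison, using the sample counts from
the opreport output above (the conversion to seconds depends on the operf
sampling interval and is not reproduced here):

    native = 87408             # agent process only, native of_interface
    ofctl = 62771 + 35774      # agent plus ovs-ofctl, ovs-ofctl driver
    print(ofctl - native)      # 11137 fewer samples with the native driver
    print(100.0 * (ofctl - native) / ofctl)  # roughly 11% fewer CPU samples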
* Conclusion and future steps
The native of_interface uses slightly less CPU time but takes longer to
complete a flow reconfiguration after an agent restart.
As an OVS agent accounts for only 1/10th of total CPU usage during a
flow reconfiguration (data not shown), there may be other areas for
improvement.
The cProfile Python module gives more fine-grained data, but no
apparent performance bottleneck was found. The data show more
eventlet context switches with the native of_interface, which is due
to how the native of_interface is written. I'm looking into improving
its CPU usage and latency.