[openstack-dev] [Neutron] OVS flow modification performance

IWAMOTO Toshihiro iwamoto at valinux.co.jp
Thu Feb 4 04:56:33 UTC 2016


At Sat, 30 Jan 2016 02:08:55 +0000,
Wuhongning wrote:
> 
> In our testing, the Ryu OpenFlow interface greatly improved performance: with a 500-port VXLAN flow table, reconfiguration time dropped from 15s to 2.5s, a 6x improvement.

That's quite an impressive number.
What tests did you do?  Could you share some details?

Also, although it's unlikely, please make sure your measurements aren't
affected by https://bugs.launchpad.net/neutron/+bug/1538368 .


> ________________________________________
> From: IWAMOTO Toshihiro [iwamoto at valinux.co.jp]
> Sent: Monday, January 25, 2016 5:08 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance
> 
> At Thu, 21 Jan 2016 02:59:16 +0000,
> Wuhongning wrote:
> >
> > I don't think 400 flows can show the difference.  Have you set up any tunnel peers?
> >
> > In fact, we could set the network type to "vxlan" and then have a fake mechanism driver (MD) simulate sending l2pop FDB add messages, to push tens of thousands of flows into the OVS agent under test.
> 
> I chose this method because I didn't want to write such extra code for
> measurements. ;)
> Of course, I'd love to see data from other test environments and other
> workload than agent restarts.
> 
> Also, we now have https://review.openstack.org/#/c/271939/ and can
> profile neutron-server (and probably others, too).
> I haven't found anything non-trivial so far, though.
> 
> > ________________________________________
> > From: IWAMOTO Toshihiro [iwamoto at valinux.co.jp]
> > Sent: Monday, January 18, 2016 4:37 PM
> > To: OpenStack Development Mailing List (not for usage questions)
> > Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance
> >
> > At Mon, 18 Jan 2016 00:42:32 -0500,
> > Kevin Benton wrote:
> > >
> > > Thanks for doing this. A couple of questions:
> > >
> > > What were your rootwrap settings when running these tests? Did you just
> > > have it calling sudo directly?
> >
> > I used devstack's default, which runs root_helper_daemon.
> >
> > > Also, you mention that this is only ~10% of the time spent during flow
> > > reconfiguration. What other areas are eating up so much time?
> >
> >
> > In another run,
> >
> > $ for f in `cat tgidlist.n2`; do
> >     echo -n $f
> >     opreport -n tgid:$f --merge tid | head -1 | tr -d '\n'
> >     (cd bg; opreport -n tgid:$f --merge tid | head -1)
> >     echo
> >   done | sort -nr -k +2
> > 10071   239058 100.000 python2.7    14922 100.000 python2.7
> > 9995    92328 100.000 python2.7    11450 100.000 python2.7
> > 7579    88202 100.000 python2.7    (18596)
> > 11094    51560 100.000 python2.7    47964 100.000 python2.7
> > 7035    49687 100.000 python2.7    40678 100.000 python2.7
> > 11093    49380 100.000 python2.7    36004 100.000 python2.7
> > (legend: <pid> <oprof count with an agent restart> <junk> <junk>
> >          <background (oprof count without an agent restart)>)
> >
> > These processes are neutron-server, nova-api,
> > neutron-openvswitch-agent, nova-conductor, dstat and nova-conductor,
> > in descending order.
> >
> > So neutron-server uses about 3x the CPU time of the OVS agent,
> > nova-api's CPU usage is similar to the OVS agent's, and the others
> > are probably not significant.
> >
> > > Cheers,
> > > Kevin Benton
> > >
> > > On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro <iwamoto at valinux.co.jp>
> > > wrote:
> > >
> > > > I'm sending out this mail to share the finding and discuss how to
> > > > improve with those interested in neutron ovs performance.
> > > >
> > > > TL;DR: The native of_interface code, which has been merged recently
> > > > and isn't the default, seems to consume less CPU time but gives
> > > > mixed results.  I'm looking into this for improvement.
> > > >
> > > > * Introduction
> > > >
> > > > With an ML2+ovs Neutron configuration, openflow rule modification
> > > > happens often and is somewhat a heavy operation as it involves
> > > > exec() of the ovs-ofctl command.
> > > >
> > > > The native of_interface driver doesn't use the ovs-ofctl command and
> > > > should have less performance impact on the system.  This document
> > > > tries to confirm this hypothesis.
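[Editor's sketch: as a rough illustration of why per-flow exec() is costly, the snippet below times the bare fork()+exec() overhead of spawning a process.  It uses /bin/true as a portable stand-in for ovs-ofctl (an assumption; real invocations would add OVS's own flow-table work on top), which is the overhead a native OpenFlow connection avoids.]

```python
import subprocess
import time

def mean_exec_time(cmd, n=20):
    """Average wall-clock cost of one fork()+exec() round trip of cmd.

    With of_interface=ovs-ofctl, every flow modification pays this
    process-spawn cost; the native driver keeps a persistent OpenFlow
    connection instead.
    """
    start = time.time()
    for _ in range(n):
        subprocess.check_call(cmd)
    return (time.time() - start) / n

# Spawn overhead alone, using "true" as a stand-in for ovs-ofctl.
overhead = mean_exec_time(["true"], n=10)
```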
> > > >
> > > >
> > > > * Method
> > > >
> > > > In order to focus on openflow rule operation time and avoid noise from
> > > > other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> > > > restarted and the time it took to reconfigure the flows was measured.
> > > >
> > > > 1. Use devstack to start a test environment.  As debug logs generate
> > > >    a considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> > > > 2. Apply https://review.openstack.org/#/c/267905/ to enable
> > > >    measurement of flow reconfiguration times.
> > > > 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
> > > >    flows.  If you have >16G RAM, more could be booted.
> > > > 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
> > > >    Use time, oprofile, and python's cProfile (use --profile arg) to
> > > >    collect data.
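[Editor's sketch: for step 4, the kind of per-function data cProfile produces can be illustrated with a generic helper like the one below.  This is not the agent's actual --profile implementation, just a minimal example of wrapping a call in cProfile and printing the top entries by cumulative time.]

```python
import cProfile
import io
import pstats

def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report of top calls)."""
    prof = cProfile.Profile()
    result = prof.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    # Sort by cumulative time, as is typical when hunting bottlenecks.
    pstats.Stats(prof, stream=stream).sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()

# Example usage on a stand-in workload:
result, report = profile_call(sum, range(1000))
```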
> > > >
> > > > * Results
> > > >
> > > > Execution time (averages of 3 runs):
> > > >
> > > >             native     28.3s user 2.9s sys 0.4s
> > > >             ovs-ofctl  25.7s user 2.2s sys 0.3s
> > > >
> > > > ovs-ofctl runs faster and seems to use less CPU, but the figures
> > > > above don't include the execution time of the ovs-ofctl command itself.
> > > >
> > > > Oprofile data collected by running "operf -s -t" contain that
> > > > information.
> > > >
> > > > With of_interface=native config, "opreport tgid:<pid of ovs agent>" shows:
> > > >
> > > >    samples|      %|
> > > > ------------------
> > > >     87408 100.000 python2.7
> > > >         CPU_CLK_UNHALT...|
> > > >           samples|      %|
> > > >         ------------------
> > > >             69160 79.1232 python2.7
> > > >              8416  9.6284 vmlinux-3.13.0-24-generic
> > > >
> > > > and "opreport --merge tgid" doesn't show ovs-ofctl.
> > > >
> > > > With of_interface=ovs-ofctl, "opreport tgid:<pid of ovs agent>" shows:
> > > >
> > > >    samples|      %|
> > > > ------------------
> > > >     62771 100.000 python2.7
> > > >         CPU_CLK_UNHALT...|
> > > >           samples|      %|
> > > >         ------------------
> > > >             49418 78.7274 python2.7
> > > >              6483 10.3280 vmlinux-3.13.0-24-generic
> > > >
> > > > and "opreport --merge tgid" shows CPU consumption by ovs-ofctl:
> > > >
> > > >     35774  3.5979 ovs-ofctl
> > > >         CPU_CLK_UNHALT...|
> > > >           samples|      %|
> > > >         ------------------
> > > >             28219 78.8813 vmlinux-3.13.0-24-generic
> > > >              3487  9.7473 ld-2.19.so
> > > >              2301  6.4320 ovs-ofctl
> > > >
> > > > Comparing 87408 samples (native, python only) with 62771 + 35774 =
> > > > 98545 samples (ovs-ofctl, agent plus command), the native
> > > > of_interface uses about 0.4s less CPU time overall.
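[Editor's note: the sample arithmetic behind that comparison, made explicit; the conversion of the sample difference to roughly 0.4s is the original mail's figure at its sampling rate.]

```python
# oprofile sample counts from the runs above
native_agent = 87408    # agent only; no ovs-ofctl processes involved
ofctl_agent = 62771     # agent when using of_interface=ovs-ofctl
ofctl_cmd = 35774       # the spawned ovs-ofctl processes themselves

ofctl_total = ofctl_agent + ofctl_cmd   # 98545 samples
saved = ofctl_total - native_agent      # 11137 samples in favor of native
```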
> > > >
> > > > * Conclusion and future steps
> > > >
> > > > The native of_interface uses slightly less CPU time but takes longer
> > > > to complete a flow reconfiguration after an agent restart.
> > > >
> > > > As an OVS agent accounts for only 1/10th of total CPU usage during a
> > > > flow reconfiguration (data not shown), there may be other areas for
> > > > improvement.
> > > >
> > > > The cProfile Python module gives more fine-grained data, but no
> > > > apparent performance bottleneck was found.  The data show more
> > > > eventlet context switches with the native of_interface, which is due
> > > > to how that driver is written.  I'm looking into improving its CPU
> > > > usage and latency.
> > > >
> > > >
> > > >
> > > > __________________________________________________________________________
> > > > OpenStack Development Mailing List (not for usage questions)
> > > > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > > >
> > >
> > >
> > >
> > > --
> > > Kevin Benton
> 


