[openstack-dev] [neutron][rootwrap] Performance considerations, sudo?

Rick Jones rick.jones2 at hp.com
Wed Mar 5 16:58:47 UTC 2014


On 03/05/2014 06:42 AM, Miguel Angel Ajo wrote:
>
>      Hello,
>
>      Recently, I found a serious issue with network-node startup time:
> neutron-rootwrap eats a lot of CPU cycles, much more than the processes
> it wraps.
>
>      On a database with 1 public network, 192 private networks, 192
> routers, and 192 nano VMs, with the OVS plugin:
>
>
> Network node setup time (rootwrap): 24 minutes
> Network node setup time (sudo):     10 minutes

I've not been looking at rootwrap, but have been looking at sudo and ip. 
(Using some scripts which create "fake routers" so I could look without 
any of this icky OpenStack stuff in the way :) ) The Ubuntu 12.04 
versions of each at least will enumerate all the interfaces on the 
system, even though they don't need to.

There was already an upstream change to 'ip' that eliminates the 
unnecessary enumeration.  In the last few weeks an enhancement went into 
the upstream sudo that allows one to configure sudo to not do the same 
thing.   Down in the low(ish) three figures of interfaces it may not be 
a Big Deal (tm) but as one starts to go beyond that...

commit f0124b0f0aa0e5b9288114eb8e6ff9b4f8c33ec8
Author: Stephen Hemminger <stephen at networkplumber.org>
Date:   Thu Mar 28 15:17:47 2013 -0700

     ip: remove unnecessary ll_init_map

     Don't call ll_init_map on modify operations
     Saves significant overhead with 1000's of devices.

http://www.sudo.ws/pipermail/sudo-workers/2014-January/000826.html

Whether your environment already has the 'ip' change I don't know, but 
odds are probably pretty good it doesn't have the sudo enhancement.
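
If you want a rough way to check your own environment, timing "sudo -n
true" while ramping up the number of interfaces will show whether sudo's
latency scales with them.  A sketch in Python (my own illustration, not
something from this thread; it needs root to create the throwaway
"bench*" dummy links, and passwordless sudo for the timed command):

    #!/usr/bin/env python
    # Rough check: does "sudo -n true" get slower as interfaces are added?
    # Needs root (to add/delete dummy links) and passwordless sudo.
    import subprocess, time

    def avg_seconds(cmd, runs=20):
        start = time.time()
        for _ in range(runs):
            subprocess.check_call(cmd)
        return (time.time() - start) / runs

    for count in (0, 100, 500, 1000):
        for i in range(count):
            subprocess.check_call(['ip', 'link', 'add', 'bench%d' % i,
                                   'type', 'dummy'])
        print('%4d extra interfaces: %.4f s per "sudo -n true"'
              % (count, avg_seconds(['sudo', '-n', 'true'])))
        for i in range(count):
            subprocess.check_call(['ip', 'link', 'del', 'bench%d' % i])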

>     That's the time from when you reboot a network node until all namespaces
> and services are restored.

So, that includes the time for the system to go down and reboot, not 
just the time it takes to rebuild once rebuilding starts?

>     If you look at appendix [1], this extra 14 min of overhead matches the
> fact that rootwrap needs 0.3 s to start and launch a system command
> (once filtered).
>
>      14 minutes = 840 s
>      (840 s / 192 resources) / 0.3 s ~= 15 operations per
> resource (qdhcp + qrouter): iptables, OVS port creation & tagging,
> starting child processes, etc.
>
>     The overhead comes from python startup time + rootwrap loading.

How much of the time is python startup time?  I assume that would be all 
the "find this lib, find that lib" stuff one sees in a system call 
trace?  I saw a boatload of that at one point but didn't quite feel like 
wading into that at the time.
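
One quick way to apportion it is to time a bare interpreter start against
a full rootwrap start over a number of runs; the gap is roughly the
import/filter-loading cost.  A small harness (my sketch, reusing the three
cases from appendix [1] below):

    #!/usr/bin/env python
    # Average the per-invocation cost of the appendix [1] cases.
    import os, subprocess, time

    CASES = [
        ['sudo', 'bash', '-c', 'exit 0'],             # sudo alone
        ['python', '-c', 'import sys; sys.exit(0)'],  # bare interpreter start
        ['neutron-rootwrap', '--help'],               # interpreter + rootwrap
    ]

    def avg_seconds(cmd, runs=20):
        devnull = open(os.devnull, 'w')
        start = time.time()
        for _ in range(runs):
            subprocess.call(cmd, stdout=devnull, stderr=devnull)
        devnull.close()
        return (time.time() - start) / runs

    for cmd in CASES:
        print('%.3f s  %s' % (avg_seconds(cmd), ' '.join(cmd)))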

>     I suppose that rootwrap was designed for a lower volume of system
> calls (nova?).

And/or a smaller environment perhaps.

>     And I understand what rootwrap provides: a level of filtering that
> sudo cannot offer. But it raises some questions:
>
> 1) Is anyone actually using rootwrap in production?
>
> 2) What alternatives can we think of to improve this situation?
>
>     0) Already being done: coalescing system calls. But I'm unsure
> that's enough (if we coalesce 15 calls down to 3 on this system we get
> 192 * 3 * 0.3 / 60 ~= 3 minutes of overhead on a 10-minute operation).

It may not be sufficient, but it is (IMO) certainly necessary.  It will 
make any work that minimizes or eliminates the overhead of rootwrap look 
that much better.
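
For reference, the arithmetic behind those estimates, spelled out with the
numbers already in this thread:

    # Back-of-the-envelope: resources x calls per resource x per-call cost.
    resources = 192    # qdhcp + qrouter
    per_call = 0.3     # seconds per rootwrap invocation (appendix [1])

    for calls in (15, 3):   # today vs. coalesced
        minutes = resources * calls * per_call / 60.0
        print('%2d calls/resource -> ~%.1f min of rootwrap overhead'
              % (calls, minutes))
    # prints ~14.4 min for 15 calls, ~2.9 min for 3 calls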

>     a) Rewriting the rules into sudo (to the extent that it's possible),
> and living with that.
>     b) How secure is neutron against command injection at that point? How
> much user input is filtered on the API calls?
>     c) Even if "b" is OK, I suppose that if the DB gets compromised,
> that could lead to command injection.
>
>     d) Rewriting rootwrap in C (it's ~600 Python LOC now).
>
>     e) Doing the command filtering on the neutron side, as a library, and
> living with sudo doing only simple filtering (we kill the Python/rootwrap
> startup overhead).
>
> 3) I also find 10 minutes a long time to set up 192 networks/basic tenant
> structures; I wonder if that time could be reduced by converting
> system process calls into library calls (I know we don't have
> libraries for iproute, iptables?, and many other things... but it's a
> problem that's probably worth looking at.)
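
On (e), a minimal sketch of what in-process filtering in front of plain
sudo could look like.  This is purely illustrative (the filter table, the
regexes and the helper name are all made up, not neutron code), but it
shows where the startup cost goes away: the filtering lives in the
long-running agent and only sudo plus the target binary get exec'd:

    # Hypothetical in-process filtering before handing a command to sudo.
    import re
    import subprocess

    # Made-up filter table: executable -> regex its argument string must match.
    FILTERS = {
        'ip':       re.compile(r'^(netns exec qrouter-[\w-]+ )?'
                               r'(link|addr|route) '),
        'iptables': re.compile(r'^-[AID] neutron-'),
    }

    def run_as_root(cmd):
        """Run cmd (a list) through sudo if it passes the in-process filters."""
        prog, args = cmd[0], ' '.join(cmd[1:])
        pattern = FILTERS.get(prog)
        if pattern is None or not pattern.match(args):
            raise RuntimeError('command rejected by filter: %r' % cmd)
        return subprocess.check_call(['sudo', '-n'] + cmd)

    # e.g. run_as_root(['ip', 'link', 'set', 'tap1234', 'up'])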

Certainly going back and forth creating short-lived processes is at 
least anti-social and perhaps ever so slightly upsetting to the process 
scheduler, particularly "at scale."  The problem, though, is that the 
Linux networking folks have been somewhat reticent about creating 
libraries (at least any they would end up supporting) because they are 
concerned it would lock in interfaces and reduce their freedom of 
movement.
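
That said, for the ip(8) cases there is at least one pure-Python path that
speaks netlink directly instead of exec'ing a binary: pyroute2.  A sketch
(assuming the pyroute2 package is available and the process has
CAP_NET_ADMIN; the device name is hypothetical and this is my
illustration, not something proposed in the thread):

    # Sketch: replace "sudo ip link/addr ..." with direct netlink calls.
    from pyroute2 import IPRoute

    ipr = IPRoute()
    try:
        idx = ipr.link_lookup(ifname='tap1234')[0]   # hypothetical device
        ipr.link('set', index=idx, state='up')
        ipr.addr('add', index=idx, address='10.0.0.2', mask=24)
    finally:
        ipr.close()

No fork/exec and no interpreter restart per operation, though it only
covers what netlink covers (iptables would still be a separate problem).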

happy benchmarking,

rick jones
the fastest procedure call is the one you never make

>
> Best,
> Miguel Ángel Ajo.
>
>
> Appendix:
>
> [1] Analyzing overhead:
>
> [root at rhos4-neutron2 ~]# echo "int main() { return 0; }" > test.c
> [root at rhos4-neutron2 ~]# gcc test.c -o test
> [root at rhos4-neutron2 ~]# time ./test    # to time process invocation on
> this machine
>
> real    0m0.000s
> user    0m0.000s
> sys    0m0.000s
>
>
> [root at rhos4-neutron2 ~]# time sudo bash -c 'exit 0'
>
> real    0m0.032s
> user    0m0.010s
> sys    0m0.019s
>
>
> [root at rhos4-neutron2 ~]# time python -c'import sys;sys.exit(0)'
>
> real    0m0.057s
> user    0m0.016s
> sys    0m0.011s
>
> [root at rhos4-neutron2 ~]# time neutron-rootwrap --help
> /usr/bin/neutron-rootwrap: No command specified
>
> real    0m0.309s
> user    0m0.128s
> sys    0m0.037s