[openstack-dev] [neutron][rootwrap] Performance considerations, sudo?
Miguel Angel Ajo
majopela at redhat.com
Wed Mar 5 14:42:54 UTC 2014
Hello,
Recently I found a serious issue with network node startup time:
neutron-rootwrap eats a lot of CPU cycles, far more than the processes
it wraps.
With a database containing 1 public network, 192 private networks, 192
routers, and 192 nano VMs, using the OVS plugin:
Network node setup time (rootwrap): 24 minutes
Network node setup time (sudo): 10 minutes
That is the time from rebooting a network node until all namespaces
and services are restored.
As appendix [1] shows, this extra 14-minute overhead matches the
fact that rootwrap needs about 0.3 s to start and launch a system command
(once filtered).
14 minutes = 840 s.
(840 s / 192 resources) / 0.3 s ~= 15 operations per
resource (qdhcp + qrouter): iptables, OVS port creation & tagging, starting
child processes, etc.
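The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the inputs are the figures quoted in this mail (192 resources, about 0.3 s per rootwrap invocation, 840 s of extra time).

```python
def calls_per_resource(extra_seconds, resources, cost_per_call):
    """Estimate how many wrapped commands each resource needs."""
    return extra_seconds / resources / cost_per_call


def overhead_minutes(resources, calls, cost_per_call):
    """Estimate the total wrapper overhead, in minutes."""
    return resources * calls * cost_per_call / 60.0


if __name__ == "__main__":
    ops = calls_per_resource(840, 192, 0.3)
    print("operations per resource: %.1f" % ops)
    print("overhead at 15 calls: %.1f min" % overhead_minutes(192, 15, 0.3))
```

The first figure comes out slightly below 15, and plugging 15 calls back in gives a little over 14 minutes, so the two directions of the estimate agree.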
The overhead comes from Python startup time plus rootwrap loading.
I suppose that rootwrap was designed for a lower number of system
command calls (nova?).
I understand what rootwrap provides: a level of filtering that
sudo cannot offer. But it raises some questions:
1) Is anyone actually using rootwrap in production?
2) What alternatives can we think of to improve this situation?
   0) Already being done: coalescing system calls. But I'm unsure
      that's enough (if we coalesce 15 calls to 3 on this system we get
      192*3*0.3/60 ~= 3 minutes overhead on a 10-minute operation).
   a) Rewriting the rules into sudo (to the extent that it's possible) and
      living with that.
   b) How secure is neutron against command injection at that point? How
      much is user input filtered on the API calls?
   c) Even if "b" is OK, I suppose that if the DB gets compromised,
      that could lead to command injection.
   d) Rewriting rootwrap in C (it's 600 Python LOC now).
   e) Doing the command filtering on the neutron side, as a library, and
      living with sudo with simple filtering (we kill the python/rootwrap
      startup overhead).
3) I also find 10 minutes a long time to set up 192 networks/basic tenant
structures. I wonder if that time could be reduced by converting
system process calls into system library calls (I know we don't have
libraries for iproute, iptables?, and many other things... but it's a
problem that's probably worth looking at).
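To make option (e) more concrete, here is a minimal sketch of what filtering on the neutron side could look like: the agent checks the command against an allowlist in-process and only then prefixes it with sudo, so no extra Python interpreter is started per command. Everything here (the `ALLOWED` table, `build_sudo_command`, the regexes) is hypothetical and is not rootwrap's actual filter format. It also assumes sudoers has matching rules, and since the check runs in the unprivileged process it is weaker than filtering on the privileged side.

```python
import re

# Hypothetical allowlist: executable name -> one regex per expected argument.
ALLOWED = {
    "ip": [r"netns", r"(add|delete|exec)", r"q(dhcp|router)-[0-9a-f-]+"],
    "ovs-vsctl": [r"(add-port|del-port)", r"br-[a-z]+", r"[a-z0-9-]+"],
}


def build_sudo_command(cmd):
    """Return ['sudo'] + cmd if cmd matches the allowlist, else raise."""
    if not cmd:
        raise ValueError("no command specified")
    patterns = ALLOWED.get(cmd[0])
    args = cmd[1:]
    if patterns is None or len(args) != len(patterns):
        raise ValueError("command not allowed: %r" % (cmd,))
    for pattern, arg in zip(patterns, args):
        # fullmatch, so an argument cannot smuggle in extra text
        if re.fullmatch(pattern, arg) is None:
            raise ValueError("argument not allowed: %r" % (arg,))
    return ["sudo"] + list(cmd)


if __name__ == "__main__":
    print(build_sudo_command(["ip", "netns", "add", "qrouter-1234abcd"]))
```

The returned list is meant to be handed to something like `subprocess.call` as an argument vector, without a shell, so that the filtered arguments are not reinterpreted.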
Best,
Miguel Ángel Ajo.
Appendix:
[1] Analyzing overhead:
[root@rhos4-neutron2 ~]# echo "int main() { return 0; }" > test.c
[root@rhos4-neutron2 ~]# gcc test.c -o test
[root@rhos4-neutron2 ~]# time test # to time process invocation on this machine
real 0m0.000s
user 0m0.000s
sys 0m0.000s
[root@rhos4-neutron2 ~]# time sudo bash -c 'exit 0'
real 0m0.032s
user 0m0.010s
sys 0m0.019s
[root@rhos4-neutron2 ~]# time python -c'import sys;sys.exit(0)'
real 0m0.057s
user 0m0.016s
sys 0m0.011s
[root@rhos4-neutron2 ~]# time neutron-rootwrap --help
/usr/bin/neutron-rootwrap: No command specified
real 0m0.309s
user 0m0.128s
sys 0m0.037s
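The same kind of measurement can be scripted so that it averages over several runs instead of relying on a single `time` invocation. A minimal sketch, assuming Python 3: `average_startup` is a made-up helper, and the neutron-rootwrap line is left commented out since it needs that command installed.

```python
import subprocess
import sys
import time


def average_startup(cmd, runs=5):
    """Return the mean wall-clock seconds needed to run cmd to completion."""
    start = time.perf_counter()
    for _ in range(runs):
        # The exit status is ignored on purpose: only elapsed time matters.
        subprocess.call(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    bare_python = [sys.executable, "-c", "import sys; sys.exit(0)"]
    print("python startup: %.3f s" % average_startup(bare_python))
    # print("rootwrap: %.3f s" % average_startup(["neutron-rootwrap", "--help"]))
```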