[openstack-dev] [neutron]OVS connection tracking cleanup

Ajay Kalambur (akalambu) akalambu at cisco.com
Tue Sep 12 16:30:54 UTC 2017


Hi Kevin
Sure will log a bug
Also does the config change involve having both these lines in the neutron.conf file?
[agent]
root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

If I have only the second line I see the exception below on neutron openvswitch agent bring up:

2017-09-12 09:23:03.633 35 DEBUG neutron.agent.linux.utils [req-0f8fe685-66bd-44d7-beac-bb4c24f0ccfa - - - - -] Running command: ['ps', '--ppid', '103', '-o', 'pid='] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:89
2017-09-12 09:23:03.762 35 ERROR ryu.lib.hub [req-0f8fe685-66bd-44d7-beac-bb4c24f0ccfa - - - - -] hub: uncaught exception: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_ryuapp.py", line 42, in agent_main_wrapper
    ovs_agent.main(bridge_classes)
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 2184, in main
    agent.daemon_loop()
  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 154, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 2100, in daemon_loop
    self.ovsdb_monitor_respawn_interval) as pm:
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/polling.py", line 35, in get_polling_manager
    pm.start()
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/polling.py", line 57, in start
    while not self.is_active():
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/async_process.py", line 100, in is_active
    self.pid, self.cmd_without_namespace)
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/async_process.py", line 159, in pid
    run_as_root=self.run_as_root)
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 297, in get_root_helper_child_pid
    pid = find_child_pids(pid)[0]
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 179, in find_child_pids
    log_fail_as_error=False)
  File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 128, in execute
    _stdout, _stderr = obj.communicate(_process_input)
  File "/usr/lib64/python2.7/subprocess.py", line 800, in communicate
    return self._communicate(input)
  File "/usr/lib64/python2.7/subprocess.py", line 1403, in _communicate
    stdout, stderr = self._communicate_with_select(input)
  File "/usr/lib64/python2.7/subprocess.py", line 1504, in _communicate_with_select
    rlist, wlist, xlist = select.select(read_set, write_set, [])
  File "/usr/lib/python2.7/site-packages/eventlet/green/select.py", line 86, in select
    return hub.switch()
  File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
    return self.greenlet.switch()
Timeout: 5 seconds

2017-09-12 09:23:03.860 35 INFO oslo_rootwrap.client [-] Stopping rootwrap daemon process with pid=95


Ajay



From: Kevin Benton <kevin at benton.pub<mailto:kevin at benton.pub>>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org<mailto:openstack-dev at lists.openstack.org>>
Date: Monday, September 11, 2017 at 1:12 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org<mailto:openstack-dev at lists.openstack.org>>
Cc: "Ian Wells (iawells)" <iawells at cisco.com<mailto:iawells at cisco.com>>
Subject: Re: [openstack-dev] [neutron]OVS connection tracking cleanup

Can you start a bug on launchpad and upload the conntrack attachment to the bug?

Switching to the rootwrap daemon should also help significantly.

On Mon, Sep 11, 2017 at 12:32 PM, Ajay Kalambur (akalambu) <akalambu at cisco.com<mailto:akalambu at cisco.com>> wrote:
Hi Kevin
The information you asked for
For 1 compute node with 45 Vms here is the number of connection tracking entries getting deleted
cat conntrack.file  | wc -l
   38528

The file with output is 14MB so ill email it to Ian and he can share it if needed

Security group rules
DirectionEther TypeIP ProtocolPort RangeRemote IP PrefixRemote Security GroupActions
EgressIPv4AnyAny0.0.0.0/0<http://0.0.0.0/0>
IngressIPv6AnyAny-default
EgressIPv6AnyAny::/0-
IngressIPv4AnyAny-

Please let me know if u need the dump of conntrack entries if so I can email it to email address of your choice


Ajay



From: Ajay Kalambur <akalambu at cisco.com<mailto:akalambu at cisco.com>>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org<mailto:openstack-dev at lists.openstack.org>>
Date: Monday, September 11, 2017 at 10:02 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org<mailto:openstack-dev at lists.openstack.org>>
Subject: Re: [openstack-dev] [neutron]OVS connection tracking cleanup

Hi Kevin
Thanks for your response it was about 50 vms
Ajay



On Sep 11, 2017, at 9:49 AM, Kevin Benton <kevin at benton.pub<mailto:kevin at benton.pub>> wrote:

The biggest improvement will be switching to native netlink calls: https://review.openstack.org/#/c/470912/

How many VMs were on a single compute node?

On Mon, Sep 11, 2017 at 9:15 AM, Ajay Kalambur (akalambu) <akalambu at cisco.com<mailto:akalambu at cisco.com>> wrote:
Hi
I am performing a scale test and I see that after creating 500 Vms with ping traffic between them it took almost 1 hr for the connection tracking
To clean up and ovs agent was busy doing this and unable to service any new port bind requests on some computes for almost an hr
It took that long for conntrack clean up to complete


I see the following bug
https://bugs.launchpad.net/neutron/+bug/1513765

And I also have the fix below
https://git.openstack.org/cgit/openstack/neutron/commit/?id=d7aeb8dd4b1d122e17eef8687192cd122b79fd6e


Still see really long times for conntrack cleanup

What is the solution to this problem in scale scenarios?
Ajay


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe<http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org<mailto:OpenStack-dev-request at lists.openstack.org>?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe<http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20170912/592a84da/attachment.html>


More information about the OpenStack-dev mailing list