[openstack-dev] [neutron][dvr] Removing fip namespace when restarting L3 agent.

Korzeniewski, Artur artur.korzeniewski at intel.com
Fri Aug 7 07:24:38 UTC 2015


Bug submitted:
https://bugs.launchpad.net/neutron/+bug/1482521

Thanks,
Artur

From: Oleg Bondarev [mailto:obondarev at mirantis.com]
Sent: Thursday, August 6, 2015 5:18 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron][dvr] Removing fip namespace when restarting L3 agent.



On Thu, Aug 6, 2015 at 5:23 PM, Korzeniewski, Artur <artur.korzeniewski at intel.com> wrote:
Thanks Kevin for that hint.
But it does not resolve the connectivity problem; it only prevents the namespace from being removed when deletion is requested.
The real question is, why do we invoke the /neutron/neutron/agent/l3/dvr_fip_ns.py FipNamespace.delete() method in the first place?

I’ve captured the traceback for this situation:
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.external_process [-] No process started for 8223e12e-837b-49d4-9793-63603fccbc9f from (pid=70216) disable /opt/openstack/neutron/neutron/agent/linux/external_process.py:113
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 117, in switch
    self.greenlet.switch(value)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 612, in run_service
    service.start()
  File "/opt/openstack/neutron/neutron/service.py", line 233, in start
    self.manager.after_start()
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 641, in after_start
    self.periodic_sync_routers_task(self.context)
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 519, in periodic_sync_routers_task
    self.fetch_and_sync_all_routers(context, ns_manager)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 91, in __exit__
    self._cleanup(_ns_prefix, ns_id)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 140, in _cleanup
    ns.delete()
  File "/opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py", line 147, in delete
    raise TypeError("ss")
TypeError: ss

It seems that the fip namespace is not processed when the L3 agent starts, so the cleanup removes the namespace…
The cleanup also removes the interface that connects to the local DVR router, so the VM loses internet access through its floating IP:
Command: ['ip', 'netns', 'exec', 'fip-8223e12e-837b-49d4-9793-63603fccbc9f', 'ip', 'link', 'del', u'fpr-fe517b4b-d']

If the interface inside the fip namespace is not deleted, the VM has full internet access without any downtime.
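
A minimal sketch of the pattern the traceback suggests, not the actual Neutron code: a context manager records which namespaces were seen during the full sync and, on exit, deletes every existing namespace that was not marked. The class NamespaceManagerSketch, its keep() method, and full_sync() are hypothetical names used only for illustration, and skipping cleanup after a failed sync is an assumption.

```python
class NamespaceManagerSketch(object):
    """Hypothetical stand-in for a namespace manager (not Neutron's class)."""

    def __init__(self, existing):
        self._existing = set(existing)  # namespaces present on the host
        self._kept = set()              # namespaces marked during the sync
        self.deleted = []               # what the cleanup removed

    def __enter__(self):
        self._kept.clear()
        return self

    def keep(self, name):
        # Mark a namespace as still in use so cleanup leaves it alone.
        self._kept.add(name)

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            # Assumption: a failed sync skips cleanup entirely.
            return False
        for name in sorted(self._existing - self._kept):
            # The real agent would delete the namespace and its devices here.
            self.deleted.append(name)
        self._existing &= self._kept
        return False


def full_sync(manager, router_ids, fip_ext_net_ids=()):
    """Mark namespaces for known routers; optionally mark fip namespaces."""
    with manager as ns:
        for router_id in router_ids:
            ns.keep("qrouter-" + router_id)
        for net_id in fip_ext_net_ids:
            ns.keep("fip-" + net_id)
    return manager.deleted
```

In this sketch, a sync that marks only the qrouter- namespaces causes the fip- namespace to be deleted on exit, which matches the symptom described above. Marking the fip- namespace during the sync keeps it in place.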

Can we consider it a bug? I guess it is something in the startup/full-sync logic, since the log says:
/opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid

I think yes, we can consider it a bug. Can you please file one? I can take and probably fix it.


And after finishing the sync loop, the fip namespace is deleted…

Regards,
Artur

From: Kevin Benton [mailto:blak111 at gmail.com]
Sent: Thursday, August 6, 2015 7:40 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron][dvr] Removing fip namespace when restarting L3 agent.

Can you try setting the following to False:
https://github.com/openstack/neutron/blob/dc0944f2d4e347922054bba679ba7f5d1ae6ffe2/etc/l3_agent.ini#L97

On Wed, Aug 5, 2015 at 3:36 PM, Korzeniewski, Artur <artur.korzeniewski at intel.com> wrote:
Hi all,
During testing of Neutron upgrades, I have found that restarting the L3 agent in DVR mode causes network downtime for a VM with a configured floating IP.
The downtime is visible when pinging the VM from an external network: 2-3 pings are lost.
The responsible place in code is:
DVR: destroy fip ns: fip-8223e12e-837b-49d4-9793-63603fccbc9f from (pid=156888) delete /opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py:164

Can someone explain why the fip namespace is deleted? Can we work out a solution in which there is no downtime for VM access?

Artur Korzeniewski
--------------------------------------------
Intel Technology Poland sp. z o.o.
KRS 101882
ul. Slowackiego 173, 80-298 Gdansk


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
Kevin Benton

