[openstack-dev] [neutron][dvr] Removing fip namespace when restarting L3 agent.
Korzeniewski, Artur
artur.korzeniewski at intel.com
Fri Aug 7 07:24:38 UTC 2015
Bug submitted:
https://bugs.launchpad.net/neutron/+bug/1482521
Thanks,
Artur
From: Oleg Bondarev [mailto:obondarev at mirantis.com]
Sent: Thursday, August 6, 2015 5:18 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron][dvr] Removing fip namespace when restarting L3 agent.
On Thu, Aug 6, 2015 at 5:23 PM, Korzeniewski, Artur <artur.korzeniewski at intel.com> wrote:
Thanks Kevin for that hint.
But it does not resolve the connectivity problem; it only stops the namespace from being removed when deletion is requested.
The real question is, why do we invoke the /neutron/neutron/agent/l3/dvr_fip_ns.py FipNamespace.delete() method in the first place?
I’ve captured the traceback for this situation:
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.external_process [-] No process started for 8223e12e-837b-49d4-9793-63603fccbc9f from (pid=70216) disable /opt/openstack/neutron/neutron/agent/linux/external_process.py:113
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 117, in switch
    self.greenlet.switch(value)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 612, in run_service
    service.start()
  File "/opt/openstack/neutron/neutron/service.py", line 233, in start
    self.manager.after_start()
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 641, in after_start
    self.periodic_sync_routers_task(self.context)
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 519, in periodic_sync_routers_task
    self.fetch_and_sync_all_routers(context, ns_manager)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 91, in __exit__
    self._cleanup(_ns_prefix, ns_id)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 140, in _cleanup
    ns.delete()
  File "/opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py", line 147, in delete
    raise TypeError("ss")
TypeError: ss
It seems that the fip namespace is not processed when the L3 agent starts, so the cleanup removes it.
The cleanup also removes the interface that connects to the local DVR router, so the VM has no internet access through its floating IP:
Command: ['ip', 'netns', 'exec', 'fip-8223e12e-837b-49d4-9793-63603fccbc9f', 'ip', 'link', 'del', u'fpr-fe517b4b-d']
If the interface inside the fip namespace is not deleted, the VM has full internet access without any downtime.
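The cleanup path in the traceback (`__exit__` calling `_cleanup`, which calls `ns.delete()`) can be illustrated with a small context manager. This is only a minimal sketch of the suspected behaviour, not Neutron's actual code: the names `FakeNamespace`, `SketchNamespaceManager` and `keep`, and the namespace names, are hypothetical. It assumes that any namespace found on the host but not marked during the full sync is deleted when the sync block exits.

```python
class FakeNamespace:
    """Hypothetical stand-in for a namespace; records deletion only."""

    def __init__(self, name):
        self.name = name
        self.deleted = False

    def delete(self):
        self.deleted = True


class SketchNamespaceManager:
    """Deletes every known namespace that was not marked during a sync."""

    def __init__(self, existing):
        # Maps namespace name -> namespace object found on the host.
        self._existing = dict(existing)
        self._keep = set()

    def __enter__(self):
        self._keep.clear()
        return self

    def keep(self, name):
        # The sync loop calls this for each namespace it has processed.
        self._keep.add(name)

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Assumption: skip cleanup if the sync itself failed.
            return False
        for name, ns in self._existing.items():
            if name not in self._keep:
                ns.delete()
        return False


host = {
    "qrouter-r1": FakeNamespace("qrouter-r1"),
    "fip-ext1": FakeNamespace("fip-ext1"),
}

with SketchNamespaceManager(host) as mgr:
    # The sync marks the router namespace but never the fip namespace.
    mgr.keep("qrouter-r1")

deleted = sorted(name for name, ns in host.items() if ns.deleted)
print(deleted)
```

In this sketch the fip namespace is removed only because nothing marked it during the sync, which matches the symptom described above.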
Can we consider it a bug? I guess it is something in the startup/full-sync logic, since the log says:
/opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid
I think yes, we can consider it a bug. Can you please file one? I can take it and probably fix it.
And after finishing the sync loop, the fip namespace is deleted…
Regards,
Artur
From: Kevin Benton [mailto:blak111 at gmail.com]
Sent: Thursday, August 6, 2015 7:40 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron][dvr] Removing fip namespace when restarting L3 agent.
Can you try setting the following to False:
https://github.com/openstack/neutron/blob/dc0944f2d4e347922054bba679ba7f5d1ae6ffe2/etc/l3_agent.ini#L97
On Wed, Aug 5, 2015 at 3:36 PM, Korzeniewski, Artur <artur.korzeniewski at intel.com> wrote:
Hi all,
During testing of Neutron upgrades, I found that restarting the L3 agent in DVR mode causes network downtime for VMs with a configured floating IP.
The outage is visible when pinging the VM from an external network: 2-3 pings are lost.
The responsible place in code is:
DVR: destroy fip ns: fip-8223e12e-837b-49d4-9793-63603fccbc9f from (pid=156888) delete /opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py:164
Can someone explain why the fip namespace is deleted? Can we work out a solution that avoids any downtime of VM access?
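The ping loss described above can be counted from saved `ping` output by looking for gaps in the `icmp_seq` numbers. The following is a minimal sketch; the helper name `lost_sequences` is hypothetical, and the sample output is made up for illustration (it is not taken from the test run in this thread).

```python
import re


def lost_sequences(ping_output):
    """Return icmp_seq numbers missing between the first and last reply."""
    seen = {int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Illustrative sample only: replies 3 and 4 are missing.
SAMPLE = """\
64 bytes from 203.0.113.10: icmp_seq=1 ttl=63 time=0.9 ms
64 bytes from 203.0.113.10: icmp_seq=2 ttl=63 time=0.8 ms
64 bytes from 203.0.113.10: icmp_seq=5 ttl=63 time=1.1 ms
"""

lost = lost_sequences(SAMPLE)
print(lost)
```

Running `ping` against the floating IP while restarting the L3 agent and feeding the captured output to such a helper would give a repeatable measure of the downtime.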
Artur Korzeniewski
--------------------------------------------
Intel Technology Poland sp. z o.o.
KRS 101882
ul. Slowackiego 173, 80-298 Gdansk
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
--
Kevin Benton