[neutron]Some floating IPs inaccessible after restart of L3 agent
kamil.madac at slovenskoit.sk
Mon Nov 29 09:34:56 UTC 2021
Thanks for the response and the suggestion. Over the weekend I upgraded the neutron L3 agent to the most recent Victoria version of the kolla container (17.2.2.dev56), and it seems to have helped: routes in the fip namespace no longer disappear after a restart 🙂
I found a change set from September this year that fixes a race condition in the L3 agent, https://review.opendev.org/c/openstack/neutron/+/803576, and I think it could be the one that fixes this.
From: Michal Arbet <michal.arbet at ultimum.io>
Sent: Monday, November 29, 2021 10:20 AM
To: Kamil Madáč <kamil.madac at slovenskoit.sk>
Cc: openstack-discuss <openstack-discuss at lists.openstack.org>
Subject: Re: [neutron]Some floating IPs inaccessible after restart of L3 agent
I've just read the email quickly on my phone, and I remember that I fixed something similar in the Debian Victoria packages. It may be your issue, but I can't check right now.
Could you check it? It's fixed in newer versions of neutron.
Michal Arbet (kevko)
On Fri, 26 Nov 2021 at 10:53, Kamil Madáč <kamil.madac at slovenskoit.sk> wrote:
We have had OpenStack Victoria deployed since the beginning of the year with kolla-ansible in Docker containers. Everything was running fine, but a few weeks ago we noticed networking issues. Our installation uses Open vSwitch networking with non-HA DVR routers.
Everything runs smoothly until we restart the L3 agent. After that, some floating IPs of VMs running on the node where the L3 agent runs become inaccessible. The workaround is to reassign the floating IP to the affected VM. Every restart affects the same floating IPs and VMs.
No errors or exceptions were found in the logs.
I was able to find out that after the restart, the routes for those particular floating IPs are missing in the fip- namespace, so proxy ARP responses do not work. After the floating IP address is reassigned, the L3 agent adds the routes and the floating IP works again.
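For anyone who wants to run the same check after an agent restart, here is a minimal Python sketch of the comparison described above. It assumes the fip- namespace normally holds one host route per floating IP (something like `<floating-ip> via 169.254.x.x dev fpr-...`) and that `ip netns exec <namespace> ip route show` can be run as root on the compute node. The function names, the sample output, and the addresses are made up for illustration and are not part of neutron.

```python
import ipaddress
import subprocess


def routed_hosts(ip_route_output: str) -> set[str]:
    """Return the destinations of single-host routes found in `ip route` output."""
    hosts = set()
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            net = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # "default", "unreachable", and similar keywords
        if net.prefixlen == net.max_prefixlen:
            hosts.add(str(net.network_address))
    return hosts


def missing_fip_routes(ip_route_output: str, floating_ips: list[str]) -> list[str]:
    """Return the floating IPs that have no host route in the given output."""
    present = routed_hosts(ip_route_output)
    return [fip for fip in floating_ips
            if str(ipaddress.ip_address(fip)) not in present]


def read_namespace_routes(namespace: str) -> str:
    """Read the routing table of a network namespace (needs root privileges)."""
    result = subprocess.run(
        ["ip", "netns", "exec", namespace, "ip", "route", "show"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Invented sample of what a fip- namespace routing table might look like.
SAMPLE_OUTPUT = """\
default via 203.0.113.1 dev fg-aaaa
169.254.106.114/31 dev fpr-bbbb proto kernel scope link src 169.254.106.115
203.0.113.10 via 169.254.106.114 dev fpr-bbbb
"""

if __name__ == "__main__":
    # 203.0.113.11 has no host route in the sample, so it is reported.
    print(missing_fip_routes(SAMPLE_OUTPUT, ["203.0.113.10", "203.0.113.11"]))
```

On a real node, the output of `read_namespace_routes()` for the actual fip- namespace would replace the sample text, and the list of floating IPs would come from whatever inventory is at hand. With kolla, the command may have to be run inside the L3 agent container, depending on how the namespaces are exposed to the host.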
It looks like some sort of race condition in the L3 agent, but I was not able to identify any existing bug that matches.
The L3 agent is at version 17.0.1.dev44.
Is anyone aware of an existing bug that could explain this behavior, or does anyone have an idea how to solve the issue?
Slovensko IT a.s.