[stein][neutron] l3 agent error
Hello guys,
I've just upgraded from OpenStack Queens to Rocky and then to Stein on CentOS 7. My configuration uses router high availability. After the upgrade, and after rebooting each controller one by one, I get the following errors on all 3 of my controllers in /var/log/neutron/l3-agent.log:
http://paste.openstack.org/show/805407/
If I run openstack router show for one of the UUIDs in the log: http://paste.openstack.org/show/805408/
The namespace for the router is present on all 3 controllers. After the controllers rebooted, some routers lost their routing tables, but after restarting the l3-agent they were fine again. Is it possible that router HA has stopped working? Any ideas, please?
Ignazio
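One way to check whether router HA is still electing an active instance is to read the per-router state files that the l3-agent keeps on each controller. The following is only a minimal sketch: the path /var/lib/neutron/ha_confs, the per-router "state" file, and the "master"/"backup" values are assumptions about a default Stein deployment, and the helper names read_ha_states and summarize are hypothetical.

```python
from collections import Counter
from pathlib import Path


def read_ha_states(ha_confs_dir):
    """Return {router_id: state} for each router directory that has a state file."""
    states = {}
    base = Path(ha_confs_dir)
    if not base.is_dir():
        return states
    for router_dir in sorted(base.iterdir()):
        state_file = router_dir / "state"
        if router_dir.is_dir() and state_file.is_file():
            # Assumption: the file holds a single word such as "master" or "backup".
            states[router_dir.name] = state_file.read_text().strip()
    return states


def summarize(states):
    """Count how many routers are in each state on this controller."""
    return Counter(states.values())


if __name__ == "__main__":
    # Assumed default location; adjust to the state path of your deployment.
    states = read_ha_states("/var/lib/neutron/ha_confs")
    for router_id, state in states.items():
        print(router_id, state)
    print(dict(summarize(states)))
```

Run on each of the 3 controllers, a healthy HA router would be expected to show as active on exactly one of them and as backup on the others.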
Hello all,
the l3-agent respawning errors cause neutron to fill the controller's memory; the controller stops responding and is fenced by the others. Router HA then moves some jobs to another controller, which fills its memory in turn, and so on. I stopped the neutron services and cleaned the HA directory under /var/lib/neutron. After restarting the neutron services, the respawning errors disappeared, but I had to recreate some (not all) of the router static routes. Is this probably a bug?
Ignazio
On Sat, 15 May 2021 at 19:41, Ignazio Cassano <ignaziocassano@gmail.com> wrote:
Hello guys, I've just upgraded from OpenStack Queens to Rocky and then to Stein on CentOS 7. My configuration uses router high availability. After the upgrade, and after rebooting each controller one by one, I get the following errors on all 3 of my controllers in /var/log/neutron/l3-agent.log:
http://paste.openstack.org/show/805407/
If I run openstack router show for one of the UUIDs in the log: http://paste.openstack.org/show/805408/
The namespace for the router is present on all 3 controllers. After the controllers rebooted, some routers lost their routing tables, but after restarting the l3-agent they were fine again. Is it possible that router HA has stopped working? Any ideas, please? Ignazio
Hi,
On Monday, 17 May 2021 07:43:20 CEST, Ignazio Cassano wrote:
Hello all, the l3-agent respawning errors cause neutron to fill the controller's memory; the controller stops responding and is fenced by the others. Router HA then moves some jobs to another controller, which fills its memory in turn, and so on.
Do I understand correctly that you have a memory leak in neutron? If so, is it in neutron-server or neutron-l3-agent? And if that is the case, can you open a Launchpad (LP) bug for it and provide some more info, such as how many routers you have there, how to reproduce it, etc.?
-- Slawek Kaplonski Principal Software Engineer Red Hat
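To help answer the question of whether neutron-server or neutron-l3-agent is the process that grows, resident memory can be summed per service from /proc. This is a minimal sketch, not a diagnostic tool from the neutron project: the function names are hypothetical, matching on the command line is an assumption about how the services are started, and summing VmRSS overcounts pages shared between forked workers.

```python
import os
from collections import defaultdict


def parse_rss_kib(status_text):
    """Return the VmRSS value (KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def match_service(cmdline, services):
    """Return the first service name found in the command line, else None."""
    for service in services:
        if service in cmdline:
            return service
    return None


def rss_by_service(services, proc_root="/proc"):
    """Sum resident memory (KiB) of all processes matching each service name."""
    totals = defaultdict(int)
    if not os.path.isdir(proc_root):
        return {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
            with open(os.path.join(proc_root, entry, "status"), errors="replace") as f:
                status = f.read()
        except OSError:
            continue  # process exited or is not readable
        service = match_service(cmdline, services)
        if service:
            totals[service] += parse_rss_kib(status)
    return dict(totals)


if __name__ == "__main__":
    # Order matters: the more specific name is listed before "keepalived".
    names = ["neutron-server", "neutron-l3-agent",
             "neutron-keepalived-state-change", "keepalived"]
    for service, kib in rss_by_service(names).items():
        print(f"{service}: {kib / 1024:.1f} MiB")
```

Running it periodically on a controller while the respawning errors occur would show which group of processes keeps growing, which is the kind of detail a bug report can include.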
Yes, you understood correctly. I have not verified whether neutron-server or the l3-agent is filling the memory. There are 21 routers. I filed the following bug: https://bugs.launchpad.net/neutron/+bug/1928675
Ignazio
On Mon, 17 May 2021 at 09:01, Slawek Kaplonski <skaplons@redhat.com> wrote:
Hi,
Do I understand correctly that you have a memory leak in neutron? If so, is it in neutron-server or neutron-l3-agent? And if that is the case, can you open a Launchpad (LP) bug for it and provide some more info, such as how many routers you have there, how to reproduce it, etc.?
Participants (2):
- Ignazio Cassano
- Slawek Kaplonski