Hi,

are the control nodes heavily used? I've seen very similar behaviour when one control node failed in an HA environment. At that time we only had two control nodes (the third was in repair, but we had already built the cluster). Some existing routers stopped working, but we could create new ones and those worked fine. Can you verify that you can create new routers in those projects and that their respective namespaces get created successfully?

Somewhere in the neutron-server logs I found a hint that the L3 agent was overloaded or something like that; I can't remember the exact details. What helped in that situation was to stop all OpenStack services on that control node entirely (and also remove all dnsmasq processes). I did not simply reboot, because that would just have started all the services again via systemd/pacemaker, and I wanted a controlled procedure instead. After killing all remaining processes I started neutron-server first, then the rest of the neutron agents one by one, always watching the logs. This procedure brought the neutron services up successfully; then I started the rest of the services as well.

We haven't seen that behaviour since the third node joined the cluster. Maybe in such a scenario dedicated network nodes, not colocated with all the other OpenStack services, can make a difference. But we were limited in the available hardware.

Regards,
Eugen

Quoting Sławek Kapłoński <skaplons@redhat.com>:
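As a rough illustration of the controlled restart Eugen describes above (stop the services, remove leftover dnsmasq processes, start neutron-server first, then the agents one at a time while watching the logs), here is a minimal Python sketch. The unit names in NEUTRON_START_ORDER are assumptions and differ between distributions and deployments; the reverse stop order is also an assumption, and the sketch covers only the neutron services, not the other OpenStack services that were stopped as well. On a node managed by pacemaker, the cluster tooling would have to be used instead of plain systemctl.

```python
import subprocess

# Assumed unit names; adjust them to your distribution and deployment.
NEUTRON_START_ORDER = [
    "neutron-server",
    "neutron-openvswitch-agent",
    "neutron-l3-agent",
    "neutron-dhcp-agent",
    "neutron-metadata-agent",
]


def build_plan(start_order=NEUTRON_START_ORDER):
    """Return the ordered list of commands for a controlled restart."""
    # Stop everything first (reverse start order is an assumption).
    plan = [["systemctl", "stop", unit] for unit in reversed(start_order)]
    # Remove leftover dnsmasq processes that survived the service stop.
    plan.append(["pkill", "dnsmasq"])
    # Start neutron-server first, then the agents one by one.
    plan.extend(["systemctl", "start", unit] for unit in start_order)
    return plan


def run_plan(plan, dry_run=True, pause=input):
    """Print each command; execute it only when dry_run is False.

    Returns the list of commands that were actually executed.
    """
    executed = []
    for cmd in plan:
        print("+", " ".join(cmd))
        if dry_run:
            continue
        # check=False: pkill exits non-zero when no process matched.
        subprocess.run(cmd, check=False)
        executed.append(cmd)
        if cmd[:2] == ["systemctl", "start"]:
            # Give the operator time to watch the logs before the next start.
            pause("Check the logs for this service, then press Enter: ")
    return executed
```

Calling run_plan(build_plan()) only prints the plan. Passing dry_run=False executes it and needs root privileges, so it is worth reviewing the printed plan first.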
Hi,
Please don't drop openstack-discuss ML from the thread.
On Friday, 16 February 2024 04:39:05 CET keshav bareja wrote:
Hi
I can see the status below in the neutron command output:
[image: image]
and the qrouter namespace is missing on the controller nodes for most of the projects, so instances are not accessible. I checked the l3-agent for errors but found no major indication. I tried restarting all neutron services on all 3 controllers, but without success. I also tried disabling and re-enabling HA mode, but the qrouter namespace still does not get created. This whole issue started after a reboot of one of the control nodes.
How can I trigger the creation of the qrouter namespace? Any other suggestions?
Is it only the qrouter namespace that is missing? Are the snat-<router_id> namespaces, for example, created fine, or are they also missing? As for triggering it manually, you can't really do that any other way than by restarting the L3 agent on the node. But I guess there may be something in the logs if the namespaces aren't being created properly. Please enable debug logs in the L3 agent, restart it, and then share its logs with us.
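To answer the question of which namespaces are missing, the output of "ip netns list" on each controller can be compared against the router IDs. The following Python sketch does that for the qrouter- and snat- prefixes mentioned above. The function names are hypothetical, and the router IDs have to be supplied by the operator. A missing snat- namespace is only meaningful on a node that is expected to host centralized SNAT for that router.

```python
import subprocess

# Namespace prefixes mentioned in this thread.
ROUTER_PREFIXES = ("qrouter-", "snat-")


def parse_netns(output):
    """Return the set of namespace names from `ip netns list` output."""
    names = set()
    for line in output.splitlines():
        parts = line.split()
        if parts:
            # Keep the name, drop an optional "(id: N)" suffix.
            names.add(parts[0])
    return names


def missing_namespaces(router_ids, namespaces, prefixes=ROUTER_PREFIXES):
    """Map each router ID to the expected namespaces that are absent."""
    report = {}
    for router_id in router_ids:
        absent = [p + router_id for p in prefixes if p + router_id not in namespaces]
        if absent:
            report[router_id] = absent
    return report


def list_local_namespaces():
    """Run `ip netns list` on this node and return the namespace names."""
    result = subprocess.run(
        ["ip", "netns", "list"], capture_output=True, text=True, check=True
    )
    return parse_netns(result.stdout)
```

On a controller, missing_namespaces(router_ids, list_local_namespaces()) returns an empty dict when every expected namespace exists; otherwise it lists the absent ones per router.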
Regards Keshav Bareja
On Thu, Feb 15, 2024 at 9:19 PM keshav bareja <keshav.bareja@gmail.com> wrote:
Hi
We have DVR with HA mode activated and a router for each project, showing as qrouter-() on each of the controllers.
For some project routers, there is no qrouter-(router-id) on any of the controller nodes.
Regards Keshav
On 15 Feb 2024, at 8:27 PM, Sławek Kapłoński <skaplons@redhat.com> wrote:
Hi,
Hi Team
We are experiencing a reachability issue in some of our projects: instances in those projects are not reachable. It is a project-specific issue. I checked the router for an affected project and could see that
the qrouter namespace does not appear in the output of "ip netns" on any of the controllers.
Problem started after a reboot of one controller.
Has anyone faced a similar issue, and are there any suggestions to recover from this?
On Thursday, 15 February 2024 18:37:01 CET keshav bareja wrote:
Regards Keshav Bareja
Any errors in the L3 agent logs on controllers? Are those DVR or centralized routers?
-- Slawek Kaplonski Principal Software Engineer Red Hat