Hi,

yes, for one router, but doing this in a loop for hundreds of routers is not very performant ;)

Fabian

On Mon, Aug 17, 2020 at 4:04 PM Assaf Muller <amuller@redhat.com> wrote:
On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann <dev.faz@gmail.com> wrote:
Hi,
I can only tell you that we are doing a similar check for the dhcp-agent, but there we just execute a suitable SQL statement to detect more than one agent per AZ.
Doing the same for L3 shouldn't be that hard, but I don't know if this is what you are looking for?
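A rough sketch of what such an SQL check could look like for L3, in case it helps. This is not the statement we run for the dhcp-agent. The table and column names (`ha_router_agent_port_bindings`, `router_id`, `l3_agent_id`, `state`) are assumptions about the Neutron schema and should be verified against your own database; the demo uses an in-memory SQLite table with made-up rows so it runs standalone.

```python
import sqlite3

# Assumed table and column names; verify against your Neutron database.
MULTI_ACTIVE_QUERY = """
    SELECT router_id, COUNT(*) AS active_agents
    FROM ha_router_agent_port_bindings
    WHERE state = 'active'
    GROUP BY router_id
    HAVING COUNT(*) > 1
    ORDER BY router_id
"""


def find_multi_active_routers(conn):
    """Return (router_id, active_agents) for routers with >1 active binding."""
    return conn.execute(MULTI_ACTIVE_QUERY).fetchall()


def demo():
    """Build a throwaway in-memory table with made-up rows and run the check."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ha_router_agent_port_bindings "
        "(router_id TEXT, l3_agent_id TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO ha_router_agent_port_bindings VALUES (?, ?, ?)",
        [
            ("router-a", "agent-1", "active"),
            ("router-a", "agent-2", "standby"),
            ("router-b", "agent-1", "active"),   # two active agents:
            ("router-b", "agent-3", "active"),   # this router is flagged
        ],
    )
    return find_multi_active_routers(conn)


if __name__ == "__main__":
    for router_id, active_agents in demo():
        print(router_id, active_agents)
```

A single grouped query like this covers all routers in one round trip, which is the point compared to asking per router.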
There's already an API for this: neutron l3-agent-list-hosting-router <router_id>
It will show you the HA state per L3 agent for the given router.
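For a single router, the check on that output could be sketched as follows. The response shape used here (a dict with an `agents` list whose entries carry `host` and `ha_state`) is an assumption about what the API returns for an HA router; the helper names and sample data are hypothetical.

```python
def count_active_agents(l3_agents_response):
    """Count agents reporting ha_state == 'active' in one router's response."""
    agents = l3_agents_response.get("agents", [])
    return sum(1 for agent in agents if agent.get("ha_state") == "active")


def has_multiple_active(l3_agents_response):
    """True if more than one L3 agent claims to be active for the router."""
    return count_active_agents(l3_agents_response) > 1


if __name__ == "__main__":
    # Made-up sample in the assumed response shape.
    sample = {
        "agents": [
            {"host": "net-1", "ha_state": "active"},
            {"host": "net-2", "ha_state": "active"},
            {"host": "net-3", "ha_state": "standby"},
        ]
    }
    print(count_active_agents(sample), has_multiple_active(sample))
```

This only evaluates one router's answer; it still needs one API call per router.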
Fabian
On Mon, Aug 17, 2020 at 2:11 PM Mohammed Naser <mnaser@vexxhost.com> wrote:
Hi all,
Over the past few days, we were troubleshooting an issue that turned out to be caused by keepalived somehow ending up active in two different L3 agents. We've yet to find out how this happened, but removing it and re-adding it resolved the issue for us.
As we work on improving our monitoring, we wanted to implement a check that reports the number of active L3 agents per router, so we can detect a router that has more than one active L3 agent. That is hard, because hitting the /l3-agents endpoint for _every_ single router hurts performance a lot.
Is there something else that we can watch which might be more productive? FYI -- this all goes in the open and will end up inside the openstack-exporter: https://github.com/openstack-exporter/openstack-exporter and the Helm charts will end up with the alerts: https://github.com/openstack-exporter/helm-charts
Thanks! Mohammed
-- Mohammed Naser VEXXHOST, Inc.