[neutron][ops] API for viewing HA router states

Assaf Muller amuller at redhat.com
Mon Aug 17 15:39:44 UTC 2020


On Mon, Aug 17, 2020 at 10:19 AM Mohammed Naser <mnaser at vexxhost.com> wrote:
>
> Hi all:
>
> What Fabian is describing is exactly the problem we're having. There
> are _many_ routers in these environments, so we'd be looking at N
> requests (one per router), which can get out of control quickly.

I think it's a clear use case to implement a new API endpoint that
returns HA state per agent for *all* routers in a single call. Should
be easy to implement.
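
As a rough illustration of the check a monitoring tool could run on such
a bulk result, here is a minimal Python sketch. The input shape (a flat
list of (router_id, agent_host, ha_state) tuples) and the function name
are assumptions made for illustration only; the proposed endpoint does
not exist yet, so its real response format is unknown. The "active"
state string is assumed to match what the per-router listing reports.

```python
from collections import defaultdict


def routers_with_multiple_active(ha_states):
    """Return {router_id: [agent_host, ...]} for routers that are
    reported as active on more than one L3 agent.

    ha_states is a hypothetical flat list of
    (router_id, agent_host, ha_state) tuples.
    """
    active_hosts = defaultdict(list)
    for router_id, agent_host, ha_state in ha_states:
        # Only count agents that claim to be the active instance.
        if ha_state == "active":
            active_hosts[router_id].append(agent_host)
    # A healthy HA router has exactly one active agent.
    return {
        router_id: hosts
        for router_id, hosts in active_hosts.items()
        if len(hosts) > 1
    }


if __name__ == "__main__":
    sample = [
        ("router-1", "net-node-1", "active"),
        ("router-1", "net-node-2", "active"),   # split-brain case
        ("router-2", "net-node-1", "active"),
        ("router-2", "net-node-2", "standby"),
    ]
    print(routers_with_multiple_active(sample))
```

With a single bulk call feeding this function, the check costs one API
request regardless of how many routers exist, instead of N.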

>
> Thanks
> Mohammed
>
> On Mon, Aug 17, 2020 at 10:05 AM Fabian Zimmermann <dev.faz at gmail.com> wrote:
> >
> > Hi,
> >
> > Yes, that works for one router, but doing it in a loop for hundreds of routers does not perform well ;)
> >
> >  Fabian
> >
> > On Mon, Aug 17, 2020 at 16:04, Assaf Muller <amuller at redhat.com> wrote:
> > >
> > > On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann <dev.faz at gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I can only tell you that we do a similar check for the DHCP agent, but there we just execute a suitable SQL statement to detect more than one agent per AZ.
> > > >
> > > > Doing the same for L3 shouldn't be that hard, but I don't know if this is what you are looking for.
> > >
> > > There's already an API for this:
> > > neutron l3-agent-list-hosting-router <router_id>
> > >
> > > It will show you the HA state per L3 agent for the given router.
> > >
> > > >
> > > >  Fabian
> > > >
> > > >
> > > > On Mon, Aug 17, 2020 at 14:11, Mohammed Naser <mnaser at vexxhost.com> wrote:
> > > >>
> > > >> Hi all,
> > > >>
> > > > >> Over the past few days, we were troubleshooting an issue that
> > > > >> turned out to be caused by keepalived somehow ending up active on
> > > > >> two different L3 agents.  We have yet to find out how this
> > > > >> happened, but removing it and re-adding it resolved the issue for us.
> > > >>
> > > > >> As we work on improving our monitoring, we wanted to implement
> > > > >> something that gets us the number of active L3 agents per router,
> > > > >> so we can check whether any router has more than one.  That is
> > > > >> hard, because hitting the /l3-agents endpoint for _every_ single
> > > > >> router hurts performance a lot.
> > > >>
> > > >> Is there something else that we can watch which might be more
> > > >> productive?  FYI -- this all goes in the open and will end up inside
> > > >> the openstack-exporter:
> > > >> https://github.com/openstack-exporter/openstack-exporter and the Helm
> > > >> charts will end up with the alerts:
> > > >> https://github.com/openstack-exporter/helm-charts
> > > >>
> > > >> Thanks!
> > > >> Mohammed
> > > >>
> > > >> --
> > > >> Mohammed Naser
> > > >> VEXXHOST, Inc.
> > > >>
> > >
>
>
>
> --
> Mohammed Naser
> VEXXHOST, Inc.
>




More information about the openstack-discuss mailing list