[neutron][ops] API for viewing HA router states

Slawek Kaplonski skaplons at redhat.com
Tue Aug 18 10:28:20 UTC 2020


Hi,

On Mon, Aug 17, 2020 at 11:39:44AM -0400, Assaf Muller wrote:
> On Mon, Aug 17, 2020 at 10:19 AM Mohammed Naser <mnaser at vexxhost.com> wrote:
> >
> > Hi all:
> >
> > What Fabian is describing is exactly the problem we're having, there
> > are _many_ routers in these environments so we'd be looking at N
> > requests which can get out of control quickly
> 
> I think it's a clear use case to implement a new API endpoint that
> returns HA state per agent for *all* routers in a single call. Should
> be easy to implement.

I agree with that. Could you propose an official RFE for it and describe
your use case there? See [1] for details.
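
To make the use case concrete, the check discussed in this thread (finding
routers that have more than one active L3 agent) could be sketched roughly as
below. This is only an illustration under assumptions: the function name and
the sample data are made up, and the input is assumed to be the per-router
agent lists already fetched from the existing per-router call, with each agent
carrying "host" and "ha_state" fields. Collecting that input still costs one
request per router today, which is exactly the problem a single-call endpoint
would remove.

```python
from typing import Dict, List


def routers_with_multiple_active(
    agents_by_router: Dict[str, List[dict]],
) -> Dict[str, List[str]]:
    """Return {router_id: [hosts reporting 'active']} for routers
    where more than one L3 agent reports ha_state == 'active'.

    agents_by_router is a hypothetical, pre-fetched mapping of
    router ID to the list of agent dicts hosting that router.
    """
    suspicious = {}
    for router_id, agents in agents_by_router.items():
        # Collect the hosts whose agent claims to be the active one.
        active_hosts = [
            agent.get("host", "?")
            for agent in agents
            if agent.get("ha_state") == "active"
        ]
        if len(active_hosts) > 1:
            suspicious[router_id] = active_hosts
    return suspicious


# Made-up sample data: router-b has two agents claiming "active".
sample = {
    "router-a": [
        {"host": "net1", "ha_state": "active"},
        {"host": "net2", "ha_state": "standby"},
    ],
    "router-b": [
        {"host": "net1", "ha_state": "active"},
        {"host": "net3", "ha_state": "active"},
    ],
}

print(routers_with_multiple_active(sample))
```

With an endpoint that returns the HA state for all routers at once, the same
function could be fed from a single response instead of N of them.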

> 
> >
> > Thanks
> > Mohammed
> >
> > On Mon, Aug 17, 2020 at 10:05 AM Fabian Zimmermann <dev.faz at gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > Yes, for one router, but doing this in a loop for hundreds of routers does not perform well ;)
> > >
> > >  Fabian
> > >
> > > On Mon, Aug 17, 2020 at 16:04, Assaf Muller <amuller at redhat.com> wrote:
> > > >
> > > > On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann <dev.faz at gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I can only tell you that we do a similar check for the DHCP agent, but there we just execute a suitable SQL statement to detect more than 1 agent per AZ.
> > > > >
> > > > > Doing the same for L3 shouldn't be that hard, but I don't know if this is what you are looking for?
> > > >
> > > > There's already an API for this:
> > > > neutron l3-agent-list-hosting-router <router_id>
> > > >
> > > > It will show you the HA state per L3 agent for the given router.
> > > >
> > > > >
> > > > >  Fabian
> > > > >
> > > > >
> > > > > On Mon, Aug 17, 2020 at 14:11, Mohammed Naser <mnaser at vexxhost.com> wrote:
> > > > >>
> > > > >> Hi all,
> > > > >>
> > > > >> Over the past few days, we were troubleshooting an issue that ended up
> > > > >> having a root cause where keepalived has somehow ended up active in
> > > > >> two different L3 agents.  We've yet to find the root cause of how this
> > > > >> happened but removing it and adding it resolved the issue for us.
> > > > >>
> > > > >> As we work on improving our monitoring, we wanted to implement
> > > > >> something that gets us the info of # of active routers to check if
> > > > >> there's a router that has >1 active L3 agent but it's hard because
> > > > >> hitting the /l3-agents endpoint on _every_ single router hurts a lot
> > > > >> on performance.
> > > > >>
> > > > >> Is there something else that we can watch which might be more
> > > > >> productive?  FYI -- this all goes in the open and will end up inside
> > > > >> the openstack-exporter:
> > > > >> https://github.com/openstack-exporter/openstack-exporter and the Helm
> > > > >> charts will end up with the alerts:
> > > > >> https://github.com/openstack-exporter/helm-charts
> > > > >>
> > > > >> Thanks!
> > > > >> Mohammed
> > > > >>
> > > > >> --
> > > > >> Mohammed Naser
> > > > >> VEXXHOST, Inc.
> > > > >>
> > > >
> >
> >
> >
> > --
> > Mohammed Naser
> > VEXXHOST, Inc.
> >
> 
> 

[1] https://docs.openstack.org/neutron/latest/contributor/policies/blueprints.html#neutron-request-for-feature-enhancements

-- 
Slawek Kaplonski
Principal software engineer
Red Hat



