[openstack-dev] [Neutron][L2Pop][HA Routers] Request for comments for a possible solution

Mike Kolesnik mkolesni at redhat.com
Thu Dec 18 17:28:21 UTC 2014

Hi Mathieu,

Thanks for the quick reply; some comments inline.


----- Original Message -----
> Hi mike,
> thanks for working on this bug :
> On Thu, Dec 18, 2014 at 1:47 PM, Gary Kotton <gkotton at vmware.com> wrote:
> >
> >
> > On 12/18/14, 2:06 PM, "Mike Kolesnik" <mkolesni at redhat.com> wrote:
> >
> >>Hi Neutron community members.
> >>
> >>I wanted to query the community about a proposal of how to fix HA routers
> >>not
> >>working with L2Population (bug 1365476[1]).
> >>This bug is important to fix especially if we want to have HA routers and
> >>DVR
> >>routers working together.
> >>
> >>[1] https://bugs.launchpad.net/neutron/+bug/1365476
> >>
> >>What's happening now?
> >>* HA routers use distributed ports, i.e. the port with the same IP & MAC
> >>  details is applied on all nodes where an L3 agent is hosting this
> >>router.
> >>* Currently, the port details have a binding pointing to an arbitrary node
> >>  and this is not updated.
> >>* L2pop takes this "potentially stale" information and uses it to create:
> >>  1. A tunnel to the node.
> >>  2. An FDB entry that directs traffic for that port to that node.
> >>  3. If ARP responder is on, ARP requests will not traverse the network.
> >>* Problem is, the master router wouldn't necessarily be running on the
> >>  reported agent.
> >>  This means that traffic would not reach the master node but some
> >>arbitrary
> >>  node where the router master might be running, but might be in another
> >>  state (standby, fail).
> >>
> >>What is proposed?
> >>Basically the idea is not to do L2Pop for HA router ports that reside on
> >>the
> >>tenant network.
> >>Instead, we would create a tunnel to each node hosting the HA router so
> >>that
> >>the normal learning switch functionality would take care of switching the
> >>traffic to the master router.
> >
> > In Neutron we just ensure that the MAC address is unique per network.
> > Could a duplicate MAC address cause problems here?
> gary, AFAIU, from a Neutron POV, there is only one port, which is the
> router Port, which is plugged twice. One time per port.
> I think that the capacity to bind a port to several host is also a
> prerequisite for a clean solution here. This will be provided by
> patches to this bug :
> https://bugs.launchpad.net/neutron/+bug/1367391
> >>This way no matter where the master router is currently running, the data
> >>plane would know how to forward traffic to it.
> >>This solution requires changes on the controller only.
> >>
> >>What's to gain?
> >>* Data plane only solution, independent of the control plane.
> >>* Lowest failover time (same as HA routers today).
> >>* High backport potential:
> >>  * No APIs changed/added.
> >>  * No configuration changes.
> >>  * No DB changes.
> >>  * Changes localized to a single file and limited in scope.
> >>
> >>What's the alternative?
> >>An alternative solution would be to have the controller update the port
> >>binding
> >>on the single port so that the plain old L2Pop happens and notifies about
> >>the
> >>location of the master router.
> >>This basically negates all the benefits of the proposed solution, but is
> >>wider.
> >>This solution depends on the report-ha-router-master spec which is
> >>currently in
> >>the implementation phase.
> >>
> >>It's important to note that these two solutions don't collide and could
> >>be done
> >>independently. The one I'm proposing just makes more sense from an HA
> >>viewpoint
> >>because of its benefits which fit the HA methodology of being fast &
> >>having as
> >>little outside dependency as possible.
> >>It could be done as an initial solution which solves the bug for mechanism
> >>drivers that support normal learning switch (OVS), and later kept as an
> >>optimization to the more general, controller based, solution which will
> >>solve
> >>the issue for any mechanism driver working with L2Pop (Linux Bridge,
> >>possibly
> >>others).
> >>
> >>Would love to hear your thoughts on the subject.
> You will have to clearly update the doc to mention that deployment
> with Linuxbridge+l2pop are not compatible with HA.

Yes, this should be added, and it is already the situation right now.
However, if anyone would like to work on an LB fix (the general one or some
specific one), I would gladly help with reviewing it.

> Moreover, this solution is downgrading the l2pop solution, by
> disabling the ARP-responder when VMs want to talk to a HA router.
> This means that ARP requests will be duplicated to every overlay
> tunnel to feed the OVS Mac learning table.
> This is something that we were trying to avoid with l2pop. But maybe
> this is acceptable.

Yes, basically you're correct. However, this would be limited to the
tunnels that connect to the nodes where the HA router is hosted, so we
would still limit the amount of traffic that is sent across the underlay.

Also bear in mind that ARP is actually useful here (at least in the OVS
case), since it helps locate which tunnel the master is on: once the ARP
response arrives, a flow is recorded that directs the traffic to the
correct tunnel. So we only get hit by the one ARP broadcast, and it's sort
of a necessary evil in order to locate the master.
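
To illustrate the learning behaviour described above, here is a toy Python
sketch. It is not Neutron or OVS code: the LearningSwitch class, the tunnel
names and the MAC address are made up for illustration. It assumes that
flooding is restricted to the tunnels towards the nodes hosting the HA
router, and that a new master announces itself (e.g. with a gratuitous ARP)
after a failover.

```python
class LearningSwitch:
    """Toy MAC learning table for one node's tunnel bridge (hypothetical)."""

    def __init__(self, ha_tunnels):
        # Assumed input: tunnels to every node hosting the HA router.
        self.ha_tunnels = set(ha_tunnels)
        self.mac_table = {}  # learned source MAC -> tunnel it was seen on

    def output_tunnels(self, dst_mac):
        """Return the set of tunnels a frame to dst_mac would be sent on."""
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}
        # Unknown destination: flood, but only towards the HA router hosts.
        return set(self.ha_tunnels)

    def receive(self, src_mac, tunnel):
        """Learn (or relearn) the tunnel on which src_mac was last seen."""
        self.mac_table[src_mac] = tunnel


ROUTER_MAC = "fa:16:3e:00:00:01"  # made-up MAC of the HA router port
switch = LearningSwitch(["tun-node-a", "tun-node-b"])

# First ARP request: nothing learned yet, so it goes to both HA hosts.
first = switch.output_tunnels(ROUTER_MAC)

# The master (on node-b in this example) replies; its location is learned.
switch.receive(ROUTER_MAC, "tun-node-b")
after = switch.output_tunnels(ROUTER_MAC)

# Failover: the new master on node-a announces itself and is relearned.
switch.receive(ROUTER_MAC, "tun-node-a")
failover = switch.output_tunnels(ROUTER_MAC)
```

In this model only the first frame is flooded; afterwards traffic follows
whichever tunnel the router MAC was last seen on, without any controller
involvement.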

> I know that ofagent is also using l2pop, I would like to know if
> ofagent deployment will be compatible with the workaround that you are
> proposing.

I would like to know that too, hopefully someone from OFagent can shed
some light.

> My concern is that, with DVR, there are at least two major features
> that are not compatible with Linuxbridge.
> Linuxbridge is not running in the gate. I don't know if anybody is
> running a 3rd party testing with Linuxbridge deployments. If anybody
> does, it would be great to have it voting on gerrit!
> But I really wonder what is the future of linuxbridge compatibility?
> should we keep on improving OVS solution without taking into account
> the linuxbridge implementation?

I don't know, actually, but what I am able to do is fix it for OVS in the
best way possible.
As I said, the situation for LB won't become worse than it already is;
legacy routers would still function as always. This fix also will not
block fixing LB in any other way, since it can easily be adjusted (if
necessary) to work only for supporting mechanisms (OVS, AFAIK).

Also, if anyone is willing to pick up the gauntlet and implement
the general controller-based fix, or something more focused on LB, I will
happily help review what I can.

> Regards,
> Mathieu
> >>
> >>Regards,
> >>Mike
> >>
> >>_______________________________________________
> >>OpenStack-dev mailing list
> >>OpenStack-dev at lists.openstack.org
> >>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
