[openstack-dev] [neutron] - L3 flavors and issues with use cases for multiple L3 backends

Doug Wiegley dougwig at parksidesoftware.com
Tue Feb 2 04:17:57 UTC 2016


Yes, scheduling was a big gnarly wart that was punted for the first pass. The intention was that any drivers you put in a single flavor would have equivalent capabilities, be plumbed to the same networks, etc.

doug


> On Feb 1, 2016, at 7:08 AM, Kevin Benton <blak111 at gmail.com> wrote:
> 
> Hi all,
> 
> I've been working on an implementation of the multiple L3 backends RFE [1] using the flavor framework, and I've run into some snags with the use cases [2].
> 
> The first use cases are relatively straightforward: the user requests a specific flavor, and that request gets dispatched to a driver associated with that flavor via a service profile. However, several of the use cases are based on the idea that there is a single flavor with multiple drivers, and the specific driver to use depends on the placement of the router interfaces; i.e. a router cannot be bound to a driver until an interface is attached.
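> 
> To make the straightforward case concrete, the dispatch I have in mind looks roughly like the sketch below. This is purely illustrative; the class and the lookups are made up and not the actual flavors plugin API.
> 
> class L3FlavorDispatcher(object):
>     """Illustrative only: flavor -> service profile -> driver dispatch."""
> 
>     def __init__(self, profiles_by_flavor, drivers_by_profile):
>         # e.g. {'gold': 'profile-a'} and {'profile-a': SomeRouterDriver()}
>         self.profiles_by_flavor = profiles_by_flavor
>         self.drivers_by_profile = drivers_by_profile
>         self.bindings = {}  # router_id -> service profile
> 
>     def create_router(self, context, router):
>         profile = self.profiles_by_flavor[router['flavor_id']]
>         driver = self.drivers_by_profile[profile]
>         result = driver.create_router(context, router)
>         # Remember which backend owns the router so later operations
>         # (add_router_interface, etc.) go to the same driver.
>         self.bindings[router['id']] = profile
>         return result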
> 
> This creates some painful coordination problems amongst drivers. For example, say the first two networks that a user attaches a router to can be reached by all drivers because they use overlays, so the first driver chosen by the framework works fine. Then the user connects the router to an external network which is only reachable by a different driver. Do we immediately reschedule the entire router to the other driver at that point and interrupt the traffic between the first two networks?
> 
> Even if we are fine with a traffic interruption for rescheduling, what should we do when a failure occurs halfway through switching over because the new driver fails to attach to one of the networks (or the old driver fails to detach from one)? It would seem the correct API experience would be to switch everything back and then return a failure to the caller trying to add an interface. This is where things get messy.
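> 
> For reference, the kind of switch-over logic I'm describing is roughly the following (heavily simplified; the attach/detach driver calls and mark_router_error are invented for illustration):
> 
> def mark_router_error(router_id):
>     # Placeholder: set the router status to ERROR so a periodic
>     # task (or an operator) can deal with it later.
>     pass
> 
> def reschedule_router(router_id, interfaces, old_driver, new_driver, new_port):
>     attached, detached = [], []
>     try:
>         # Move every existing interface, plus the new one, to the new
>         # driver, then detach the old driver from the existing ones.
>         for port in interfaces + [new_port]:
>             new_driver.attach(router_id, port)
>             attached.append(port)
>         for port in interfaces:
>             old_driver.detach(router_id, port)
>             detached.append(port)
>     except Exception:
>         # Switch everything back so the caller only sees a failed
>         # add_router_interface call.
>         try:
>             for port in detached:
>                 old_driver.attach(router_id, port)
>             for port in attached:
>                 new_driver.detach(router_id, port)
>         except Exception:
>             # A failure during the switch back leaves the router's
>             # resources smeared across two drivers.
>             mark_router_error(router_id)
>         raise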
> 
> If there is a failure during the switch back, we now have a single router's resources smeared across two drivers. We can drop the router into the ERROR state and re-attempt the switch in a periodic task, or maybe just leave it broken.
> 
> How should we handle this much orchestration? Should we pull in something like taskflow, or maybe defer that use case for now?
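> 
> If we did pull in taskflow, I picture something like the toy flow below, where each step knows how to revert itself. The tasks are stubs just to show the execute/revert structure, not real driver code.
> 
> from taskflow import engines
> from taskflow import task
> from taskflow.patterns import linear_flow
> 
> class AttachToNewDriver(task.Task):
>     def execute(self, router_id):
>         print('attaching %s to the new driver' % router_id)
> 
>     def revert(self, router_id, **kwargs):
>         # Called automatically if a later task in the flow fails.
>         print('detaching %s from the new driver' % router_id)
> 
> class DetachFromOldDriver(task.Task):
>     def execute(self, router_id):
>         print('detaching %s from the old driver' % router_id)
> 
>     def revert(self, router_id, **kwargs):
>         print('re-attaching %s to the old driver' % router_id)
> 
> flow = linear_flow.Flow('reschedule-router').add(
>     AttachToNewDriver(),
>     DetachFromOldDriver(),
> )
> engines.run(flow, store={'router_id': 'router-1'})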
> 
> What I want to avoid is what happened with ML2, where error handling is still a TODO in several cases (e.g. any post-commit update or delete failures in mechanism drivers will not trigger a revert of state).
> 
> 1. https://bugs.launchpad.net/neutron/+bug/1461133
> 2. https://etherpad.openstack.org/p/neutron-modular-l3-router-plugin-use-cases
> -- 
> Kevin Benton
> 
