Open Stack

Mon Feb 1 14:08:05 UTC 2016

Hi all,

I've been working on an implementation of the multiple L3 backends RFE[1]
using the flavor framework and I've run into some snags with the
use-cases.[2]

The first use cases are relatively straightforward where the user requests
a specific flavor and that request gets dispatched to a driver associated
with that flavor via a service profile. However, several of the use-cases
are based around the idea that there is a single flavor with multiple
drivers and a specific driver will need to be used depending on the
placement of the router interfaces; i.e., a router cannot be bound to a
driver until an interface is attached.
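
To make the late-binding problem concrete, here is a rough sketch of what
"pick the driver based on interface placement" could look like. Everything
in it (Driver, Router, pick_driver, add_interface, reachable_networks) is a
hypothetical stand-in for illustration, not the flavor framework's or
Neutron's actual API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Driver:
    # Hypothetical stand-in for an L3 backend driver behind one flavor.
    name: str
    reachable_networks: frozenset


@dataclass
class Router:
    interfaces: list = field(default_factory=list)
    driver: Optional[Driver] = None  # unbound until an interface is attached


def pick_driver(candidates, networks):
    """Return the first candidate that can reach every network, else None."""
    for driver in candidates:
        if set(networks) <= driver.reachable_networks:
            return driver
    return None


def add_interface(router, network, drivers):
    """Attach a network and (re)bind the router.

    Returns True when the router had to move to a different driver,
    which is the rescheduling case discussed in this thread.
    """
    wanted = router.interfaces + [network]
    # Prefer the current driver so we only move when we have to.
    candidates = ([router.driver] if router.driver else []) + list(drivers)
    chosen = pick_driver(candidates, wanted)
    if chosen is None:
        raise LookupError("no single driver reaches %s" % wanted)
    rescheduled = router.driver is not None and router.driver != chosen
    router.interfaces = wanted
    router.driver = chosen
    return rescheduled
```

The point of the sketch is only that the binding decision cannot be made at
router-create time: it depends on every interface attached so far.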

This creates some painful coordination problems amongst drivers. For
example, say the first two networks that a user attaches a router to can be
reached by all drivers because they use overlays so the first driver chosen
by the framework works fine. Then the user connects to an external network
which is only reachable by a different driver. Do we immediately reschedule
the entire router at that point to the other driver and interrupt the
traffic between the first two networks?

Even if we are fine with a traffic interruption for rescheduling, what
should we do when a failure occurs halfway through switching over because
the new driver fails to attach to one of the networks (or the old driver
fails to detach from one)? It would seem the correct API experience would
be to switch everything back and then return a failure to the caller trying to
add an interface. This is where things get messy.

If there is a failure during the switch back, we now have a single router's
resources smeared across two drivers. We can drop the router into the ERROR
state and re-attempt the switch in a periodic task, or maybe just leave it
broken.
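
A minimal sketch of the switch-over with a revert path, to show where the
ERROR state comes from. FakeDriver, DriverError, switch_driver and the
status strings are all made up for this example; real drivers would talk to
a backend, and an ERROR result is what a periodic task would later pick up
and retry.

```python
class DriverError(Exception):
    """Raised by the hypothetical drivers below when an operation fails."""


class FakeDriver:
    # Illustrative driver that can be told to fail on specific networks.
    def __init__(self, name, attached=(), fail_attach=(), fail_detach=()):
        self.name = name
        self.attached = set(attached)
        self.fail_attach = set(fail_attach)
        self.fail_detach = set(fail_detach)

    def attach(self, network):
        if network in self.fail_attach:
            raise DriverError("%s cannot attach %s" % (self.name, network))
        self.attached.add(network)

    def detach(self, network):
        if network in self.fail_detach:
            raise DriverError("%s cannot detach %s" % (self.name, network))
        self.attached.discard(network)


def switch_driver(networks, old, new):
    """Move every network from old to new.

    Returns "ACTIVE" on success, "REVERTED" if the move failed but was
    undone cleanly, and "ERROR" if the undo itself failed and the router
    is left split across both drivers.
    """
    undo = []  # compensating actions, in the order they became necessary
    try:
        for net in networks:
            old.detach(net)
            undo.append((old.attach, net))
            new.attach(net)
            undo.append((new.detach, net))
    except DriverError:
        try:
            # Undo in reverse order so the most recent change goes first.
            for action, net in reversed(undo):
                action(net)
        except DriverError:
            return "ERROR"
        return "REVERTED"
    return "ACTIVE"
```

In the "REVERTED" case the caller adding the interface would get a failure
back and the router keeps working on the old driver. The "ERROR" case is
the messy one: some networks are on the old driver and some on the new one.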

How should we handle this much orchestration? Should we pull in something
like taskflow, or maybe defer that use case for now?

What I want to avoid is what happened with ML2, where error handling is
still a TODO in several cases (e.g., any post-commit update or delete
failures in mechanism drivers will not trigger a revert in state).

1. https://bugs.launchpad.net/neutron/+bug/1461133
2. https://etherpad.openstack.org/p/neutron-modular-l3-router-plugin-use-cases

-- 
Kevin Benton

[openstack-dev] [neutron] - L3 flavors and issues with use cases for multiple L3 backends