[openstack-dev] [Neutron][L3] Representing a networks connected by routers

Salvatore Orlando salv.orlando at gmail.com
Tue Jul 21 14:41:32 UTC 2015

A few comments inline.

Generally speaking the only thing I'd like to remark is that this use case
makes sense independently of whether you are using overlay, or any other
"SDN" solution (whatever SDN means to you).

Also, please note that this thread is now split in two - there's a new
branch starting with Ian's post. So perhaps let's make two threads.

On 21 July 2015 at 14:21, Neil Jerram <Neil.Jerram at metaswitch.com> wrote:

> On 20/07/15 18:36, Carl Baldwin wrote:
> > I'm looking for feedback from anyone interested but, in particular, I'd
> > like feedback from the following people for varying perspectives:
> > Mark McClain (proposed alternate), John Belamaric (IPAM), Ryan Tidwell
> > (BGP), Neil Jerram (L3 networks), Aaron Rosen (help understand
> > multi-provider networks) and you if you're reading this list of names
> > and thinking "he forgot me!"
> >
> > We have been struggling to develop a way to model a network which is
> > composed of disjoint L2 networks connected by routers.  The intent of
> > this email is to describe the two proposals and request input on the
> > two in an attempt to choose a direction forward.  But, first:
> > requirements.
> >
> > Requirements:
> >
> > The network should appear to end users as a single network choice.
> > They should not be burdened with choosing between segments.  It might
> > interest them that L2 communications may not work between instances on
> > this network but that is all.

It is however important to ensure services like DHCP keep working as usual.
Treating segments as logical networks in their own right is the simplest
solution to achieve this imho.

> > This has been requested by numerous
> > operators [1][4].  It can be useful for external networks and provider
> > networks.
> >
> > The model needs to be flexible enough to support two distinct types of
> > addresses:  1) address blocks which are statically bound to a single
> > segment and 2) address blocks which are mobile across segments using
> > some sort of dynamic routing capability like BGP or programmatically
> > injecting routes in to the infrastructure's routers with a plugin.
> FWIW, I hadn't previously realized (2) here.

A "mobile address block" translates to a subnet whose network association
might change.
Achieving mobile address blocks does not seem simple to me at all. Route
injection (boring) and BGP might solve the networking aspect of the
problem, but we'd also need coordination with the compute service to ensure
that all the workloads using addresses from the mobile block migrate as
well; unless I've misunderstood the way these mobile address blocks work, I
struggle to see this as a requirement.
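
To make the "subnet whose network association might change" idea concrete,
here is a minimal sketch of the routing side only. The RouteTable class, the
segment names and the prefixes are all hypothetical; inject() merely stands
in for BGP or a route-injection plugin and is not any Neutron API.

```python
import ipaddress


class RouteTable:
    """Toy routing table mapping address blocks to the segment hosting them."""

    def __init__(self):
        self._routes = {}  # ip_network -> segment name (hypothetical labels)

    def inject(self, cidr, segment):
        """Add or replace the route for a block (stand-in for BGP or a plugin)."""
        self._routes[ipaddress.ip_network(cidr)] = segment

    def lookup(self, address):
        """Longest-prefix match; returns the segment name, or None if no route."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


table = RouteTable()
table.inject("10.0.1.0/24", "segment-a")     # statically bound block
table.inject("203.0.113.0/28", "segment-a")  # mobile block, currently on segment-a
before = table.lookup("203.0.113.5")
table.inject("203.0.113.0/28", "segment-b")  # "move" the whole block
after = table.lookup("203.0.113.5")
print(before, after)
```

Re-pointing the route is the easy part. Nothing in this sketch moves the
workloads that hold addresses from the block, which is the coordination
problem with the compute service mentioned above.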

> >
> > Overlay networks are not the answer to this.  The goal of this effort
> > is to scale very large networks with many connected ports by doing L3
> > routing (e.g. to the top of rack) instead of using a large continuous
> > L2 fabric.

As a side note, I find it interesting that overlays were indeed proposed as a
solution to avoid hybrid L2/L3 networks or having to span VLANs across the
core and aggregation layers.

> > Also, the operators interested in this work do not want
> > the complexity of overlay networks [4].
> >
> > Proposal 1:
> >
> > We refined this model [2] at the Neutron mid-cycle a couple of weeks
> > ago.  This proposal has already resonated reasonably with operators,
> > especially those from GoDaddy who attended the Neutron sprint.  Some
> > key parts of this proposal are:
> >
> > 1.  The routed super network is called a front network.  The segments
> > are called back(ing) networks.
> > 2.  Backing networks are modeled as admin-owned private provider
> > networks but otherwise are full-blown Neutron networks.
> > 3.  The front network is marked with a new provider type.
> > 4.  A Neutron router is created to link the backing networks with
> > internal ports.  It represents the collective routing ability of the
> > underlying infrastructure.
> > 5.  Backing networks are associated with a subset of hosts.
> > 6.  Ports created on the front network must have a host binding and
> > are actually created on a backing network when all is said and done.
> > They carry the ID of the backing network in the DB.
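
The six points above can be sketched roughly as follows. This is only my
reading of the proposal, not an implementation: FrontNetwork, BackingNetwork,
create_port and the "requested_network_id" key are hypothetical names, and the
only attribute borrowed from the existing port binding API is
"binding:host_id".

```python
from dataclasses import dataclass, field


@dataclass
class BackingNetwork:
    id: str
    hosts: set  # hosts this segment reaches (point 5)


@dataclass
class FrontNetwork:
    id: str
    backing: list = field(default_factory=list)

    def backing_for_host(self, host):
        """Return the first backing network associated with the given host."""
        for net in self.backing:
            if host in net.hosts:
                return net
        raise LookupError(f"no backing network reaches host {host!r}")


def create_port(front, host):
    """Point 6: the port needs a host binding and ends up on a backing network."""
    if host is None:
        raise ValueError("ports on a front network must have a host binding")
    backing = front.backing_for_host(host)
    return {
        "requested_network_id": front.id,  # hypothetical bookkeeping field
        "network_id": backing.id,          # ID of the backing network
        "binding:host_id": host,
    }


front = FrontNetwork("front-1", [
    BackingNetwork("back-rack1", {"compute-1", "compute-2"}),
    BackingNetwork("back-rack2", {"compute-3"}),
])
port = create_port(front, "compute-3")
print(port["network_id"])
```

Note that the sketch cannot create a port at all until the host is known.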

While the logical model and workflow you describe here make sense, I have
the impression that:
1) The front network is not a neutron logical network, because it does not
really behave like a network, with the only exception that you can pass its
id to the nova API. To reinforce this, consider that the front network
basically has no ports.
2) from a topological perspective the front network "kind of" behaves like
an external network; but it isn't. The front network is not really a common
gateway for all backing networks, more like a label which is attached to
the router which interconnects all the backing networks.
3) More on topology. How can we know that all these segments will always be
connected by a single logical router? Using static routes (or BGP, if one
day it becomes a thing), it is already possible to implement multi-segment
networks with L3 connectivity using multiple logical routers, isn't it?
4) Point #5 is making assumptions about network-aware scheduling. I am not
sure we already have the ability to tell the nova scheduler to deploy an
instance on a host where a given network is available.
5) I think that I would treat the "front" network as a "network group" or
"cluster". I noticed the term "subnet cluster" is used in the etherpad. I
find this term appropriate because it seems to me that in this scenario the
final user does not care at all about the network understood as an L2 segment.
6) It seems one of the purposes of using backing networks is to identify an
address block for the ports being created. But then how would that play
with mobile address blocks? From an instance workflow perspective, should
instances be associated with one or more address blocks at boot time?
7) What happens if a user attaches a router to a backing network and
connects that router to an external network? Does that become a gateway for
all backing networks or just for that network? And what would the workflow
be for uplinking a front network to an external network?

> >
> > Using Neutron networks to model the segments allows us to fully
> > specify the details of each network using the regular Neutron model.
> > They could be heterogeneous or homogeneous, it doesn't matter.
> You've probably seen Robert Kukura's comment on the related bug at
> https://bugs.launchpad.net/neutron/+bug/1458890/comments/30, and there
> is a useful detailed description of how the multiprovider extension
> works at
> https://bugs.launchpad.net/openstack-api-site/+bug/1242019/comments/3.
> I believe it is correct to say that using multiprovider would be an
> effective substitute for using multiple backing networks with different
> {network_type, physical_network, segmentation_id}, and that logically
> multiprovider is aiming to describe the same thing as this email thread
> is, i.e. non-overlay mapping onto a physical network composed of
> multiple segments.

> However, I believe multiprovider does not (per se) address the IP
> addressing requirement(s) of the multi-segment scenario.

Indeed it does not. The multiprovider extension simply indicates that a
network can be built using different L2 segments.
It is then up to the operator to ensure that these segments are correct,
and it's up to whatever is running in the backend to ensure that instances
on the various segments can communicate with each other.
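
As an illustration, this is roughly what a multiprovider network create
request carries. The "segments" list and the "provider:*" attribute names are
as I understand the extension, so please check them against the API
reference; the helper function, the network name and the physical network
labels are made up for the example.

```python
import json


def multiprovider_network_body(name, segments):
    """Build a request body from (network_type, physical_network, segmentation_id) tuples."""
    body_segments = [
        {
            "provider:network_type": network_type,
            "provider:physical_network": physical_network,
            "provider:segmentation_id": segmentation_id,
        }
        for network_type, physical_network, segmentation_id in segments
    ]
    return {"network": {"name": name, "segments": body_segments}}


body = multiprovider_network_body(
    "routed-net",  # hypothetical network name
    [("vlan", "rack1", 101), ("vlan", "rack2", 102)],  # hypothetical segments
)
print(json.dumps(body, indent=2))
```

Each segment is described purely in L2 terms. Nothing in the body says which
address block belongs to which segment, or how traffic gets from one segment
to another.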

I believe the ask here is for Neutron to provide this capability (the
neutron reference control plane currently doesn't). It is not yet entirely
clear to me whether there's a real need to change the logical model, but
IP addressing implications might be a reason, as pointed out by Neil.

> >
> > This proposal offers a clear separation between the statically bound
> > and the mobile address blocks by associating the former with the
> > backing networks and the latter with the front network.  The mobile
> > addresses are modeled just like floating IPs are today but are
> > implemented by some plugin code (possibly without NAT).
> Couldn't the mobile addresses be _exactly_ like floating IPs already
> are?  Why is anything different from floating IPs needed here?
> >
> > This proposal also provides some advantages for integrating dynamic
> > routing.  Since each backing network will, by necessity, have a
> > corresponding router in the infrastructure, the relationship between
> > dynamic routing speaker, router, and network is clear in the model:
> > network <-> speaker <-> router.

Ok. But how does that change because of backing networks? I believe the same
relationship holds true for every network, or am I wrong?

> I'm not sure exactly what you mean here by 'dynamic routing', but I
> think this touches on a key point: can IP routing happen anywhere in a
> Neutron network, without being explicitly represented by a router object
> in the model?
> I think the answer to that should be yes.

But this would also mean that we should consider doing without the very
concept of a router in Neutron.
If we look at the scenarios we're describing here, I'd agree with you, but
unfortunately Neutron is required to serve a wide variety of scenarios.

> It clearly already is in the
> underlay if you are using tunnels - the tunnel between two compute hosts
> may require multiple IP hops across the fabric.  At the network level
> that Neutron networks currently model, the answer is currently no, but I
> think it's interesting to consider changing that.
> >
> > Proposal 2:
> >
> > This alternate model has not been fully fleshed out.
> I should begin by admitting the blame here.  Much of this is a
> half-baked idea from me, that I haven't yet had time to explore
> properly.  However....
> >   Some parts of it
> > are still unclear to me.  The basic idea is to give the IPAM system
> > information about IP availability on a given host.  When creating a
> > port, the binding information would be sent to the IPAM system and the
> > system would choose an appropriate address block for the allocation.

To make a link to proposal #1, I read this as informing the IPAM system of
which backing network(s) can be implemented on the host which has been
But I am not 100% convinced that the two proposals implement the same

> Right.  A key requirement, for this to be possible, is that Nova's host
> selection happens before the IPAM system is asked to allocate an IP
> address.  I have an action to investigate that, but if anyone happens to
> know already, please do say.

I am 99.99% sure this is not possible at the moment unless something is
done to make nova scheduler network aware.
Also, this will add a point of coupling between the instance boot and
network provisioning processes, which are independent at the moment.
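
A minimal sketch of the ordering dependency, assuming a hypothetical
HostAwareIpam that is told which address blocks are usable on each host. None
of these names exist in Neutron's IPAM interface; they are only here to show
why the host has to be known first.

```python
import ipaddress


class HostAwareIpam:
    """Toy IPAM that picks an address block based on the port's host binding."""

    def __init__(self, blocks_by_host):
        # host name -> list of ip_network objects usable on that host
        self._blocks = {
            host: [ipaddress.ip_network(cidr) for cidr in cidrs]
            for host, cidrs in blocks_by_host.items()
        }
        self._allocated = set()

    def allocate(self, host):
        """Return the first free address from a block available on the host."""
        if host is None:
            # Without a scheduling decision there is no way to choose a block.
            raise ValueError("host must be known before an address is allocated")
        for block in self._blocks.get(host, []):
            for addr in block.hosts():
                if addr not in self._allocated:
                    self._allocated.add(addr)
                    return str(addr)
        raise LookupError(f"no free address available on host {host!r}")


ipam = HostAwareIpam({
    "compute-1": ["10.0.1.0/30"],  # hypothetical per-host availability
    "compute-2": ["10.0.2.0/24"],
})
first = ipam.allocate("compute-1")
second = ipam.allocate("compute-1")
other = ipam.allocate("compute-2")
print(first, second, other)
```

The allocate(None) case is the interesting one: if the address is requested
before the scheduler has picked a host, this scheme has nothing to work with.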

> >
> > 1. This alternate model offers no way to distinguish the two types of
> > address blocks.
> Agreed.  But I wonder if normal floating IPs can be used for the mobile
> IP addresses (as also suggested above).

I get the concept, but it's not really a floating IP in neutron terms, as
that implies SNAT/DNAT.
Also, from what I gather it's not about single mobile addresses, but we're
talking about entire subnets that can be moved around.

> > 2. We don't have the benefit of modeling the segments with Neutron
> > networks.
> Agreed, but it appears that multiprovider has already taken a different
> view here, and already provides the ability for a network to map to
> multiple {network_type, physical_network, segmentation_id} tuples.

Modelling segments as logical networks is not necessarily a benefit in my
opinion; it's more a convenience. For instance the reference control plane might
implement provider networks in a way such that:
1) a "ghost router" is created in the l3 agent to ensure E-W traffic across
all segments (the router is "ghost" because it's not exposed as a neutron
logical router)
2) a distinct dnsmasq instance is started on every segment of the network
to ensure DHCP functionality
3) metadata services can be provided through the ghost router rather than
using isolated metadata

I think this alternative is worth exploring anyway.
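
To give an idea of what the three points above would mean for the reference
control plane, here is a sketch of the per-network plan it would have to
derive. The function, the dictionary keys and the service labels are all
hypothetical; this is not how the l3 or dhcp agents are structured today.

```python
def plan_services(network_id, segments):
    """Derive backend services for a multi-segment provider network.

    segments is a list of (segment_id, cidr) tuples.
    """
    plan = [{
        "service": "ghost-router",  # point 1: E-W routing across all segments
        "network": network_id,
        "exposed_in_api": False,    # "ghost": never shown as a logical router
        "interfaces": [segment_id for segment_id, _ in segments],
    }]
    for segment_id, cidr in segments:
        # point 2: one dnsmasq instance per segment keeps DHCP local to it
        plan.append({"service": "dnsmasq", "segment": segment_id, "cidr": cidr})
    # point 3: metadata reached through the ghost router, not isolated metadata
    plan.append({"service": "metadata-proxy", "via": "ghost-router"})
    return plan


plan = plan_services("net-1", [("seg-a", "10.0.1.0/24"), ("seg-b", "10.0.2.0/24")])
for entry in plan:
    print(entry)
```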

> >
> > It was suggested that hierarchical port binding could help here but I
> > see it as orthogonal to this.  Hierarchical port binding extends the
> > L2 properties of a port to a hierarchical infrastructure to achieve
> > continuous L2 connectivity.  It is also intended for overlay networks.
> > That isn't what we're doing here and I don't think it fits.
> >
> > I have also considered the multi-provider extension [3] for this.
> > This is not yet clear to me either.  First, my understanding was that
> > this extension describes multi-segment continuous L2 fabrics.
> https://bugs.launchpad.net/openstack-api-site/+bug/1242019/comments/3
> says:
> "Note that, although ML2 can manage binding to multi-segment networks,
> neutron does not manage bridging between the segments of a multi-segment
> network. This is assumed to be done administratively."
> So I think it is not intended for a multiprovider network to be
> "continuous".
> Again, this touches on the point above about routing happening without
> being explicitly represented in the Neutron model...
> >   Second,
> > there doesn't seem to be any host binding aspect to the multi-provider
> > extension.  Third, not all L2 plugins support this extension.  It
> > seems silly to require L2 plugin support in order to enable routing
> > between segments.
> Good point.  If all plugins required the same kind of transformation to
> support multiprovider, perhaps that's telling us that the multi-ness
> should instead be in a layer above, more like your proposal 1.
> >
> > It isn't clear to me how a dynamic routing speaker will fit in to this
> > model.  My first thought is that it must be integrated with IPAM
> > because the IPAM system has the understanding of how to map address
> > blocks to infrastructure.  This pushes even more infrastructure
> > knowledge down to the IPAM system.  If dynamic routing is pushed down
> > to the IPAM system, it will also be necessary to push the association
> > of mobile IPs or routed tenant subnets down in to the IPAM system too.
> > This means Neutron needs to tell IPAM about every floating IP
> > association and every tenant subnet behind a Neutron router in the
> > same address scope as the external network.  I'm not convinced that
> > IPAM and routing really belong together like this.
> I'm afraid I don't yet sufficiently understand the 'dynamic routing'
> requirements here.  Can you say more about them?
> >
> > If you made it this far in this email, you must have some feedback.
> > Please help us out.
> There are a lot of moving parts here.  I'm afraid I don't yet see any
> clarity, but perhaps if we talk about this enough, that will eventually
> emerge!
> Regards,
>     Neil