[openstack-dev] [Neutron][L3] Representing a networks connected by routers

Ian Wells ijw.ubuntu at cack.org.uk
Tue Jul 21 22:55:18 UTC 2015

On 21 July 2015 at 12:11, John Belamaric <jbelamaric at infoblox.com> wrote:

>  Wow, a lot to digest in these threads. Let me summarize my
> understanding of the two proposals; let me know whether I get this right.
> There are a couple problems that need to be solved:
>   a. Scheduling based on host reachability to the segments

So actually this is something Assaf and I were debating on IRC, and I think
it depends what you're aiming for.

Imagine you have connectivity for a 'network' to every host, but that
connectivity only works if you get a host-specific address because the
address range is different per host.  This seems to be the use case we come
back to.  (There's a corner case of this where the network is not available
on every host and that gets you different requirements, but for now, this.)

You *can* use the current mechanism: allocate address, schedule, run -
providing your scheduler respects the address you've been allocated and
puts you on a host that can reach this address.  This is a silly approach.
The address is allocated via Neutron, for a port that is entirely
disassociated from the VM it's going to be attached to, while most of the
scheduling constraints live in Nova - so at allocation time you can't tell
whether the address is on a machine that can even run the VM.

You can delay address allocation - then the machine can be scheduled
anywhere because the address it has is not a constraint.  This saves any
change to scheduling at all - normal scheduling rules apply, excepting the
case where addresses are exhausted on that machine, and in that case we'd
probably use the retry mechanism as a fallback to find a better place until
someone works out it's not really just a *nova* scheduler.
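
As a rough illustration of the delayed-allocation flow with retry as the
fallback, here is a minimal Python sketch.  Everything in it is hypothetical:
HostPool, PoolExhausted and schedule_then_allocate are made-up names for
illustration, not Nova or Neutron code, and the candidate list stands in for
whatever ordering the scheduler would normally produce.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


class PoolExhausted(Exception):
    """Raised when an address range has no free addresses left."""


@dataclass
class HostPool:
    """Hypothetical per-host address range (not a Neutron object)."""
    cidr: str
    used: Set[str] = field(default_factory=set)

    def allocate(self) -> str:
        # Hand out the first unused host address in this host's range.
        for addr in ipaddress.ip_network(self.cidr).hosts():
            if str(addr) not in self.used:
                self.used.add(str(addr))
                return str(addr)
        raise PoolExhausted(self.cidr)


def schedule_then_allocate(candidates: List[str],
                           pools: Dict[str, HostPool]) -> Tuple[str, str]:
    """Pick a host first, then allocate from that host's range.

    `candidates` is assumed to be already ordered by normal scheduling
    rules; address exhaustion just moves on to the next candidate.
    """
    for host in candidates:
        try:
            return host, pools[host].allocate()
        except PoolExhausted:
            continue  # retry fallback: try the next host
    raise PoolExhausted("no candidate host has a free address")
```

With two hosts that each own a /30 (two usable addresses), the first two
requests land on the first host and the third falls through to the second,
without the address ever having constrained the initial choice of host.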

>  b. Floating IP functionality across the segments. I am not sure I am
> clear on this one but it sounds like you want the routers attached to the
> segments to advertise routes to the specific floating IPs. Presumably then
> they would do NAT or the instance would assign both the fixed IP and the
> floating IP to its interface?

That's the summary.  And I don't think anyone is clear on this and I also
don't know that anyone has specifically requested this.

>  In Proposal 1, (a) is solved by associating segments to the front network
> via a router - that association is used to provide a single hook into the
> existing API that limits the scope of segment selection to those associated
> with the front network. (b) is solved by tying the floating IP ranges to
> the same front network and managing the reachability with dynamic routing.
>  In Proposal 2, (a) is solved by tagging each network with some meta-data
> that the IPAM system uses to make a selection.

The distinction is actually pretty small.  The same backing data exists for
the IPAM to use - the difference is only that in (1) it's there as a misuse
of networks and in (2) it's not specified.

> This implies an IP allocation request that passes something other than a
> network/port to the IPAM subsystem.

This is where I started - there is nothing to pass when I run 'neutron
port-create' except for a network and this is where address allocation
happens today.  We need a mechanism to defer address allocation and
indicate that the port has no address right now.
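
To make the deferred-address idea concrete, here is a small Python sketch of
a port that is created with only a network and gets its address at binding
time.  This is not the Neutron datamodel or API: Port, create_port, bind_port
and the ipam callable are hypothetical names, and using None as the "no
address yet" marker is just one possible choice.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Port:
    """Hypothetical port record; fields are illustrative only."""
    network_id: str
    host: Optional[str] = None        # filled in when the port is bound
    ip_address: Optional[str] = None  # None means "allocation deferred"


def create_port(network_id: str) -> Port:
    # At creation time only the network is known, so no address is chosen.
    return Port(network_id=network_id)


def bind_port(port: Port, host: str,
              ipam: Callable[[str, str], str]) -> Port:
    """Bind the port to a host and allocate its address at that point.

    `ipam` is an assumed hook taking (network_id, host) and returning an
    address that is valid for that host.
    """
    port.host = host
    if port.ip_address is None:
        port.ip_address = ipam(port.network_id, host)
    return port
```

The point of the sketch is only that the IPAM call moves from creation to
binding, where the host is finally known; what the network controller then
does with the address is left out.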

> This is fine from the IPAM point of view but there is no corresponding API
> for this right now. To solve (b) either the IPAM system has to publish the
> routes

It needs to ensure there's enough information on the port that the network
controller can push the routes, is the way I think of it.

> or the higher level management has to ALSO be aware of the mappings
> (rather than just IPAM).
>  To throw some fuel on the fire, I would argue also that (a) is not
> sufficient and address availability needs to be considered as well (as
> described in [1]). Selecting a host based on reachability alone will fail
> when addresses are exhausted. Similarly, with (b) I think there needs to be
> consideration during association of a floating IP to the effect on routing.
> That is, rather than a huge number of host routes it would be ideal to
> allocate the floating IPs in blocks that can be associated with the backing
> networks (though we would want to be able to split these blocks as small as
> a /32 if necessary - but avoid it/optimize as much as possible).

Again - the scheduler is simplistic and nova-centric as things stand, and I
think we all recognise this.  The current fallbacks work, but they're

>  In fact, I think that these proposals are more or less the same - it's just
> in #1 the meta-data used to tie the backing networks together is another
> network.


> This allows it to fit in neatly with the existing APIs. You would still
> need to implement something prior to IPAM or within IPAM that would select
> the appropriate backing network.
>  As a (gulp) third alternative, we should consider that the front network
> here is in essence a layer 3 domain, and we have modeled layer 3 domains as
> address scopes in Liberty. The user is essentially saying "give me an
> address that is routable in this scope" - they don't care which actual
> subnet it gets allocated on. This is conceptually more in-line with [2] -
> modeling L3 domain separately from the existing Neutron concept of a
> network being a broadcast domain.

Again, the issue is that when you ask for an address you tend to have quite
a strong opinion of what that address should be if it's location-specific.

> Fundamentally, however we associate the segments together, this comes down
> to a scheduling problem.

It's not *solely* a scheduling problem, and that is my issue with this
statement (Assaf has been saying the same).  You *can* solve this
*exclusively* with scheduling (allocate the address up front, hope that the
address has space for a VM with all its constraints met) - but that
solution is horrible; or you can solve this largely with allocation, where
scheduling helps to deal with pool exhaustion - mainly another sort of
problem, but one in which scheduling plays a part.

>  Nova needs to be able to incorporate data from Neutron in its scheduling
> decision. Rather than solving this with a single piece of meta-data like
> network_id as described in proposal 1, it probably makes more sense to
> build out the general concept of utilizing network data for nova
> scheduling. We could still model this as in #1, or using address scopes, or
> some arbitrary data as in #2. But the harder problem to solve is the
> scheduling, not how we tag these things to inform that scheduling.
>  The optimization of routing for floating IPs is also a scheduling
> problem, though one that would require a lot more changes to how FIP are
> allocated and associated to solve.
>  John
>  [1] https://review.openstack.org/#/c/180803/
> [2] https://bugs.launchpad.net/neutron/+bug/1458890/comments/7
>   On Jul 21, 2015, at 10:52 AM, Carl Baldwin <carl at ecbaldwin.net> wrote:
>  On Jul 20, 2015 4:26 PM, "Ian Wells" <ijw.ubuntu at cack.org.uk> wrote:
> >
> > There are two routed network models:
> >
> > - I give my VM an address that bears no relation to its location and
> ensure the routed fabric routes packets there - this is very much the
> routing protocol method for doing things where I have injected a route into
> the network and it needs to propagate.  It's also pretty useless because
> there are too many host routes in any reasonable sized cloud.
> >
> > - I give my VM an address that is based on its location, which only
> becomes apparent at binding time.  This means that the semantics of a port
> changes - a port has no address of any meaning until binding, because its
> location is related to what it does - and it leaves open questions about
> what to do when you migrate.
> >
> > Now, you seem to generally be thinking in terms of the latter model,
> particularly since the provider network model you're talking about fits
> there.  But then you say:
> Actually, both.  For example, GoDaddy assigns each vm an ip from the
> location based address blocks and optionally one from the routed location
> agnostic ones.  I would also like to assign router ports out of the
> location based blocks which could host floating ips from the other blocks.
> > On 20 July 2015 at 10:33, Carl Baldwin <carl at ecbaldwin.net> wrote:
> >>
> >> When creating a
> >> port, the binding information would be sent to the IPAM system and the
> >> system would choose an appropriate address block for the allocation.
> Implicit in both is a need to provide at least a hint at host binding.
> Or, delay address assignment until binding.  I didn't mention it because my
> email was already long.
> This is something we discussed, but it applies equally to both proposals.
> > No, it wouldn't, because creating and binding a port are separate
> operations.  I can't give the port a location-specific address on creation
> - not until it's bound, in fact, which happens much later.
> >
> > On proposal 1: consider the cost of adding a datamodel to Neutron.  It
> has to be respected by all developers, it frequently has to be deployed by
> all operators, and every future change has to align with it.  Plus either
> it has to be generic or optional, and if optional it's a burden to some
> proportion of Neutron developers and users.  I accept proposal 1 is easy,
> but it's not universally applicable.  It doesn't work with Neil Jerram's
> plans, it doesn't work with multiple interfaces per host, and it doesn't
> work with the IPv6 routed-network model I worked on.
> Please be more specific.  I'm not following your argument here.  My
> proposal doesn't really add much new data model.
> We've discussed this with Neil at length.  I haven't been able to
> reconcile our respective approaches into one model that works for both of
> us and still provides value.  The routed segments model needs to somehow
> handle the L2 details of the underlying network.  Neil's model confines L2
> to the port and routes to it.  The two models can't just be squished
> together unless I'm missing something.
> Could you provide some links so that I can brush up on your ipv6 routed
> network model?  I'd like to consider it but I don't know much about it.
> > Given that, I wonder whether proposal 2 could be rephrased.
> >
> > 1: some network types don't allow unbound ports to have addresses, they
> just get placeholder addresses for each subnet until they're bound
> > 2: 'subnets' on these networks are more special than subnets on other
> networks.  (More accurately, they don't use subnets.  It's a shame subnets
> are core Neutron, because they're pretty horrible and yet hard to replace.)
> > 3: there's an independent (in an extension?  In another API endpoint?)
> datamodel that the network points to and that IPAM refers to to find a port
> an address.  Bonus, people who aren't using funky network types can disable
> this extension.
> > 4: when the port is bound, the IPAM is referred to, and it's told the
> binding information of the port.
> > 5: when binding the port, once IPAM has returned its address, the
> network controller probably does stuff with that address when it completes
> the binding (like initialising routing).
> > 6: live migration either has to renumber a port or forward old traffic
> to the new address via route injection.  This is an open question now, so
> I'm mentioning it rather than solving it.
> I left out the migration issue from my email also because it also affects
> both proposals equally.
> > In fact, adding that hook to IPAM at binding plus setting aside a 'not
> set' IP address might be all you need to do to make it possible.  The IPAM
> needs data to work out what an address is, but that doesn't have to take
> the form of existing Neutron constructs.
> What about the L2 network for each segment?  I suggested creating provider
> networks for these.  Do you have a different suggestion?
> What about distinguishing the bound address blocks from the mobile address
> blocks?  For example, the address blocks bound to the segments could be
> from a private space. A router port may get an address from this private
> space and be the next hop for public addresses.  Or, GoDaddy's model where
> vms get an address from the segment network and optionally a floating ip
> which is routed.
> Carl
>  __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev