[openstack-dev] [Neutron][L3] Representing a networks connected by routers
Carl Baldwin
carl at ecbaldwin.net
Mon Aug 3 18:27:11 UTC 2015
Kevin, sorry for the delay in response. Keeping up on this thread was
getting difficult while on vacation.
tl;dr: I think it is worth it to talk through the idea of inserting
some sort of a "subnet group thing" in the model to which floating ips
(and router external gateways) will associate. It has been on my mind
for a long time now. I didn't pursue it because a few informal
attempts to discuss it with others indicated to me that it would be a
difficult heavy-lifting job that others may not appreciate or
understand. Scroll to the bottom of this message for a little more on
this.
Carl
On Tue, Jul 28, 2015 at 1:15 AM, Kevin Benton <blak111 at gmail.com> wrote:
>>Also, in my proposal, it is more the router that is the grouping mechanism.
>
> I can't reconcile this with all of the points you make in the rest of your
> email. You want the collection of subnets that a network represents, but you
> don't want any other properties of the network.
This is closer to what I'm trying to say but isn't quite there. There
are some subnets that should be associated with the segments
themselves and there are some that should be associated with the
collection of segments. I want floating IPs that are not tied to an
L2 network. None of the alternate proposals that I'd heard addressed
this.
>>that the network object is currently the closest thing we have to
>> representing the L3 part of the network.
>
> The L3 part of a network is the subnets. You can't have IP addresses without
> the subnets, you can't have floating IPs without the subnets, etc.
You're right, but in the current model you can't have IP addresses
without the network either, which is actually the point I'm trying to
make.
> A Neutron network is an L2 construct that encapsulates L3 things. By
> encapsulating them it also provides an implicit grouping. The routed
> networks proposal basically wants that implicit grouping without the
> encapsulation or the L2 part.
This sounds about right. I think it is wrong to assume that we need
an L2 network to encapsulate L3 things. I'm feeling restricted by the
model and the insistence that a neutron network is a purely L2
construct.
>>We don't associate floating ips with a network because we want to arp for
>> them. You're taking a consequence of the current model and its constraints
>> and presenting that as the motivation for the model. We do so because there
>> is no better L3 object to associate it to.
>
> Don't make assumptions about how people use floating IPs now just because it
> doesn't fit your use-case well. When an external network is implemented as a
> real Neutron network (leaving external_network_bridge blank like we suggest
> in the networking guide), normal ports can be attached and can
> co-exist/communicate with the floating IPs because it behaves as an L2
> network exactly as implied by the API. The current model works quite well if
> your fabric can extend the external network everywhere it needs to be.
Yes, "when an external network is implemented as a real Neutron
network" all of this is true and my proposal doesn't change any of
this. I wasn't making any such assumptions. I acknowledge that the
current model works well in this case and didn't intend to change it
for current use cases. It is precisely because it does not fit my use
case well that I'm pursuing this.
Notice that a network marked only as external doesn't allow normal
tenants to create ports. It must also be marked shared to allow it.
Unless tenants are creating regular ports, they really don't care
if arp or anything else L2 is involved because such an external
network is meant to give access external to the cloud where L2 is
really just an implementation detail. It is the deployer that cares
because of whatever infrastructure (like a gateway router) needs to
work with it. If the L2 is important, then the deployer will not
attempt to use an L3-only network; she will use the same kinds of
networks as always.
The bad assumption here is that floating IPs need an explicit
association with an L2-only construct: tenants allocate a floating
IP by selecting the Neutron network, and it is recorded in the DB that
way. Tenants aren't even allowed to see the subnets on an external
network. This is counter-intuitive to me because I believe that, in
most cases, tenants want a floating IP to get L3 access to the world
(or a part of it) that is external to OpenStack. Yet, they can only
see the L2 object? These are the factors that make me view the
Neutron network as an L2 + L3 construct.
> If you don't want floating IPs to be reachable on the network they are
> associated with, then let's stop associating them with a network and instead
> start associating them with a group of subnets from which they allocate IPs.
Okay. I'm willing to take a serious look at this. This isn't merely
associating a floating IP with a subnet. The tenant shouldn't need to
know about the individual cidrs of the L3 network and wonder if there
is some significant difference or "flavor" of each. I think we truly
need something that represents the group of subnets as you said.
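To make the "group of subnets" idea concrete, here is a minimal
sketch of how a tenant could allocate a floating IP from a group
without seeing the individual cidrs. It is purely illustrative:
SubnetGroup and allocate_floating_ip are hypothetical names, not
existing Neutron objects or API calls, and the first-free allocation
policy is an assumption.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SubnetGroup:
    """Hypothetical group of subnets; not an existing Neutron object."""
    name: str
    cidrs: List[str]
    allocated: Set[str] = field(default_factory=set)

    def allocate_floating_ip(self) -> str:
        """Return the first free host address from any subnet in the group.

        The caller names only the group, never an individual cidr.
        """
        for cidr in self.cidrs:
            for host in ipaddress.ip_network(cidr).hosts():
                address = str(host)
                if address not in self.allocated:
                    self.allocated.add(address)
                    return address
        raise RuntimeError("no free addresses in group %s" % self.name)


# Documentation-range cidrs, used here only as sample data.
external = SubnetGroup("external", ["203.0.113.0/30", "198.51.100.0/30"])
first = external.allocate_floating_ip()
second = external.allocate_floating_ip()
third = external.allocate_floating_ip()  # first subnet is full, spills over
```

The point of the sketch is only that the allocation request references
the group, so the tenant never has to choose between cidrs.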
>>If we insist on a new object for the L3 part of a network then I'd say we
>> had better have an L3 only port to connect to it.
>
> I don't think a new port type is necessary. We can just make the network ID
> nullable for a port to indicate that it isn't attached to a Neutron network
> since it won't be. It would then have a relationship to its associated
> subnet via fixed_ips as it does now.
Is this really so different from what I'm trying to do with networks?
Make the L2 part nullable.
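The two variants can be put side by side in a rough sketch. The class
and field names below are assumptions for illustration only (in
particular l2_segment is hypothetical); neither class mirrors the
actual Neutron DB models.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Port:
    """Variant from the quoted text: the port's network_id may be None."""
    fixed_ips: List[Dict[str, str]]    # each entry links to a subnet
    network_id: Optional[str] = None   # None: not attached to a network


@dataclass
class Network:
    """Variant from the reply: the network's L2 part may be None."""
    subnet_ids: List[str]
    l2_segment: Optional[str] = None   # hypothetical field; None: L3 only


# Sample data; the ids and address are made up.
port = Port(fixed_ips=[{"subnet_id": "subnet-a",
                        "ip_address": "203.0.113.5"}])
l3_network = Network(subnet_ids=["subnet-a", "subnet-b"])
```

In both cases one reference to L2 becomes optional; the difference is
whether the grouping of subnets stays on the network object or moves
elsewhere.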
>>The subnet is not the L3 object that we're looking for. You may wish it
>> were but that does not make it so.
>
> I never said a subnet is what we are looking for. The group of subnets is
> what we seem to be after.
Agreed as I stated above.
>>Ignoring the forced dependence on L2, the subnets still don't stand alone
>> to describe just the L3 part, they must be linked to a network to make any
>> sense.
>
> They don't need to be. If we made the network_id nullable on ports and
> subnets, we could still have ports associated with subnets. The network
> portion is the L2 portion. You don't want L2 so don't ask for the network.
In your model, should the port be associated with the "group of
subnets" at all? I'm not sure I see a need for it to be directly
associated but I haven't thought it all the way through.
> I understand that we want a way to reference collections of subnets and
> create ports that allocate IPs from them. Networks provide that now, but
> they imply an L2 domain for the ports, which we don't want. So we are trying
> to change what a network implies for this one special case, which is going
> to have ripple effects everywhere.
>
> Here are some areas where I can already see we will need special-casing:
>
> DHCP agent scheduling - broadcast doesn't work on L3 networks, every compute
> node will need to use the direct tap attachment logic Neil brought up.
My proposal only created the ports on the network segments.
Admittedly, that is the ugliest part of the proposal but it did
obviate the need for DHCP or arp for the port.
> DHCP lease generation - a port can't get the normal subnet mask for the L3
> network it's attached to because it would try to ARP for addresses in the
> same subnet, which are actually somewhere else.
See above.
> Router interface attachment - what happens when a user attaches one
> interface to a regular network and one to an L3 network? Before they were
> all L2 networks so it didn't matter, but now none of the ports will be
> reachable on the L3 network without route table manipulation.
> (or) Router creation - to avoid the above you can have different router
> types that can only attach to one or the other.
Not sure I'm following here. The ports are all created on the segments.
> Every L2 attribute related to networks - we will need logic in the API that
> hides these fields or marks them as invalid and to generate an error if the
> user tries to update them.
> Multi-provider segments - We can't let a user add an L3 segment to a network
> with L2 segments (e.g. VXLAN, VLAN). Same goes for the inverse.
> Hierarchical port binding - coordinating ToRs for VXLAN+VLAN is l2 encap. L3
> would need route propagation logic instead.
All the ports are still connected to L2 network segments so I don't
think this is an issue.
> Every plugin, service, tool, etc, built on the assumption that a Neutron
> network is L2.
ok
> Port creation - If you aren't doing the full l3 like Neil's proposal, you
> need to intercept port creation requests and schedule the port to one of the
> underlying regular networks. The port then has a different network_id than
> the one requested, or we have more special-casing to hide that.
ok, this was called out and I admit it is the ugliest part of the proposal.
> All of those will be branches in the codebase to handle current Neutron
> networks vs L3 networks. If we go down this route, it will be even worse
> than the conditionals we have to support DVR in ML2 because we are exposing
> it via the API. It's technical debt that we will not be able to get rid of.
I don't think it is nearly as bad as you make it out to be.
> I would rather see something to reference a group of subnets that can be
> used for floating IP allocation and port creation in lieu of a network ID
> than the technical debt that conditionally redefining a network will bring.
I'm willing to discuss this further. In fact, it has been on my mind
for a while now. This is essentially where I started. I ended up
with my current proposal because I perceived a lot more difficulty in
doing this than in the proposal I created. But, your perspective from
the other side of the problem is worth considering.
I'm glad to see that at least one other person seems to be
understanding the problem here. This will have API and end-user
impact, as it changes the way that end users interact with floating IPs
at least. It will also affect the way that neutron routers are
associated to a network. Today's use case where a user connects a
router to an external network will also change. To what extent do we
support backward compatibility for existing work-flows? For example,
can a user port an existing work-flow to a cloud where the "external
network" is now a group of subnets routed among segments instead of a
neutron network?
Carl