[neutron] Multi-segment per host support for routed networks
Neutron land,

Problem:
Neutron currently allows only a single network segment per host. This becomes a problem when networking teams want to limit the number of IPs supported on a segment. At times, the number of IPs available to the host then becomes the limiting factor for the number of instances we can deploy on that host.
Ref: https://bugs.launchpad.net/neutron/+bug/1764738

Ongoing Work:
We are excited about our work to add "multi-segment support for routed networks". We currently have a proof of concept here https://review.openstack.org/#/c/623115 that, for routed networks, effectively:
* Removes validation preventing multiple segments.
* Injects segment_id into fixed IP records.
* Uses the segment_id when creating a bridge (rather than network_id). In effect, it gives each segment its own bridge.

It works pretty well for new networks and deployments. For existing routed networks, however, it breaks networking. Please use *caution* if you decide to try it.

TODOs:
Things to do before this is fully baked:
* Need to add code to ensure bridges are also updated/deleted using the segment_id (rather than network_id).
* Need to add something (a feature flag?) that prevents this from breaking routed networks when a cloud admin updates to master and is configured for routed networks.
* Need to create checker and upgrade migration code that will convert existing bridges from network_id based to segment_id based (ideally live or with little network traffic downtime). Once converted, the feature flag could enable the feature and start using the new code.

Need:
1. How does one go about adding a migration tool? Maybe some examples?
2. Will nova need to be notified/upgraded to have bridge-related files updated?
3. Is there a way to migrate without (or with minimal) downtime?
4. How to repeatably test this migration code? Grenade?

Looking for any ideas that can keep this moving :)

Thanks a ton,

David Bingham (wwriverrat on irc)
Kris Lindgren (klindgren on irc)
Cloud Engineers at GoDaddy
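
To make the bridge-per-segment idea and the migration TODO above a bit more concrete, here is a minimal sketch. It is not code from the proof of concept. It assumes bridge names are built from a "brq" prefix plus the first 11 characters of a UUID, and the `multi_segment_enabled` flag and the `plan_bridge_renames` helper are hypothetical names used only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumptions for illustration: bridge names are "brq" plus the first
# 11 characters of the resource UUID.
BRIDGE_NAME_PREFIX = "brq"
RESOURCE_ID_LENGTH = 11


def bridge_name(resource_id: str) -> str:
    """Build a bridge name from a network_id or a segment_id."""
    return BRIDGE_NAME_PREFIX + resource_id[:RESOURCE_ID_LENGTH]


def bridge_name_for_port(network_id: str,
                         segment_id: Optional[str],
                         multi_segment_enabled: bool) -> str:
    """Pick the bridge for a port.

    multi_segment_enabled is a hypothetical feature flag: while it is off,
    the network_id based name is kept so existing deployments do not break.
    """
    if multi_segment_enabled and segment_id:
        return bridge_name(segment_id)
    return bridge_name(network_id)


def plan_bridge_renames(existing_bridges: List[str],
                        host_segments: Dict[str, List[str]]
                        ) -> List[Tuple[str, str]]:
    """Return (old_name, new_name) pairs for a network_id -> segment_id move.

    host_segments maps network_id to the segment ids bound on this host.
    Only networks with exactly one segment on the host are planned, since
    that is the only case where the old bridge maps to one new bridge.
    """
    present = set(existing_bridges)
    renames = []
    for network_id, segment_ids in sorted(host_segments.items()):
        old_name = bridge_name(network_id)
        if old_name not in present or len(segment_ids) != 1:
            continue  # nothing to migrate, or ambiguous target
        new_name = bridge_name(segment_ids[0])
        if new_name != old_name:
            renames.append((old_name, new_name))
    return renames
```

A checker could run `plan_bridge_renames` first as a dry run and only flip the flag once the list comes back empty. Because a host has at most one segment per network before the change, each old bridge should map to exactly one new bridge.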
Hi,
Message written by David G. Bingham <dbingham@godaddy.com> on 01.02.2019, at 19:16:
> Need:
> 1. How does one go about adding a migration tool? Maybe some examples?

I'm not sure how similar this is, but I know that the networking-ovn project has a migration tool for moving from ml2/ovs to ml2/ovn. Maybe it can be helpful for you.

> 2. Will nova need to be notified/upgraded to have bridge-related files updated?

Probably someone from the Nova team should answer that. Maybe Sean Mooney would be a good person to ask?

> 3. Is there a way to migrate without (or with minimal) downtime?
> 4. How to repeatably test this migration code? Grenade?

Again, check how networking-ovn did it; maybe you will be able to do something similar :)

--
Slawek Kaplonski
Senior software engineer
Red Hat

Slawomir and others,

So, we got a little deeper into the "core" fix for this and have some concerns around the "network_map" that is part of CommonAgentManagerRpcCallBackBase. Throughout the code we see references to this map storing a single segment object. This fundamentally makes its storage a one-to-one mapping (key=network_id, value=segment) rather than allowing many segments for a network.

So, we're looking for suggestions from the cores about how deep this solution needs to go to allow a one-to-many relationship.

Some thoughts:
1) Refactor network_map to become segment_map, keyed by segment_id.
2) Refactor network_map to internally store a one-to-many relationship: key=network_id, value=dict(key=segment.id, value=segment).
3) Other ideas?

Each of the above has its own evils.

References:
* https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers...: self.network_map = {}
* https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers...: adds a segment object as if it is a network
* https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers...: assumes it gets an individual segment using network_id
* Many, many other references to "network_map" that use "network_id" to get the individual segment, especially from the inheriting agent driver classes.
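
For comparison, the two network_map options raised in this thread can be sketched side by side. This is only an illustration under assumptions: the `Segment` dataclass is a hypothetical stand-in for whatever object the agent actually stores, and the class and method names are made up here rather than taken from Neutron.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for the segment object kept by the agent."""
    id: str
    network_id: str
    network_type: str
    physical_network: Optional[str] = None
    segmentation_id: Optional[int] = None


class SegmentMap:
    """Option 1: a flat map keyed by segment_id."""

    def __init__(self) -> None:
        self._segments: Dict[str, Segment] = {}

    def add(self, segment: Segment) -> None:
        self._segments[segment.id] = segment

    def get(self, segment_id: str) -> Optional[Segment]:
        return self._segments.get(segment_id)

    def for_network(self, network_id: str) -> List[Segment]:
        # Needs a scan: the network is no longer the lookup key.
        return [s for s in self._segments.values()
                if s.network_id == network_id]

    def remove(self, segment_id: str) -> Optional[Segment]:
        return self._segments.pop(segment_id, None)


class NestedNetworkMap:
    """Option 2: network_id -> {segment_id: segment}."""

    def __init__(self) -> None:
        self._networks: Dict[str, Dict[str, Segment]] = defaultdict(dict)

    def add(self, segment: Segment) -> None:
        self._networks[segment.network_id][segment.id] = segment

    def get(self, network_id: str, segment_id: str) -> Optional[Segment]:
        return self._networks.get(network_id, {}).get(segment_id)

    def for_network(self, network_id: str) -> List[Segment]:
        return list(self._networks.get(network_id, {}).values())

    def remove(self, network_id: str, segment_id: str) -> Optional[Segment]:
        segments = self._networks.get(network_id)
        if not segments:
            return None
        removed = segments.pop(segment_id, None)
        if not segments:
            # Drop the network entry once its last segment is gone.
            del self._networks[network_id]
        return removed
```

With option 1, callers that only know a network_id have to scan or carry the segment_id with them. With option 2, lookups by network_id still work, but every caller that expects a single segment back has to be changed to handle several.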