[neutron] Multi-segment per host support for routed networks

David G. Bingham dbingham at godaddy.com
Mon Mar 4 23:09:52 UTC 2019


Slawomir and others,

So, we got a little deeper into the "core" fix for this and have some concerns around the "network_map" that is part of CommonAgentManagerRpcCallBackBase. Throughout the code we see references to this map storing a segment object. This fundamentally makes its storage a one-to-one mapping (key=network_id, value=segment) rather than allowing many segments for a network.
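
To make the concern concrete, here is a minimal sketch of the one-to-one behaviour. It is not the actual Neutron code: the module-level `network_map`, the `network_added` helper, and the dict-based segments are hypothetical stand-ins for illustration.

```python
# Hypothetical, simplified stand-in for the agent's map (not Neutron code).
# key = network_id, value = a single segment
network_map = {}


def network_added(network_id, segment):
    """Store one segment per network; a later segment replaces the earlier one."""
    network_map[network_id] = segment


# Two segments that belong to the same routed network on one host.
network_added("net-1", {"id": "seg-a", "segmentation_id": 100})
network_added("net-1", {"id": "seg-b", "segmentation_id": 101})

# Only the most recently added segment is still reachable by network_id.
remaining = network_map["net-1"]["id"]
```

With a map shaped like this, the first segment is silently dropped as soon as a second one arrives for the same network.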

So, we're looking for suggestions from the core team: how deep does this change need to go to allow a one-to-many relationship?
Some thoughts:
1) Refactor network_map to become segment_map and store it by its segment_id
2) Refactor network_map to internally store a one-to-many relationship: key=network_id, value=dict(key=segment.id, value=segment)
3) Other ideas?
Each of the above has its own evils
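
As a rough sketch of how options 1 and 2 might look (an illustration only, not a proposed patch): the `Segment` namedtuple, its field names, and the helper functions below are all assumptions, not existing Neutron objects.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the segment object; field names are assumptions.
Segment = namedtuple("Segment", ["id", "network_id", "segmentation_id"])


def build_segment_map(segments):
    """Option 1: flat map keyed by segment id."""
    return {seg.id: seg for seg in segments}


def segments_for_network(segment_map, network_id):
    """Option 1 needs a scan (or a second index) to look up by network."""
    return [s for s in segment_map.values() if s.network_id == network_id]


def build_nested_network_map(segments):
    """Option 2: network_id -> {segment_id: segment}."""
    nested = defaultdict(dict)
    for seg in segments:
        nested[seg.network_id][seg.id] = seg
    return dict(nested)


segments = [
    Segment("seg-a", "net-1", 100),
    Segment("seg-b", "net-1", 101),
    Segment("seg-c", "net-2", 200),
]

segment_map = build_segment_map(segments)          # option 1
network_map = build_nested_network_map(segments)   # option 2
```

Under these assumptions, option 1 makes every existing lookup by network_id a scan or a second index, while option 2 keeps the network_id key but changes the value type that every caller receives.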

References:
* https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/agent/_agent_manager_base.py#L43: self.network_map = {}
* https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/agent/_agent_manager_base.py#L62-L68: adds a segment object as if it is a network
* https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py#L896: assumes it gets individual segment using network_id
* Many, many other references to "network_map" use "network_id" to get the individual segment, especially in the inheriting agent driver classes.

On 2/13/19, 3:07 PM, "Slawomir Kaplonski" <skaplons at redhat.com> wrote:

    Hi,
    
    > Message written by David G. Bingham <dbingham at godaddy.com> on 01.02.2019, at 19:16:
    > 
    > Neutron land,
    > 
    > Problem:
    > Neutron currently only allows a single network segment per host. This
    > becomes a problem when networking teams want to limit the number of IPs
    > supported on a segment. This means that at times the number of IPs available to
    > the host is the limiting factor for the number of instances we can deploy on a
    > host. Ref: https://bugs.launchpad.net/neutron/+bug/1764738
    > 
    > Ongoing Work:
    > We are excited about our work to add "multi-segment support for routed networks".
    > We currently have a proof of concept here https://review.openstack.org/#/c/623115
    > that for routed networks effectively:
    > * Removes validation preventing multiple segments.
    > * Injects segment_id into fixed IP records.
    > * Uses the segment_id when creating a bridge (rather than network_id).
    > In effect, it gives each segment its own bridge.
    > 
    > It works pretty well for new networks and deployments. For existing
    > routed networks, however, it breaks networking. Please use *caution* if you
    > decide to try it.
    > 
    > TODOs:
    > Things to do before this is fully baked:
    > * Need to add code ensuring bridges are also updated/deleted using
    >  the segment_id (rather than network_id).
    > * Need to add something (a feature flag?) that prevents this from breaking
    >  routed networks when a cloud admin updates to master and is configured for
    >  routed networks.
    > * Need to create a checker and upgrade migration code that will convert existing
    >  bridges from network_id based to segment_id based (ideally live or with
    >  little network traffic downtime). Once converted, the feature flag could
    >  enable the feature and start using the new code.
    > 
    > Need:
    > 1. How does one go about adding a migration tool? Maybe some examples?
    
    I’m not sure whether this is similar, but I know the networking-ovn project has a migration tool for moving from ml2/ovs to ml2/ovn. Maybe it can be helpful for you.
    
    > 2. Will nova need to be notified/upgraded to have bridge related files updated?
    
    Probably someone from the Nova team should answer that. Maybe Sean Mooney would be a good person to ask?
    
    > 3. Is there a way to migrate without (or with minimal) downtime?
    > 4. How to repeatably test this migration code? Grenade?
    
    Again, check how networking-ovn did it; maybe you will be able to do something similar :)
    
    > 
    > Looking for any ideas that can keep this moving :)
    > 
    > Thanks a ton,
    > 
    > David Bingham (wwriverrat on irc)
    > Kris Lindgren (klindgren on irc)
    > Cloud Engineers at GoDaddy
    > 
    
    — 
    Slawek Kaplonski
    Senior software engineer
    Red Hat
    
    


