[Openstack-operators] How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

Jean-Philippe Evrard jean-philippe at evrard.me
Wed Dec 6 14:12:51 UTC 2017


On 6 December 2017 at 09:09, David Young <davidy at funkypenguin.co.nz> wrote:
> An update to my reply below...
>
> I’ve realized that I need a per-network MTU defined in
> /etc/openstack_deploy/openstack_user_config.yml, so I’ve done the following:
>
> global_overrides:
> <snip>
>   provider_networks:
>     - network:
>         container_bridge: "br-mgmt"
>         <snip>
>         container_mtu: "1500"
>         <snip>
>     - network:
>         container_bridge: "br-vxlan"
>         container_mtu: "1550"
>         type: "vxlan"
>         <snip>
>     - network:
>         container_bridge: "br-vlan"
>         type: "flat"
>         net_name: "flat"
>         container_mtu: "1500"
>         <snip>
>     - network:
>         container_bridge: "br-vlan"
>         type: "vlan"
>         container_mtu: "1500"
>         <snip>
>     - network:
>         container_bridge: "br-storage"
>         type: "raw"
>         container_mtu: "9000"
>         group_binds:
>           - glance_api
>           - cinder_api
>           - cinder_volume
>           - nova_compute
>           - swift_proxy
>
> I think that gets me:
>
> VXLAN LXC interfaces will have an MTU of 1550 (necessary for “raw” 1500 from
> the instances)
> flat/vlan interfaces will have an MTU of 1500 (let’s be consistent)
> storage interfaces can have an MTU of 9000
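>
> As a sanity check once the playbooks have run, I'll verify the MTUs that
> actually landed on the host bridges and inside a container, along these
> lines (the container and interface names here are just examples from my
> own config, not anything OSA mandates):
>
> # on a controller host
> ip link show br-vxlan
> ip link show br-storage
>
> # inside the neutron agents container
> lxc-attach -n <neutron_agents_container_name> -- ip link show eth10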
>
> Then, I set the following in /etc/openstack_deploy/user_variables.yml:
>
> lxc_net_mtu: 1550
> lxc_container_default_mtu: 1550
>
> I don’t know whether this is redundant or not based on the above, but it
> seemed sensible.
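>
> If I'm reading the lxc_hosts role right, lxc_net_mtu ends up on the LXC
> bridge itself, so once that role has rerun, something like this should
> confirm it (I believe the default OSA lxc bridge is lxcbr0, but that may
> differ in your deploy):
>
> ip link show lxcbr0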
>
> I’m rerunning the setup-everything.yml playbook, but I’m still not sure
> whether the changes will apply to LXC containers that already exist. We’ll
> find out soon enough…
>
> Cheers,
> D
>
> On 06/12/2017 21:51, David Young wrote:
>
> Hello,
>
> Thanks for the reply, responses inline below:
>
> Hello,
>
> I haven't touched this for a while, but could you give us your user_*
> variable overrides?
>
> OK, here we go. Let me know if there’s a preferred way to send large data
> blocks - I considered a gist or a pastebin, but figured that having the
> content archived with the mailing list message would be the best result.
>
> I think the overrides are what you’re asking for? The only MTU-related
> override I have is “container_mtu” for the vxlan network below. I expect it
> doesn’t actually do anything, though, because I can’t find the string
> “container_mtu” within any of the related ansible roles (see the grep for
> container_mtu vs container_bridge below for illustration). I found
> https://bugs.launchpad.net/openstack-ansible/+bug/1678165, which looked
> related.
>
> root at nbs-dh-09:~# grep container_mtu /etc/ansible/ -ri
> root at nbs-dh-09:~# grep container_bridge /etc/ansible/ -ri
> /etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-mgmt"
> /etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-vxlan"
> /etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-vlan"
> /etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-vlan"
> /etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-storage"
> /etc/ansible/roles/plugins/library/provider_networks:        bind_device = net['network']['container_bridge']
> /etc/ansible/roles/os_neutron/doc/source/configure-network-services.rst:        container_bridge: "br-vlan"
> root at nbs-dh-09:~#
>
> global_overrides:
>   internal_lb_vip_address: 10.76.76.11
>   #
>   # The below domain name must resolve to an IP address
>   # in the CIDR specified in haproxy_keepalived_external_vip_cidr.
>   # If using different protocols (https/http) for the public/internal
>   # endpoints the two addresses must be different.
>   #
>   external_lb_vip_address: openstack.dev.safenz.net
>   tunnel_bridge: "br-vxlan"
>   management_bridge: "br-mgmt"
>   provider_networks:
>     - network:
>         container_bridge: "br-mgmt"
>         container_type: "veth"
>         container_interface: "eth1"
>         ip_from_q: "container"
>         type: "raw"
>         group_binds:
>           - all_containers
>           - hosts
>         is_container_address: true
>         is_ssh_address: true
>     - network:
>         container_bridge: "br-vxlan"
>         container_type: "veth"
>         container_interface: "eth10"
>         container_mtu: "9000"
>         ip_from_q: "tunnel"
>         type: "vxlan"
>         range: "1:1000"
>         net_name: "vxlan"
>         group_binds:
>           - neutron_linuxbridge_agent
>     - network:
>         container_bridge: "br-vlan"
>         container_type: "veth"
>         container_interface: "eth12"
>         host_bind_override: "eth12"
>         type: "flat"
>         net_name: "flat"
>         group_binds:
>           - neutron_linuxbridge_agent
>     - network:
>         container_bridge: "br-vlan"
>         container_type: "veth"
>         container_interface: "eth11"
>         type: "vlan"
>         range: "1:4094"
>         net_name: "vlan"
>         group_binds:
>           - neutron_linuxbridge_agent
>     - network:
>         container_bridge: "br-storage"
>         container_type: "veth"
>         container_interface: "eth2"
>         ip_from_q: "storage"
>         type: "raw"
>         group_binds:
>           - glance_api
>           - cinder_api
>           - cinder_volume
>           - nova_compute
>           - swift_proxy
>
> Here are a few things I watch for in MTU-related discussions:
> 1) ``lxc_net_mtu``: It is used in lxc_hosts to define the lxc bridge.
>
> Aha. I didn’t know about this; it sounds like what I need. I’ll add it and
> report back.
>
> 2) Your compute nodes and your controller nodes need to have
> consistent mtus on their bridges.
>
> They are both configured for an MTU of 9000, but the controller nodes’
> bridges drop their MTU to 1500 when the veth interface paired with the
> neutron-agent LXC container is added to the bridge (a bridge downgrades its
> MTU to that of its lowest-MTU member interface).
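>
> That’s easy to see with iproute2; the "master" filter lists the bridge’s
> member ports, and the lowest-MTU member is the one dragging the bridge
> down (bridge name from my config):
>
> ip link show br-vxlan
> ip link show master br-vxlan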
>
> 3) Neutron needs a configuration override.
>
> I’ve set this in neutron.conf on all neutron LXC containers, and on the
> compute nodes too:
> global_physnet_mtu = 1550
>
> And likewise in /etc/neutron/plugins/ml2/ml2_conf.ini:
>
> # Set a global MTU of 1550 (to allow VXLAN at 1500)
> path_mtu = 1550
>
> # Drop VLAN and FLAT providers back to 1500, to align with outside FWs
> physical_network_mtus = vlan:1500,flat:1500
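>
> Once neutron has been restarted with these values, newly created networks
> should pick up the right MTU; I’ll confirm with something like the below.
> (I’m less sure whether networks that already exist get their MTU updated
> retroactively, so I may end up recreating the tenant network to test.)
>
> openstack network show <network-name> -c mtu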
>
> 4) The LXC containers need to be properly defined: each network should
> have an MTU defined, or alternatively you can define a default MTU for
> all the networks defined in openstack_user_config with
> ``lxc_container_default_mtu``. (This is what spawns the veth pair to the
> LXC container.)
>
> I didn’t know about this one either; it didn’t exist in any of the default
> ansible-provided sample configs, but now that I’ve grepped the ansible
> roles for “mtu”, it’s obvious. I’ll try this too.
>
> root at nbs-dh-09:~# grep -ri lxc_container_default_mtu /etc/openstack_deploy/*
> root at nbs-dh-09:~# grep -ri lxc_container_default_mtu /etc/ansible/
> /etc/ansible/roles/lxc_container_create/defaults/main.yml:lxc_container_default_mtu: "1500"
> /etc/ansible/roles/lxc_container_create/templates/container-interface.ini.j2:lxc.network.mtu = {{ item.value.mtu|default(lxc_container_default_mtu) }}
> /etc/ansible/roles/lxc_container_create/templates/debian-interface.cfg.j2:    mtu {{ item.value.mtu|default(lxc_container_default_mtu) }}
> /etc/ansible/roles/lxc_container_create/templates/rhel-interface.j2:MTU={{ item.value.mtu|default(lxc_container_default_mtu) }}
> root at nbs-dh-09:~#
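>
> Based on those templates, once the containers are (re)configured I should
> be able to confirm the rendered value on the host side with something like
> this (the exact file layout under /var/lib/lxc may differ by release):
>
> grep -r "lxc.network.mtu" /var/lib/lxc/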
>
> 5) The container interfaces need to have this proper MTU. This uses the
> same configuration as 4) above, so it should work out of the box.
>
> Agreed, that seems to be the case currently with 1500; I’d expect it to
> hold true with the updated value.
>
> 6) If your instance is reaching its router with no MTU issue, you may
> still have issues with the northbound traffic. Check how you configured
> this northbound path and whether the interfaces have the proper MTU. If
> there are veth pairs creating pseudo links, check their MTUs too.
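>
> For example (the qrouter-<uuid> namespace name below is just a
> placeholder; with the L3 agent the router lives in its own network
> namespace on the network node):
>
> ip netns list
> ip netns exec qrouter-<uuid> ip link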
>
> I think it's a good start for the conversation...
>
> Thank you, this is very helpful. I’ll give it a try and respond.
>
> Re #1 and #4, do I need to destroy / recreate my existing LXC containers, or
> will rerunning the playbooks be enough to update the MTUs?
>
> Many thanks,
> David


Hello,

For the MTU, changing it on a live environment would be impactful: I
expect that if you change the container configuration, the container will
restart.
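
To apply it, rerunning the container create playbook against the affected
hosts or containers should be enough; something along these lines (the
playbook and group names may differ slightly depending on your release, so
please double check):

cd /opt/openstack-ansible/playbooks
openstack-ansible lxc-containers-create.yml --limit neutron_agents_container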

Could you please tell me whether this configuration was good enough for
your use case, or whether the docs need adapting?

If this still doesn't work, maybe you should file a bug with your new
openstack_user_config and the appropriate user_*.yml file. That would
follow our bug triage process, where more people can have a look at the
issue.

As usual, don't hesitate to come to our IRC channel, #openstack-ansible,
if you have further questions!

Thank you!

Best regards,
Jean-Philippe Evrard
@evrardjp


