[Openstack-operators] How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young davidy at funkypenguin.co.nz
Tue Dec 12 09:20:47 UTC 2017


Hey Jean-Philippe,

No, after I disastrously split-brained/partitioned my rabbitmq and 
galera clusters by allowing LXC to start the containers without the 
dnsmasq process available to address their eth0 interfaces (due to what 
_may_ be a template/Xenial bug), I've spent the last few days cleaning 
up the mess :)
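
(A rough sanity check for that failure mode, assuming the stock lxc-net 
setup where a dnsmasq instance hands out the 10.0.3.x addresses on 
lxcbr0, is simply to confirm both are up before letting the containers 
start again:

   systemctl status lxc-net
   pgrep -af dnsmasq

The unit name is an assumption from the standard lxc packaging rather 
than something I've re-checked on these hosts.)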

I have two unused hosts set aside as a test environment for 
pre-testing, and I'll be leveraging these in the next few days to 
reproduce the issue on a fresh Xenial install.

I'll update you (and the list) once I've positively confirmed the issue.

Cheers!
D




On 12/12/2017 21:52, Jean-Philippe Evrard wrote:
> Hello David,
>
> Did you solve your issue?
> Did you check that it depends on the default container interface's mtu itself?
>
> Best regards,
> JP
>
>
> On 6 December 2017 at 18:45, David Young <davidy at funkypenguin.co.nz> wrote:
>> So..
>>
>> On 07/12/2017 03:12, Jean-Philippe Evrard wrote:
>>
>> For the mtu, it would be impactful to do it on a live environment. I
>> expect that if you change the container configuration, it would
>> restart.
>>
>> It’s a busy lab environment, but given that it’s fully HA (2 controllers), I
>> didn’t anticipate a significant problem with changing container
>> configuration one-at-a-time.
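>>
>> (Concretely, the change is just the MTU line in the shared include that the
>> container configs pull in, applied by restarting one container at a time.
>> A rough sketch -- the container name below is only a placeholder:
>>
>>    # /etc/lxc/lxc-openstack.conf
>>    lxc.network.mtu = 1550
>>
>>    # then, per controller, one container at a time:
>>    lxc-ls -f
>>    lxc-stop -n <container> && lxc-start -d -n <container>
>> )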
>>
>> However, the change has had an unexpected side effect - one of the
>> controllers (I haven’t rebooted the other one yet) seems to have lost the
>> ability to bring up lxcbr0, and so while it can start all its containers,
>> none of them have any management connectivity on eth0, which of course
>> breaks all sorts of things.
>>
>> I.e.
>>
>> root at nbs-dh-10:~# systemctl status networking.service
>> ● networking.service - Raise network interfaces
>>     Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor
>> preset: enabled)
>>    Drop-In: /run/systemd/generator/networking.service.d
>>             └─50-insserv.conf-$network.conf
>>     Active: failed (Result: exit-code) since Thu 2017-12-07 06:37:00 NZDT;
>> 14min ago
>>       Docs: man:interfaces(5)
>>    Process: 2717 ExecStart=/sbin/ifup -a --read-environment (code=exited,
>> status=1/FAILURE)
>>    Process: 2656 ExecStartPre=/bin/sh -c [ "$CONFIGURE_INTERFACES" != "no" ]
>> && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm
>> settle (code=e
>>   Main PID: 2717 (code=exited, status=1/FAILURE)
>>
>> Dec 07 06:36:58 nbs-dh-10 systemd[1]: Starting Raise network interfaces...
>> Dec 07 06:36:58 nbs-dh-10 ifup[2717]: RTNETLINK answers: Invalid argument
>> Dec 07 06:36:58 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on
>> /run/network/ifstate.enp4s0
>> Dec 07 06:36:58 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on
>> /run/network/ifstate.br-mgmt
>> Dec 07 06:37:00 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on
>> /run/network/ifstate.br-vlan
>> Dec 07 06:37:00 nbs-dh-10 ifup[2717]: Failed to bring up lxcbr0.
>> Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Main process
>> exited, code=exited, status=1/FAILURE
>> Dec 07 06:37:00 nbs-dh-10 systemd[1]: Failed to start Raise network
>> interfaces.
>> Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Unit entered
>> failed state.
>> Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Failed with result
>> 'exit-code'.
>> root at nbs-dh-10:~#
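>>
>> (For what it's worth, re-running the failing interface by hand with verbose
>> output should show exactly which command is producing the RTNETLINK
>> "Invalid argument":
>>
>>    ifup -v lxcbr0
>>
>> An obvious suspect would be something like "ip link set lxcbr0 mtu 1550"
>> being rejected because a bridge can't be given an MTU higher than its
>> lowest member port -- but that's a guess, not something I've confirmed.)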
>>
>> I’ve manually reversed the “lxc.network.mtu = 1550” entry in
>> /etc/lxc/lxc-openstack.conf, but this doesn’t seem to have made a
>> difference.
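>>
>> (Since lxc-openstack.conf is only read when a container starts, reverting it
>> shouldn't change anything already running, nor the host side of lxcbr0. A
>> quick check of what a container actually ended up with -- the container name
>> is again just a placeholder:
>>
>>    lxc-attach -n <container> -- ip link show eth0
>>    ip link show lxcbr0
>> )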
>>
>> What’s also odd is that lxcbr0 appears to be perfectly normal:
>>
>> root at nbs-dh-10:~# brctl show lxcbr0
>> bridge name    bridge id        STP enabled    interfaces
>> lxcbr0        8000.fe0a7fa28303    no        04063403_eth0
>>                              075266dc_eth0
>>                              160c9b30_eth0
>>                              38ac19ae_eth0
>>                              4f57300f_eth0
>>                              59b2b5a5_eth0
>>                              5b7bbeb4_eth0
>>                              64a1fcdd_eth0
>>                              6c99f5fe_eth0
>>                              6f93ebb2_eth0
>>                              70ce61e5_eth0
>>                              745ba80d_eth0
>>                              85df2fa5_eth0
>>                              99e6adf8_eth0
>>                              cbdfa2f3_eth0
>>                              e15dc279_eth0
>>                              ea67ce7e_eth0
>>                              ed5c7af9_eth0
>> root at nbs-dh-10:~#
>>
>> … But, no matter the value of lxc.network.mtu, it doesn’t change from 1500
>> (I suppose this could actually have reduced itself based on the lower MTUs
>> of the member interfaces though):
>>
>> root at nbs-dh-10:~# ifconfig lxcbr0
>> lxcbr0    Link encap:Ethernet  HWaddr fe:0c:5d:1c:36:da
>>            inet addr:10.0.3.1  Bcast:10.0.3.255  Mask:255.255.255.0
>>            inet6 addr: fe80::f4b0:bff:fec3:63b0/64 Scope:Link
>>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>            RX packets:499 errors:0 dropped:0 overruns:0 frame:0
>>            TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
>>            collisions:0 txqueuelen:1000
>>            RX bytes:128882 (128.8 KB)  TX bytes:828 (828.0 B)
>>
>> root at nbs-dh-10:~#
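>>
>> (As far as I know a Linux bridge won't accept an MTU above its lowest member
>> port, so lxcbr0 won't report 1550 until every <id>_eth0 veth on it has been
>> raised as well -- which only happens once the containers are restarted with
>> the new lxc.network.mtu. A rough way to eyeball the members:
>>
>>    for p in /sys/class/net/lxcbr0/brif/*; do
>>        n=$(basename "$p")
>>        echo "$n: $(ip -o link show "$n" | grep -o 'mtu [0-9]*')"
>>    done
>> )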
>>
>> Any debugging suggestions?
>>
>> Thanks,
>> D
