Sean M. Collins <sean at coreitpro.com> wrote:

> Ihar Hrachyshka wrote:
>> UPD: seems like enforcing instance mtu to 1400 indeed makes us pass
>> forward into tempest:
>>
>> http://logs.openstack.org/59/265759/3/experimental/gate-grenade-dsvm-neutron-multinode/a167a59/console.html
>>
>> And there are only three failures there:
>>
>> http://logs.openstack.org/59/265759/3/experimental/gate-grenade-dsvm-neutron-multinode/a167a59/console.html#_2016-01-11_11_58_47_945
>>
>> I also don’t see any RPC versioning related traces in service logs,
>> which is a good sign.
>
> Just an update - we are still stuck on those three tempest tests.
>
> I was able to dig a bit and it looks like it's still an MTU issue.
>
> http://logs.openstack.org/35/187235/14/experimental/gate-grenade-dsvm-neutron-multinode/c5eda62/logs/tempest.txt.gz#_2016-02-09_20_37_40_044
>
> "SSHException: Error reading SSH protocol banner[Errno 104] Connection
> reset by peer"

Note that this time we get reset immediately instead of being stuck there
until timeout.

> I tried pushing down a patch to cram network_device_mtu down to 1450 in
> the hopes that it would do the trick - but sadly that didn't fix. I’m

Actually, we already have 1450 for network_device_mtu for the job since:

https://review.openstack.org/#/c/267847/4/devstack-vm-gate.sh

Also, I added an interface state dump to worlddump, and here is what the
main node networking setup looks like:

http://logs.openstack.org/59/265759/20/experimental/gate-grenade-dsvm-neutron-multinode/d64a6e6/logs/worlddump-2016-01-30-164508.txt.gz

br-ex: mtu = 1450
inside router: qg mtu = 1450, qr = 1450

So we should be fine in this regard.

I also set up devstack locally with network_device_mtu enforced, and it
seems to pass packets of size 1450 through. So it’s probably something in
tunneling packets to the subnode that fails for us, not the local
router-to-tap bits.

I also see br-tun with mtu = 1500. Is that a problem? Probably not, but I
admit I am still missing a lot on this topic.

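For reference, a minimal sketch of the arithmetic behind the tunneling
suspicion. It assumes the tenant traffic to the subnode goes over VXLAN on
IPv4, which adds 50 bytes of encapsulation; the thread does not confirm the
tunnel type, and the function and variable names here are hypothetical. The
MTU values are the ones from the worlddump output.

```python
# Assumption: VXLAN over IPv4 encapsulation overhead in bytes
# (inner Ethernet 14 + VXLAN 8 + UDP 8 + outer IPv4 20).
VXLAN_IPV4_OVERHEAD = 50


def max_instance_mtu(underlay_mtu, overhead=VXLAN_IPV4_OVERHEAD):
    """Largest instance MTU that still fits once the tunnel header is added."""
    return underlay_mtu - overhead


def fits(instance_mtu, underlay_mtu, overhead=VXLAN_IPV4_OVERHEAD):
    """True if an encapsulated full-size instance packet fits the underlay."""
    return instance_mtu + overhead <= underlay_mtu


# MTU values reported in the worlddump output for the main node.
observed = {"br-ex": 1450, "qg": 1450, "qr": 1450, "br-tun": 1500}

# The smallest MTU among the listed devices bounds what they pass unfragmented.
local_path_mtu = min(observed.values())

if __name__ == "__main__":
    print("local path mtu:", local_path_mtu)
    # Hypothetical underlay sizes between the main node and the subnode.
    for underlay in (1500, 1450):
        print(underlay, "->", max_instance_mtu(underlay),
              "fits 1450:", fits(1450, underlay))
```

Under these assumptions, a 1450-byte instance packet only survives the tunnel
if the link between the nodes really carries 1500 bytes; if that link is
itself limited to 1450, the encapsulated packet no longer fits.
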
Also, I see a qg-2c68fb65-21 device in the global namespace in the
worlddump output above. The device has mtu = 1500. Which router does the
device belong to?

> going to have to keep digging. I am almost certain it's something that
> Matt K (Sam-I-Am) has already made note of in his research.

Actually, I don’t think Matt ran any tests with an MTU that is reduced
compared to the ‘standard’ 1500 size. It would be interesting to see how it
goes in his lab with the limited MTU size we use in the gate.

Ihar