[openstack-dev] Neutron and MTU advertisements -- post newton

Matt Kassawara mkassawara at gmail.com
Mon Jul 11 18:36:48 UTC 2016


All MTU changes must occur on a layer-3 device, typically a router. If a
router receives a packet larger than the MTU of the next-hop interface, it
can either fragment it or use path MTU discovery (PMTUD) to inform the
sender of the next-hop interface MTU value [1]. Fragmentation causes
performance problems and all modern layer-3 devices support PMTUD. Thus,
unless explicitly disabled (or broken by blocking ICMP), PMTUD provides an
optimal solution that enables the sender to retransmit the initial packet
and any future packets without fragmentation.

[1] https://en.wikipedia.org/wiki/IP_fragmentation
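
To make the router behaviour above concrete, here is a minimal sketch in
Python. It is only an illustration, not how any real router is implemented.
The names Packet and forward_decision are hypothetical. The sketch assumes
IPv4, where the sender opts in to PMTUD by setting the Don't Fragment (DF)
flag and the router answers with an ICMP "fragmentation needed" message
(type 3, code 4) that carries the next-hop MTU.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical stand-in for an IPv4 packet; only what the sketch needs."""
    size: int            # total IP packet length in bytes
    dont_fragment: bool  # IPv4 DF flag, set by senders that use PMTUD


def forward_decision(packet: Packet, next_hop_mtu: int):
    """Return (action, advertised_mtu) for a packet arriving at a router.

    Assumption: IPv4 semantics. The action names are made up for this sketch.
    """
    if packet.size <= next_hop_mtu:
        # The packet fits, so no MTU handling is needed.
        return ("forward", None)
    if packet.dont_fragment:
        # PMTUD: drop the packet and tell the sender the next-hop MTU via
        # ICMP type 3 code 4 so it can retransmit with smaller packets.
        return ("icmp_fragmentation_needed", next_hop_mtu)
    # DF is clear: the router may fragment, at a performance cost.
    return ("fragment", None)


if __name__ == "__main__":
    print(forward_decision(Packet(size=1500, dont_fragment=True), 1450))
```

If the ICMP message is filtered somewhere on the return path, the sender
never learns the smaller MTU, which is the "broken by blocking ICMP" case.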

On Mon, Jul 11, 2016 at 11:18 AM, Sam Yaple <samuel at yaple.net> wrote:

> On Mon, Jul 11, 2016 at 4:39 PM, Jay Pipes <jaypipes at gmail.com> wrote:
>
>> On 07/11/2016 07:45 AM, Sam  Yaple wrote:
>>
>>> Hello,
>>>
>>> There was a lot of work to get MTU advertisement working well in Mitaka.
>>> With the work that was done we can finally have 1500 mtu networks for
>>> tunneled networks if the underlying infrastructure supports jumbo frames.
>>>
>>> It's fantastic for people who have 1500 mtu networks and want to use
>>> vxlan: no more hacks to get the instance to use 1450 mtu. It's fantastic
>>> for people who want to use 1500+ networks and get the instances set up
>>> with 9000 mtu interfaces. It is not good for people who want a consistent
>>> mtu no matter the network type. But that's fine, since mtu advertisement
>>> _could_ be disabled. It's a fantastic default to have it turned on.
>>>
>>> With a recent patchset [1] the ability to turn off MTU advertisements
>>> was deprecated in Newton. In the review it was stated there is no valid
>>> use case for it. I disagree.
>>>
>>> The scenario is that the infrastructure has jumbo frames enabled, but I
>>> do not want the instances to use jumbo frames. I want them to use the
>>> default 1500 mtu that the rest of the world operates on. This would
>>> still set up all of the virtual switching infrastructure with the
>>> correct MTUs, but not try to adjust the instances' MTUs. In this way the
>>> instances only communicate at 1500 mtu, but never have to
>>> fragment/drop inside of the SDN when communicating with other networks,
>>> even if it is a VXLAN or other tunneled network.
>>>
>>> Without the option to disable mtu advertisement, flat/vlan and gre/vxlan
>>> networks inside the same environment will always have different mtus,
>>> even if the backend supports jumbo frames.
>>>
>>> My ask is we keep the advertise_mtu option, and keep it enabled by
>>> default. This would allow for the default, common 1500 mtu across
>>> networks of different types.
>>>
>>> This scenario would be very similar to having a computer with 1500 mtu
>>> attached to a switch which supports jumbo frames. Just because the
>>> switch will accept and process a 9000 mtu frame doesn't mean the
>>> computer has to send a 9000 mtu frame. A very common scenario in the
>>> real world.
>>>
>>> [1] https://review.openstack.org/#/c/310448/
>>>
>>
>> Hi Sam,
>>
>> Out of curiosity, in what scenarios is it better to limit the instance's
>> MTU to a value lower than that of the maximum path MTU of the
>> infrastructure? In other words, if the infrastructure supports jumbo
>> frames, why artificially limit an instance's MTU to 1500 if that instance
>> could theoretically communicate with other instances in the same
>> infrastructure via a higher MTU?
>>
> Hey Jay,
>
> A not-so-uncommon way to set up networks in neutron involves the use of 1:1
> NATs. You have a firewall device that holds real, valid public addresses
> that map to private addresses (RFC-1918). So to OpenStack the network
> appears as a private network, but some of those addresses map to public
> addresses outside of OpenStack's sphere of knowledge. This works really,
> really well when you have multiple separate ranges of public ip addresses
> and separate gateways for each and you want to use them without creating
> multiple subnets on an external network with OpenStack. This has been
> written about in blog posts [1] and used in enterprise environments (it is
> what Rackspace does for their private cloud [2]).
>
> In this situation, since you are mapping real IPs and the real world runs
> on 1500 mtu, you want to make sure your MTUs match in ways that cannot be
> auto-discovered. A good way to do this is to just use the default 1500 mtu
> for every instance and ensure that it never fragments (which means at
> least 1550 mtu networks for vxlan). So in this case you would have set up
> your network in such a way that a 1500 mtu frame from the internet can
> arrive at your instance without ever being fragmented, and outgoing traffic
> isn't trying to send >1500 mtu packets into the real internet.
>
> Additionally, there may be other services using the interface (it is not
> dedicated to just neutron traffic) such as ceph which loves high MTUs. I
> mention this as a secondary point because neutron doesn't affect this at
> all, but it is related to your question.
>
> [1] http://dachary.org/?p=2466
> [2] https://developer.rackspace.com/blog/neutron-networking-l3-agent/
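
The 1450 and 1550 figures quoted above follow from simple arithmetic. A
minimal sketch, assuming VXLAN over IPv4 with 50 bytes of encapsulation
overhead (outer IPv4 20 + UDP 8 + VXLAN 8 + inner Ethernet 14). The
constant and function names are hypothetical and are not neutron code.

```python
# Assumption: VXLAN over an IPv4 underlay with no extra options or tags.
# 20 (outer IPv4) + 8 (UDP) + 8 (VXLAN) + 14 (inner Ethernet header) = 50.
VXLAN_IPV4_OVERHEAD = 20 + 8 + 8 + 14


def instance_mtu(underlay_mtu: int, overhead: int = VXLAN_IPV4_OVERHEAD) -> int:
    """Largest instance MTU that fits in the underlay without fragmentation."""
    return underlay_mtu - overhead


def required_underlay_mtu(wanted_instance_mtu: int,
                          overhead: int = VXLAN_IPV4_OVERHEAD) -> int:
    """Smallest underlay MTU that carries the wanted instance MTU intact."""
    return wanted_instance_mtu + overhead


if __name__ == "__main__":
    # A 1500 mtu underlay leaves 1450 for instances on a vxlan network.
    print(instance_mtu(1500))
    # Keeping instances at 1500 needs at least a 1550 mtu underlay.
    print(required_underlay_mtu(1500))
```

With a jumbo-frame underlay of 9000 the same subtraction gives the larger
instance MTU that advertisement would otherwise hand out, which is the
value the 1:1 NAT setup wants to avoid in favour of a fixed 1500.
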
>
>> Sorry if my question is poorly worded. I'm no networking expert and am
>> genuinely curious here. :)
>>
>> Best,
>> -jay
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>