[openstack-dev] [Neutron] MTU configuration pain

Kevin Benton blak111 at gmail.com
Mon Jan 25 06:12:09 UTC 2016


> The reason for that was in the other half of the thread - it's not
> possible to magically discover these things from within Openstack's own
> code because the relevant settings span more than just one server.

IMO it's better to have an explicit default of 1500 than to leave it unset
and let VMs fall back to 1500 on their own, because with an explicit value
we will at least deduct the encap header length when necessary in the
DHCP/RA advertised value, so overlays work on standard 1500 MTU networks.
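
To make the deduction concrete, here is a rough sketch of the arithmetic in
Python. This is not Neutron's code: the function names are made up for
illustration, and the 50-byte figure is the VXLAN-over-IPv4 overhead implied
by the 1450/8950 numbers quoted further down the thread.

```python
# Illustrative only: hypothetical helpers, not part of Neutron.
# Assumed overhead for VXLAN over IPv4:
# outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14) = 50 bytes.
VXLAN_IPV4_OVERHEAD = 50


def advertised_mtu(underlay_mtu, encap_overhead=0):
    """Return the MTU an instance should be told about via DHCP/RA."""
    if encap_overhead < 0:
        raise ValueError("encap_overhead must not be negative")
    if underlay_mtu <= encap_overhead:
        raise ValueError("underlay MTU is too small for this encapsulation")
    return underlay_mtu - encap_overhead


def fits_underlay(instance_mtu, underlay_mtu, encap_overhead):
    """True if a full-size instance packet still fits after encapsulation."""
    return instance_mtu + encap_overhead <= underlay_mtu


if __name__ == "__main__":
    for underlay in (1500, 9000):
        print(underlay, "->", advertised_mtu(underlay, VXLAN_IPV4_OVERHEAD))
    # A VM left at its own default of 1500 on a 1500 underlay:
    print(fits_underlay(1500, 1500, VXLAN_IPV4_OVERHEAD))
```

With nothing advertised, the VM stays at 1500 and the last check fails on a
1500 underlay; with the deducted value advertised, the same check passes.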

In other words, our current empty default is realistically a terrible
default of 1500 that doesn't account for network segmentation overhead.

On Jan 24, 2016 23:00, "Ian Wells" <ijw.ubuntu at cack.org.uk> wrote:

> On 24 January 2016 at 20:18, Kevin Benton <blak111 at gmail.com> wrote:
>
>> I believe the issue is that the default is unspecified, which leads to
>> nothing being advertised to VMs via dhcp/ra. So VMs end up using 1500,
>> which leads to a catastrophe when running on an overlay on a 1500 underlay.
>>
> That's not quite the point I was making here, but to answer that: it looks
> to me like you *must* set one or more of path_mtu (for L3 overlays),
> segment_mtu (for L2 overlays) or physnet_mtu (for L2 overlays with
> differing MTUs on different physical networks) for the LB or OVS drivers
> to set the network MTU for the virtual network appropriately.  Once that
> is set, it will be advertised, because advertise_mtu defaults to True in
> the code.
> That's a statement of faith - I suspect if we try it we'll find a few
> niggling problems - but I can find the code, at least.
>
> The reason for that was in the other half of the thread - it's not
> possible to magically discover these things from within Openstack's own
> code because the relevant settings span more than just one server.  They
> have to line up with both your MTU settings for the interfaces in use, and
> the MTU settings for the other equipment within and neighbouring the cloud
> - switches, routers, nexthops.  So they have to be provided by the operator
> - then everything you want should kick in.
>
> If all of that is true, it really is just a documentation problem - we
> have the idea in place, we're just not telling people how to make use of
> it.  We can also include a checklist or a check script with that
> documentation - you might not be able to deduce the MTU values, but you can
> certainly run some checks to see if the values you have been given are
> obviously wrong.
>
> In the meantime, Matt K, you said you hadn't set path_mtu in your tests,
> but [1] says you have to ([1] is far from end-user consumable
> documentation, which again illustrates our problem).
>
> Can you set both path_mtu and segment_mtu to whatever value your switch
> MTU is (1500 or 9000), confirm your outbound interface MTU is the same
> (1500 or 9000), and see if that changes things?  At this point, you should
> find that your networks get appropriate 1500/9000 MTUs on VLAN based
> networks and 1450/8950 MTUs on VXLAN networks, that they're advertised to
> your VMs via DHCP and RA, and that your routers even know that different
> interfaces have different MTUs in a mixed environment, at least if
> everything is working as intended.
> --
> Ian.
>
> [1]
> https://github.com/openstack/neutron/blob/544ff57bcac00720f54a75eb34916218cb248213/releasenotes/notes/advertise_mtu_by_default-d8b0b056a74517b8.yaml#L5
>
>
>> On Jan 24, 2016 20:48, "Ian Wells" <ijw.ubuntu at cack.org.uk> wrote:
>>
>>> On 23 January 2016 at 11:27, Adam Lawson <alawson at aqorn.com> wrote:
>>>
>>>> For the sake of over-simplification, is there ever a reason to NOT
>>>> enable jumbo frames in a cloud/SDN context where most of the traffic is
>>>> between virtual elements that all support it? I understand that some
>>>> switches do not support it and traffic from the web doesn't support it
>>>> either but besides that, seems like a default "jumboframes = 1" concept
>>>> would work just fine to me.
>>>>
>>>
>>> Offhand:
>>>
>>> 1. you don't want the latency increase that comes with 9000 byte
>>> packets, even if it's tiny (bearing in mind that in a link shared between
>>> tenants it affects everyone when one packet holds the line for 6 times
>>> longer)
>>> 2. not every switch in the world is going to (a) be configurable or (b)
>>> pass 9000 byte packets
>>> 3. not every VM has a configurable MTU that you can set on boot, or
>>> supports jumbo frames, and someone somewhere will try and run one of those
>>> VMs
>>> 4. when you're using provider networks, not every device attached to the
>>> cloud has a 9000 MTU (and this one's interesting, in fact, because it
>>> points to the other element the MTU spec was addressing, that *not all
>>> networks, even in Neutron, will have the same MTU*).
>>> 5. similarly, if you have an external network in Openstack, and you're
>>> using VXLAN, the MTU of the external network is almost certainly 50 bytes
>>> bigger than that of the inside of the VXLAN overlays, so no one number can
>>> ever be right for every network in Neutron.
>>>
>>> Also, I say 9000, but why is 9000 even the right number?  We need a
>>> number... and 'jumbo' is not a number.  I know devices that will let you
>>> transmit 9200 byte packets.  Conversely, if the native L2 is 9000 bytes,
>>> then the MTU in a Neutron virtual network is less than 9000 - so what MTU
>>> do you want to offer your applications?  If your apps don't care, why not
>>> tell them what MTU they're getting (e.g. 1450) and be done with it?
>>> (Memory says that the old problem with that was that github had problems
>>> with PMTUD in that circumstance, but I don't know if that's still true, and
>>> even if it is it's not technically our problem.)
>>>
>>> Per the spec, I would like to see us do the remaining fixes to make that
>>> work as intended - largely 'tell the VMs what they're getting' - and then,
>>> as others have said, lay out simple options for deployments, be they jumbo
>>> frame or otherwise.
>>>
>>> If you're seeing MTU related problems at this point, can you file bugs
>>> on them and/or report back the bugs here, so that we can see what we're
>>> actually facing?
>>> --
>>> Ian.
>>>
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev