[openstack-dev] [Neutron] MTU configuration pain
Ian Wells
ijw.ubuntu at cack.org.uk
Mon Jan 25 06:26:51 UTC 2016
On 24 January 2016 at 22:12, Kevin Benton <blak111 at gmail.com> wrote:
> >The reason for that was in the other half of the thread - it's not
> possible to magically discover these things from within Openstack's own
> code because the relevant settings span more than just one server
>
> IMO it's better to have an explicit default of 1500 than to let VMs
> fall back to 1500 on their own, because at least we will deduct the encap
> header length when necessary in the dhcp/ra advertised value, so overlays
> work on standard 1500 MTU networks.
>
> In other words, our current empty default is realistically a terrible
> default of 1500 that doesn't account for network segmentation overhead.
>
It's pretty clear that, while the current setup is precisely the old
behaviour (backward compatibility, y'know?), it's not very useful. The
problem is that anyone relying on the 1550+ MTU hacks and other workarounds
of today will find their system's behaviour changes if we start setting
that specific default. Regardless, we need to update that documentation:
it was a nasty hack back in the day and it's not remotely a good idea now.
> On Jan 24, 2016 23:00, "Ian Wells" <ijw.ubuntu at cack.org.uk> wrote:
>
>> On 24 January 2016 at 20:18, Kevin Benton <blak111 at gmail.com> wrote:
>>
>>> I believe the issue is that the default is unspecified, which leads to
>>> nothing being advertised to VMs via dhcp/ra. So VMs end up using 1500,
>>> which leads to a catastrophe when running on an overlay on a 1500 underlay.
>>>
>> That's not quite the point I was making here, but to answer it: it looks
>> to me like, for the LB or OVS drivers to set the network MTU appropriately
>> for a virtual network (at which point it will be advertised, because
>> advertise_mtu defaults to True in the code), you *must* set one or more of
>> path_mtu (for L3 overlays), segment_mtu (for L2 overlays) or physnet_mtu
>> (for L2 overlays with differing MTUs on different physical networks).
>> That's a statement of faith - I suspect if we try it we'll find a few
>> niggling problems - but I can find the code, at least.
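>>
>> To make that concrete, here's a rough sketch of what I mean. The exact
>> option names and config sections have moved around between releases (and
>> I'm quoting the per-physnet one from memory), so check your release's
>> config reference rather than pasting this in:
>>
>>   # neutron.conf
>>   [DEFAULT]
>>   advertise_mtu = True   # already the code default, shown for completeness
>>   segment_mtu = 9000     # MTU of the underlying L2 segments
>>
>>   # ml2_conf.ini
>>   [ml2]
>>   path_mtu = 9000                         # MTU of the L3 path carrying overlay traffic
>>   physical_network_mtus = physnet1:9000   # per-physnet values, if your physnets differ
>>
>> With those set, the plugin should work out e.g. 8950 for a VXLAN network,
>> and that's the value that should get handed to the VMs.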
>>
>> The reason for that was in the other half of the thread - it's not
>> possible to magically discover these things from within Openstack's own
>> code because the relevant settings span more than just one server. They
>> have to line up with both your MTU settings for the interfaces in use, and
>> the MTU settings for the other equipment within and neighbouring the cloud
>> - switches, routers, nexthops. So they have to be provided by the operator
>> - then everything you want should kick in.
>>
>> If all of that is true, it really is just a documentation problem - we
>> have the idea in place, we're just not telling people how to make use of
>> it. We can also include a checklist or a check script with that
>> documentation - you might not be able to deduce the MTU values, but you can
>> certainly run some checks to see if the values you have been given are
>> obviously wrong.
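>>
>> As a strawman for such a check script - nothing official, just the sort
>> of sanity test I have in mind, with the interface name and peer address
>> as placeholders you'd substitute for your own deployment:
>>
>>   #!/bin/sh
>>   # Does the underlay actually carry packets as big as the MTU we've configured?
>>   UNDERLAY_IF=eth1   # placeholder: the NIC carrying tenant/tunnel traffic
>>   PEER=192.0.2.1     # placeholder: another host on the same underlay
>>   MTU=$(ip -o link show "$UNDERLAY_IF" | sed -n 's/.* mtu \([0-9]*\).*/\1/p')
>>   echo "$UNDERLAY_IF has MTU $MTU"
>>   # Don't-fragment ping; 28 = IPv4 header (20) + ICMP header (8)
>>   if ping -c 3 -M do -s $((MTU - 28)) "$PEER" > /dev/null; then
>>       echo "path to $PEER passes $MTU-byte packets"
>>   else
>>       echo "path to $PEER drops $MTU-byte packets - check switch/router MTUs"
>>   fi
>>
>> It can't tell you what the right value is, but it will catch the
>> obviously wrong ones.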
>>
>> In the meantime, Matt K, you said you hadn't set path_mtu in your tests,
>> but [1] says you have to ([1] is far from end-user consumable
>> documentation, which again illustrates our problem).
>>
>> Can you set both path_mtu and segment_mtu to whatever value your switch
>> MTU is (1500 or 9000), confirm your outbound interface MTU is the same
>> (1500 or 9000), and see if that changes things? At this point, you should
>> find that your networks get appropriate 1500/9000 MTUs on VLAN based
>> networks and 1450/8950 MTUs on VXLAN networks, that they're advertised to
>> your VMs via DHCP and RA, and that your routers even know that different
>> interfaces have different MTUs in a mixed environment, at least if
>> everything is working as intended.
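>>
>> To check whether that took effect - assuming the Liberty/Mitaka-era CLI,
>> with <net-id> as a placeholder:
>>
>>   neutron net-show <net-id> | grep mtu   # 1500/9000 on VLAN nets, 1450/8950 on VXLAN
>>   # and, inside a freshly booted VM, the same value should appear,
>>   # pushed via DHCP option 26 or the RA MTU option:
>>   ip link show eth0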
>> --
>> Ian.
>>
>> [1]
>> https://github.com/openstack/neutron/blob/544ff57bcac00720f54a75eb34916218cb248213/releasenotes/notes/advertise_mtu_by_default-d8b0b056a74517b8.yaml#L5
>>
>>
>>> On Jan 24, 2016 20:48, "Ian Wells" <ijw.ubuntu at cack.org.uk> wrote:
>>>
>>>> On 23 January 2016 at 11:27, Adam Lawson <alawson at aqorn.com> wrote:
>>>>
>>>>> For the sake of over-simplification, is there ever a reason to NOT
>>>>> enable jumbo frames in a cloud/SDN context where most of the traffic is
>>>>> between virtual elements that all support it? I understand that some
>>>>> switches do not support it, and traffic from the web doesn't support it
>>>>> either, but besides that, it seems like a default "jumboframes = 1"
>>>>> concept would work just fine to me.
>>>>>
>>>>
>>>> Offhand:
>>>>
>>>> 1. you don't want the latency increase that comes with 9000 byte
>>>> packets, even if it's tiny (bearing in mind that in a link shared between
>>>> tenants it affects everyone when one packet holds the line for 6 times
>>>> longer)
>>>> 2. not every switch in the world is going to (a) be configurable or (b)
>>>> pass 9000 byte packets
>>>> 3. not every VM has a configurable MTU that you can set on boot, or
>>>> supports jumbo frames, and someone somewhere will try and run one of those
>>>> VMs
>>>> 4. when you're using provider networks, not every device attached to
>>>> the cloud has a 9000 MTU (and this one's interesting, in fact, because it
>>>> points to the other element the MTU spec was addressing, that *not all
>>>> networks, even in Neutron, will have the same MTU*).
>>>> 5. similarly, if you have an external network in Openstack, and you're
>>>> using VXLAN, the MTU of the external network is almost certainly 50 bytes
>>>> bigger than that of the inside of the VXLAN overlays (the arithmetic is
>>>> spelt out below), so no one number can ever be right for every network in
>>>> Neutron.
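>>>>
>>>> (Spelling out that 50 bytes, for VXLAN over an IPv4 underlay: IPv4 20 +
>>>> UDP 8 + VXLAN header 8 + the encapsulated Ethernet header 14 = 50, so
>>>> 1500 - 50 = 1450 and 9000 - 50 = 8950; an IPv6 underlay costs another 20
>>>> on top of that.)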
>>>>
>>>> Also, I say 9000, but why is 9000 even the right number? We need a
>>>> number... and 'jumbo' is not a number. I know devices that will let you
>>>> transmit 9200 byte packets. Conversely, if the native L2 is 9000 bytes,
>>>> then the MTU in a Neutron virtual network is less than 9000 - so what MTU
>>>> do you want to offer your applications? If your apps don't care, why not
>>>> tell them what MTU they're getting (e.g. 1450) and be done with it?
>>>> (Memory says that the old problem with that was that github had problems
>>>> with PMTUD in that circumstance, but I don't know if that's still true, and
>>>> even if it is it's not technically our problem.)
>>>>
>>>> Per the spec, I would like to see us do the remaining fixes to make
>>>> that work as intended - largely 'tell the VMs what they're getting' - and
>>>> then, as others have said, lay out simple options for deployments, be they
>>>> jumbo frame or otherwise.
>>>>
>>>> If you're seeing MTU related problems at this point, can you file bugs
>>>> on them and/or report back the bugs here, so that we can see what we're
>>>> actually facing?
>>>> --
>>>> Ian.
>>>>
>>>>