[Openstack-operators] Intel 10 GbE / bonding issue with hash policy layer3+4

Stig Telfer stig.openstack at telfer.org
Thu Mar 24 10:10:07 UTC 2016


Hi Sascha -

I had a similar experience earlier this week.  I was testing bond performance on a dual-port Mellanox NIC, LACP-bonded with a layer 3+4 transmit hash.  On the first run of iperf (8 streams) I saw a reasonable distribution across the links; shortly after that, performance dropped to a level that suggested no distribution at all.  I didn’t look into it any further at the time.

This was on CoreOS beta (kernel 4.3-6 IIRC).

If our experiences have a common root cause, the fact that I have a different distro and NIC from yours might help eliminate those components.  I should have the kit back in this configuration within a week; I’ll probe further and report back.
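For reference, here is a rough sketch, in Python and purely as an illustration, of what the classic layer3+4 transmit hash in the Linux bonding driver computes (newer kernels derive the same fields via the flow dissector and jhash, but the idea is unchanged): flows between the same two hosts can land on different slaves as long as their TCP/UDP port pairs differ.

```python
import ipaddress

def l34_slave(src_ip, dst_ip, src_port, dst_port, n_slaves=2):
    """Sketch of the classic bonding layer3+4 transmit hash
    (roughly what the pre-3.11 bond_xmit_hash_policy_l34 did);
    newer kernels feed the same fields through the flow
    dissector and jhash instead."""
    # XOR of the two IPs, truncated to 16 bits, mixed with the
    # XOR of the two L4 ports, then reduced modulo the slave count.
    ip_bits = (int(ipaddress.ip_address(src_ip)) ^
               int(ipaddress.ip_address(dst_ip))) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_bits) % n_slaves

# Two iperf streams between the same pair of hosts, differing
# only in source port, can be hashed onto different slaves:
a = l34_slave("10.0.0.1", "10.0.0.2", 40000, 5001)
b = l34_slave("10.0.0.1", "10.0.0.2", 40001, 5001)
```

Note that with only two slaves the hash is taken modulo 2, so a small number of parallel streams can occasionally all land on the same link by chance; a single iperf -P run that stays under 10 Gb is not by itself proof that the policy is broken.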

Best wishes,
Stig


> On 23 Mar 2016, at 15:45, MailingLists - EWS <mailinglists at expresswebsystems.com> wrote:
> 
> Sascha,
> 
> What version of the ixgbe driver are you using? Is it the same on both
> kernels? Have you tried the latest "out of tree driver" from E1000 to see if
> the issue goes away?
> 
> I follow the E1000 mailing list and seem to recall some fairly recent posts
> about bonding and ixgbe, along with some patches being applied to the
> driver; however, I don't know which kernel versions those issues were seen
> on, or even whether the patches were accepted.
> 
> https://sourceforge.net/p/e1000/mailman/e1000-devel/thread/87618083B2453E4A8714035B62D67992504FB5FF%40FMSMSX105.amr.corp.intel.com/#msg34727125
> 
> Something about a timing issue with detecting the slave's link speed and
> passing that information to the bonding driver in a timely fashion.
> 
> Tom Walsh
> ExpressHosting
> https://expresshosting.net/
> 
>> -----Original Message-----
>> From: Sascha Vogt [mailto:sascha.vogt at gmail.com]
>> Sent: Wednesday, March 23, 2016 5:54 AM
>> To: openstack-operators
>> Subject: [Openstack-operators] Intel 10 GbE / bonding issue with hash policy layer3+4
>> 
>> Hi all,
>> 
>> I would like to share an oddity we experienced with Intel 10 GbE NICs and
>> LACP bonding, and to hear feedback from the operators community on it.
>> 
>> We run Ubuntu 14.04.4 with Intel 10 GbE NICs driven by the ixgbe kernel
>> module. We use VLANs for the ceph-client, ceph-data, openstack-data and
>> openstack-client networks, all on a single LACP bond of two 10 GbE ports.
>> 
>> As the bonding hash policy we chose layer3+4, so that we can use the full
>> 20 Gb even when only two servers communicate with each other. We typically
>> verify that by running iperf to a single server with -P 4 and checking
>> whether we exceed the 10 Gb limit of a single link (just a few runs as a
>> spot check).
>> 
>> Because Ubuntu installs the latest kernel by default, our new host got
>> kernel 4.2.0 instead of the 3.16 the other machines have, and we noticed
>> that iperf only reached 10 Gb.
>> 
>>> # cat /proc/net/bonding/bond0
>>> Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
>>> 
>>> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
>>> Transmit Hash Policy: layer3+4 (1)
>> 
>> Both kernel 3.16 and 4.2.0 showed this same output.
>> 
>> After downgrading to kernel 3.16 we got the iperf results we expected.
>> 
>> Does anyone have a similar setup? Has anyone noticed the same thing? To us
>> this looks like a bug in the kernel (the ixgbe module?), or are we
>> misunderstanding the layer3+4 hash policy?
>> 
>> Any feedback is welcome :) I have not posted this to the kernel or Ubuntu
>> mailing lists yet, so if no one here has a similar setup I'll move over
>> there. I just thought OpenStack ops would be the place where someone is
>> most likely to run a similar setup :)
>> 
>> Greetings
>> -Sascha-
>> 
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 



