[Openstack] Directional network performance issues with Neutron + OpenvSwitch

Darragh O'Reilly dara2002-openstack at yahoo.com
Fri Oct 25 16:11:38 UTC 2013



The uneven ssh performance is strange. Maybe learning on the tunnel mesh is not stabilizing. It is easy to mess it up by giving a wrong local_ip in the ovs-plugin config file. Check the tunnel ports on br-tun with 'ovs-vsctl show'. Is each one using the correct IPs? With N nodes in the mesh, br-tun should have N-1 gre-x ports - no more! Maybe you can put the 'ovs-vsctl show' output from the nodes on paste.openstack if there are not too many?
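
The check described above can be sketched in a few lines of Python. This is only a rough illustration, not a supported tool: it assumes the usual indented layout of 'ovs-vsctl show' output, with the "type:" line appearing before the "options:" line for each interface. The sample text, the IP addresses and the names gre_tunnels and check_mesh are all made up for the example.

```python
import re

# Hypothetical excerpt in the usual layout of `ovs-vsctl show` output.
# Port names and IP addresses are invented for illustration only.
SAMPLE = '''
    Bridge br-tun
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-1"
            Interface "gre-1"
                type: gre
                options: {in_key=flow, local_ip="10.0.0.2", out_key=flow, remote_ip="10.0.0.1"}
        Port "gre-3"
            Interface "gre-3"
                type: gre
                options: {in_key=flow, local_ip="10.0.0.2", out_key=flow, remote_ip="10.0.0.3"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
'''


def gre_tunnels(show_output, bridge="br-tun"):
    """Return (port, local_ip, remote_ip) for each GRE port on `bridge`.

    Assumes "type:" precedes "options:" for every interface.
    """
    tunnels = []
    current_bridge = None
    port = None
    is_gre = False
    for line in show_output.splitlines():
        text = line.strip()
        if text.startswith("Bridge "):
            current_bridge = text.split(None, 1)[1].strip('"')
        elif text.startswith("Port "):
            port = text.split(None, 1)[1].strip('"')
            is_gre = False  # reset until this port's type line is seen
        elif text.startswith("type:"):
            is_gre = text.split(":", 1)[1].strip() == "gre"
        elif text.startswith("options:") and is_gre and current_bridge == bridge:
            local = re.search(r'local_ip="([^"]+)"', text)
            remote = re.search(r'remote_ip="([^"]+)"', text)
            tunnels.append((port,
                            local.group(1) if local else None,
                            remote.group(1) if remote else None))
    return tunnels


def check_mesh(tunnels, local_ip, node_ips):
    """Return a list of problems; an empty list means the mesh looks consistent."""
    problems = []
    expected_peers = set(node_ips) - {local_ip}
    # A full mesh of N nodes needs exactly N-1 tunnel ports on each node.
    if len(tunnels) != len(expected_peers):
        problems.append("expected %d GRE ports, found %d"
                        % (len(expected_peers), len(tunnels)))
    for port, local, remote in tunnels:
        if local != local_ip:
            problems.append("%s uses local_ip %s, expected %s"
                            % (port, local, local_ip))
        if remote not in expected_peers:
            problems.append("%s points at unexpected remote_ip %s"
                            % (port, remote))
    return problems


if __name__ == "__main__":
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical tunnel endpoints
    for problem in check_mesh(gre_tunnels(SAMPLE), "10.0.0.2", nodes):
        print(problem)
```

On a real node you would feed it the actual 'ovs-vsctl show' text in place of SAMPLE, together with the local_ip values you expect for each node.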

Re, Darragh.




On Friday, 25 October 2013, 16:20, Martinx - ジェームズ <thiagocmartinsc at gmail.com> wrote:
 
I think I can say... "YAY!!"    :-D
>
>
>With "LibvirtOpenVswitchDriver" my internal throughput has doubled! It goes from ~200Mbit/s (with LibvirtHybridOVSBridgeDriver) to 400Mbit/s (with LibvirtOpenVswitchDriver)! Still far from 1Gbit/s (my physical path limit) but more acceptable now.
>
>
>The command "ethtool -K eth1 gro off" still makes no difference.
>
>
>So, there is only one remaining problem: when traffic passes through L3 / Namespace, it is still useless. Even the SSH connection into my Instances, via their Floating IPs, is slow as hell; sometimes it just stops responding for a few seconds and then comes back online out of nothing...
>
>
>I just detected a weird behavior: when I run "apt-get update" from instance-1, it is slow, as I said, and the ssh connection where I'm running it stops responding right after I start it. All my other ssh connections stop working too, for a few seconds! This means that when I run "apt-get update" from within instance-1, the SSH session of instance-2 is affected too!! There is something pretty bad going on at L3 / Namespace.
>
>
>BTW, do you think that ~400Mbit/s of intra-VM communication (GRE tunnel) on top of 1Gbit Ethernet is acceptable?! It is still less than half...
>
>
>Thank you!
>Thiago
>
>
>On 25 October 2013 12:28, Darragh O'Reilly <dara2002-openstack at yahoo.com> wrote:
>
>Hi Thiago,
>>
>>
>>for the VIF error: you will need to change qemu.conf as described here:
>>http://openvswitch.org/openstack/documentation/
>>
>>
>>Re, Darragh.
>>
>>
>>
>>
>>On Friday, 25 October 2013, 15:14, Martinx - ジェームズ <thiagocmartinsc at gmail.com> wrote:
>> 
>>Hi Darragh,
>>>
>>>
>>>Yes, Instances are getting MTU 1400.
>>>
>>>
>>>I'm using LibvirtHybridOVSBridgeDriver on my Compute Nodes. I'll check bug 1223267 right now!
>>>
>>>
>>>
>>>
>>>The LibvirtOpenVswitchDriver doesn't work, look:
>>>
>>>
>>>http://paste.openstack.org/show/49709/
>>>
>>>
>>>
>>>http://paste.openstack.org/show/49710/
>>>
>>>
>>>
>>>
>>>
>>>My NICs are "RTL8111/8168/8411 PCI Express Gigabit Ethernet"; the hypervisors' motherboards are MSI-890FXA-GD70.
>>>
>>>
>>>The command "ethtool -K eth1 gro off" did not have any effect on the communication between instances on different hypervisors: it is still poor, around 248Mbit/sec, while the physical path (where GRE is built) reaches 1Gbit/s.
>>>
>>>
>>>My Linux version is "Linux hypervisor-1 3.8.0-32-generic #47~precise1-Ubuntu", with the same kernel on the Network Node and the other nodes too (Ubuntu 12.04.3 installed from scratch for this Havana deployment).
>>>
>>>
>>>The only difference I can see right now between my two hypervisors is that the second is just a spare machine with a slow CPU, but I don't think that will have a negative impact on network throughput, since I have only 1 Instance running on it (plus a qemu-nbd process eating 90% of its CPU). I'll replace this CPU tomorrow and redo these tests, but I don't think that this is the source of my problem. The MOBOs of the two hypervisors are identical, with 1 3Com (manageable) switch connecting the two.
>>>
>>>
>>>Thanks!
>>>Thiago
>>>
>>>
>>>
>>>On 25 October 2013 07:15, Darragh O'Reilly <dara2002-openstack at yahoo.com> wrote:
>>>
>>>Hi Thiago,
>>>>
>>>>you have configured DHCP to push out an MTU of 1400. Can you confirm that the 1400 MTU is actually getting out to the instances by running 'ip link' on them?
>>>>
>>>>There is an open problem where the veth used to connect the OVS and Linux bridges causes a performance drop on some kernels - https://bugs.launchpad.net/nova-project/+bug/1223267 .  If you are using the LibvirtHybridOVSBridgeDriver VIF driver, can you try changing to LibvirtOpenVswitchDriver and repeat the iperf test between instances on different compute-nodes.
>>>>
>>>>What NICs (maker+model) are you using? You could try disabling any off-load functionality: 'ethtool -k <iface-used-for-gre>' lists the current settings and 'ethtool -K <iface-used-for-gre> <feature> off' turns one off.
>>>>
>>>>What kernel are you using: 'uname -a'?
>>>>
>>>>Re, Darragh.
>>>>
>>>>
>>>>> Hi Daniel,
>>>>
>>>>>
>>>>> I followed that page, and my Instances' MTU is lowered by the DHCP Agent, but
>>>>> the result is the same: poor network performance (internally between
>>>>> Instances and when trying to reach the Internet).
>>>>>
>>>>> No matter whether I use "dnsmasq_config_file=/etc/neutron/dnsmasq-neutron.conf" +
>>>>> "dhcp-option-force=26,1400" for my Neutron DHCP agent or not (i.e. MTU =
>>>>> 1500), the result is almost the same.
>>>>>
>>>>> I'll try VXLAN (or just VLANs) this weekend to see if I can get better
>>>>> results...
>>>>>
>>>>> Thanks!
>>>>> Thiago
>>>>
>>>>
>>>
>>>
>>>
>
>
>