[Openstack] Directional network performance issues with Neutron + OpenvSwitch

Darragh O'Reilly dara2002-openstack at yahoo.com
Mon Oct 28 09:00:17 UTC 2013


Thiago,

some more answers below.

Btw: I saw the problem with a "qemu-nbd -c" process using all the CPU on the compute node. It happened just once - it must be a bug in it. You can disable libvirt injection if you don't want it by setting "libvirt_inject_partition = -2" in nova.conf.
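For reference, a minimal sketch of that change, assuming the default /etc/nova/nova.conf path on the compute nodes (on Havana the option lives in the [DEFAULT] section):

    # add to [DEFAULT] in /etc/nova/nova.conf on each compute node:
    #   libvirt_inject_partition = -2    # -2 turns injection off completely
    # then restart the compute service so it takes effect:
    sudo service nova-compute restart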


On Saturday, 26 October 2013, 16:58, Martinx - ジェームズ <thiagocmartinsc at gmail.com> wrote:

Hi Darragh,
>
>
>Yes, on the same net-node machine, Grizzly works, Havana doesn't... But, for Grizzly, I have Ubuntu 12.04 with Linux 3.2 and OVS 1.4.0-1ubuntu1.6.


So we don't know whether the problem is due to Neutron, the Ubuntu kernel or OVS. I suspect the kernel, as it implements the routing/NAT, the interfaces and the namespaces. I don't think Neutron Havana changes how these things are set up very much.

Can you try running Havana on a network node with the Linux 3.2 kernel?
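If you do, it would help to record exactly what each network node is running; a rough sketch, assuming the Ubuntu / Cloud Archive packages:

    uname -r                          # kernel version
    ovs-vsctl --version               # OVS userspace version
    dpkg -l | grep -i openvswitch     # installed OVS packages
    lsmod | grep openvswitch          # confirm the OVS datapath module is loaded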


>
>
>If I replace the Havana net-node hardware entirely, the problem persists (i.e. it "follows" the Havana net-node), so I think it cannot be related to the hardware.
>
>
>I tried Havana with both OVS 1.10.2 (from Cloud Archive) and with OVS 1.11.0 (compiled and installed by myself using dpkg-buildpackage / dpkg).
>
>
>My logs (including Open vSwitch) right after starting an Instance (nothing at OVS logs):
>
>
>http://paste.openstack.org/show/49870/
>
>
>
>I tried everything, including installing the Network Node on top of a KVM virtual machine or directly on a dedicated server - same result, the problem follows the Havana node (virtual or physical). The Grizzly Network Node works both on a KVM VM and on a dedicated server.
>
>
>Regards,
>Thiago
>
>
>
>On 26 October 2013 06:28, Darragh OReilly wrote:
>
>Hi Thiago,
>>
>>so just to confirm - on the same net-node machine, with the same OS, kernel and OVS versions - Grizzly is ok and Havana is not?
>>
>>Also, on the network node, are there any errors in the neutron logs, the syslog, or /var/log/openvswitch/* ?
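A quick way to scan those on the network node, assuming the stock Ubuntu log locations, would be something like:

    grep -i error /var/log/neutron/*.log
    grep -i error /var/log/openvswitch/*
    grep -i -e error -e warn /var/log/syslog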
>>
>>
>>
>>Re, Darragh.
>>
>>
>>
>>
>>On Saturday, 26 October 2013, 5:25, Martinx - ジェームズ <thiagocmartinsc at gmail.com> wrote:
>> 
>>>LOL... One day, Internet via "Quantum Entanglement"! Oops, Neutron!     =P
>>>
>>>
>>>
>>>I'll ignore the problems related to the "performance between two instances on different hypervisors" for now. My priority is the connectivity issue with the External networks... At least the internal network, while slow, works.
>>>
>>>
>>>I'm about to remove the L3 Agent / Namespaces entirely from my topology... It is a shame because it is pretty cool! With Grizzly I had no problems at all. Plus, I need to put Havana into production ASAP!    :-/
>>>
>>>
>>>Why am I giving up (on L3 / NS) for now? Because I tried:
>>>
>>>
>>>The option "tenant_network_type" with gre, vxlan and vlan (range physnet1:206:256 configured at the 3Com switch as tagged).
>>>
>>>
>>>From the instances, the connection to the External network is always slow, no matter whether I choose GRE, VXLAN or VLAN for the tenants.
>>>
>>>
>>>For example, right now, I'm using VLAN, same problem.
>>>
>>>
>>>Don't you guys think this could be a problem with the bridge "br-ex" and its internals? I have swapped the "Tenant Network Type" 3 times with the same result... But I still have not removed br-ex from the scene.
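A couple of quick checks that might help narrow down whether br-ex itself is involved (a sketch only; the qrouter-<uuid> name is a placeholder you would take from "ip netns list"):

    ovs-vsctl show                            # is the external NIC really a port on br-ex?
    ovs-ofctl dump-flows br-ex                # flows installed on the external bridge
    ip netns list                             # find the qrouter-<uuid> namespace
    ip netns exec qrouter-<uuid> ip addr      # check the qg- interface inside the router namespace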
>>>
>>>
>>>If someone wants to debug it, I can give the root password, no problem, it is just a lab...   =)
>>>
>>>
>>>Thanks!
>>>Thiago
>>>
>>>
>>>On 25 October 2013 19:45, Rick Jones <rick.jones2 at hp.com> wrote:
>>>
>>>On 10/25/2013 02:37 PM, Martinx - ジェームズ wrote:
>>>>
>>>>WOW!! Thank you for your time Rick! Awesome answer!!    =D
>>>>>
>>>>>I'll do these tests (with ethtool GRO / CKO) tonight but, do you think
>>>>>this is the root of the problem?!
>>>>>
>>>>>
>>>>>I mean, I'm seeing two distinct problems here:
>>>>>
>>>>>1- Slow connectivity to the External network plus SSH lag all over the
>>>>>cloud (everything that passes through L3 / Namespaces is problematic), and;
>>>>>
>>>>>2- Communication between two Instances on different hypervisors (i.e.
>>>>>maybe it is related to this GRO / CKO thing).
>>>>>
>>>>>
>>>>>So, two different problems, right?!
>>>>>
>>>>
>>>>Whether it is one or two problems, I cannot say. Certainly if one got the benefit of stateless offloads in one direction and not the other, one could see different performance limits in each direction.
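For what it's worth, a rough way to compare the offload settings on the interfaces in each direction (eth0 here is just a placeholder for whichever NIC or qg-/qr- device you are looking at):

    ethtool -k eth0 | grep -i -e generic-receive -e checksumming   # show GRO / checksum offload state
    sudo ethtool -K eth0 gro off                                   # temporarily disable GRO to test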
>>>>
>>>>All I can really say is I liked it better when we were called Quantum, because then I could refer to it as "Spooky networking at a distance."  Sadly, describing Neutron as "Networking with no inherent charge" doesn't work as well :)
>>>>
>>>>rick jones
>>>>
>>>>
>>>
>>>
>>>
>
>
>



