[Openstack] Directional network performance issues with Neutron + OpenvSwitch
Martinx - ジェームズ
thiagocmartinsc at gmail.com
Sun Oct 27 22:01:31 UTC 2013
I have a small report from my latest tests.
* Namespace (br-ex) *<->* Internet - OK
* Namespace (vxlan,gre,vlan) *<->* Tenant - OK
* Tenant *<->* Namespace *<->* Internet - *NOT-OK* (Very slow / Unstable /
Since the connectivity from the Tenant to its Namespace is fine, and from the
Namespace to the Internet is fine too, it came to my mind: Hey, why
not run Squid WITHIN the Tenant Namespace as a workaround?!
And... Voilà! There I "Fixed" It! =P
Tenant *<->* *Namespace with Squid* *<->* Internet - OK!
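For reference, the workaround boils down to starting Squid through `ip netns exec` so that it runs inside the namespace. Below is a minimal Python sketch that only builds that command line and does not execute it. The namespace name (Neutron router namespaces are normally named `qrouter-<router-uuid>`), the all-zero UUID, the `squid` binary name and the config path are all assumptions for illustration, not values taken from this setup.

```python
import shlex


def squid_in_netns_cmd(namespace, squid_bin="squid",
                       squid_conf="/etc/squid/squid.conf"):
    """Build the argv that would run Squid inside a network namespace.

    Nothing is executed here; the caller decides how to run it (as root).
    The binary name and config path defaults are assumptions and differ
    between distributions and Squid packages.
    """
    if not namespace:
        raise ValueError("namespace name must not be empty")
    # `ip netns exec <ns> <cmd>` runs <cmd> inside namespace <ns>.
    # `-N` keeps Squid in the foreground, `-f` selects the config file.
    return ["ip", "netns", "exec", namespace, squid_bin, "-N", "-f", squid_conf]


# Hypothetical namespace name; list the real ones with `ip netns list`.
cmd = squid_in_netns_cmd("qrouter-00000000-0000-0000-0000-000000000000")
print(" ".join(shlex.quote(part) for part in cmd))
```

The instances would then have to be pointed at the proxy address inside the namespace, which is why this is only a workaround and not a fix.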
*NOTE:* I'm sure that the entire ethernet path (without L3, Namespace, OVS,
VXLANs, GREs, or Linux bridges, just plain Linux + IPs), *from the
hypervisor to the Internet*, *passing through the same Network Node hardware
/ path*, is working smoothly. I mean, I tested the entire path BEFORE
installing OpenStack Havana... So, it cannot be an "infrastructure /
hardware" issue, it must be something else, located at the software layer
running within the Network Node itself.
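One software-level thing that can be compared between interfaces on the Network Node is the offload state (the GRO / CKO settings mentioned in the quoted thread below). The following is a rough sketch, assuming the usual `name: on|off` line format printed by `ethtool -k <iface>`. The sample text is made up for illustration and was not captured from the machines discussed here.

```python
def parse_offloads(ethtool_k_output):
    """Parse `ethtool -k <iface>` text into {feature_name: bool}."""
    features = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        words = value.split()
        # The "Features for ethX:" header has no on/off value; skip it.
        if not words or words[0] not in ("on", "off"):
            continue
        features[name.strip()] = words[0] == "on"
    return features


def offload_differences(a, b):
    """Return the features whose on/off state differs between a and b."""
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


# Illustrative sample output only (hypothetical interfaces).
SAMPLE_A = """Features for eth0:
rx-checksumming: on
tx-checksumming: on
generic-receive-offload: on
"""
SAMPLE_B = """Features for eth1:
rx-checksumming: on
tx-checksumming: off [fixed]
generic-receive-offload: off
"""

diff = offload_differences(parse_offloads(SAMPLE_A), parse_offloads(SAMPLE_B))
print(diff)
```

A difference found this way would only be a hint about where to look, not proof of the cause.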
I'm about to send more info about this problem.
On 26 October 2013 13:57, Martinx - ジェームズ <thiagocmartinsc at gmail.com> wrote:
> Hi Darragh,
> Yes, on the same net-node machine, Grizzly works, Havana doesn't... But, for
> Grizzly, I have Ubuntu 12.04 with Linux 3.2 and OVS 1.4.0-1ubuntu1.6.
> If I replace the Havana net-node hardware entirely, the problem persists
> (i.e. it "follows" Havana net-node), so, I think, it can not be related to
> the hardware.
> I tried Havana with both OVS 1.10.2 (from Cloud Archive) and with OVS
> 1.11.0 (compiled and installed by myself using dpkg-buildpackage / dpkg).
> My logs (including Open vSwitch) right after starting an Instance (nothing
> in the OVS logs):
> I tried everything, including installing the Network Node on top of a KVM
> virtual machine or directly on a dedicated server, same result, the problem
> follows the Havana node (virtual or physical). Grizzly Network Node works both
> on a KVM VM or on a dedicated server.
> On 26 October 2013 06:28, Darragh OReilly <darragh.oreilly at yahoo.com> wrote:
>> Hi Thiago,
>> so just to confirm - on the same netnode machine, with the same OS,
>> kernel and OVS versions - Grizzly is ok and Havana is not?
>> Also, on the network node, are there any errors in the neutron logs, the
>> syslog, or /var/log/openvswitch/* ?
>> Re, Darragh.
>> On Saturday, 26 October 2013, 5:25, Martinx - ジェームズ <
>> thiagocmartinsc at gmail.com> wrote:
>> LOL... One day, Internet via "Quantum Entanglement"! Oops, Neutron! =P
>> I'll ignore the problems related to the "performance between two
>> instances on different hypervisors" for now. My priority is the
>> connectivity issue with the External networks... At least, internal is slow
>> but it works.
>> I'm about to remove the L3 Agent / Namespaces entirely from my
>> topology... It is a shame because it is pretty cool! With Grizzly I had no
>> problems at all. Plus, I need to put Havana into production ASAP! :-/
>> Why am I giving up on it (L3 / NS) for now? Because I tried:
>> The option "tenant_network_type" with gre, vxlan and vlan (range
>> physnet1:206:256 configured at the 3Com switch as tagged).
>> From the instances, the connection with External network *is always slow*,
>> no matter whether I choose GRE, VXLAN or VLAN for the Tenants.
>> For example, right now, I'm using VLAN, same problem.
>> Don't you guys think that this can be a problem with the bridge "br-ex"
>> and its internals ? Since I swapped the "Tenant Network Type" 3 times, same
>> result... But I still have not removed br-ex from the scene.
>> If someone wants to debug it, I can give the root password, no problem,
>> it is just a lab... =)
>> On 25 October 2013 19:45, Rick Jones <rick.jones2 at hp.com> wrote:
>> On 10/25/2013 02:37 PM, Martinx - ジェームズ wrote:
>> WOW!! Thank you for your time Rick! Awesome answer!! =D
>> I'll do these tests (with ethtool GRO / CKO) tonight but, do you think
>> that this is the main root of the problem?!
>> I mean, I'm seeing two distinct problems here:
>> 1- Slow connectivity to the External network plus SSH lags all over the
>> cloud (everything that passes through L3 / Namespace is problematic), and;
>> 2- Communication between two Instances on different hypervisors (i.e.
>> maybe it is related to this GRO / CKO thing).
>> So, two different problems, right?!
>> One or two problems I cannot say. Certainly if one got the benefit of
>> stateless offloads in one direction and not the other, one could see
>> different performance limits in each direction.
>> All I can really say is I liked it better when we were called Quantum,
>> because then I could refer to it as "Spooky networking at a distance."
>> Sadly, describing Neutron as "Networking with no inherent charge" doesn't
>> work as well :)
>> rick jones