[Openstack] [Neutron] asymetric DHCP brokenness on tenant GRE networks

Jonathan Proulx jon at jonproulx.com
Mon Feb 3 15:42:43 UTC 2014


Turns out my issue is simply that Havana requires Open vSwitch >= 1.10
and my compute nodes were still running 1.4.  I somehow managed to
miss the errors in the logs (I didn't look far enough back) indicating
the failure to properly create the flooding flow on the tunnel bridge:

/var/log/neutron/openvswitch-agent.log.4.gz:Command: ['sudo',
'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-ofctl',
'mod-flows', 'br-tun',
'hard_timeout=0,idle_timeout=0,priority=1,table=21,dl_vlan=1,actions=strip_vlan,set_tunnel:3,output:4,58,56,11,12,47,13,48,49,44,43,45,46,30,31,29,28,26,27,24,25,32,19,21,59,60,57,6,5,20,18,17,16,15,14,7,9,8,53,10,3,2,38,37,39,40,34,23,36,35,22,42,41,54,52,51,50,55,33']
/var/log/neutron/openvswitch-agent.log.4.gz:Stderr: 'ovs-ofctl:
unknown keyword hard_timeout\n'
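
Since the error above only showed up in an older, rotated log, here is a
minimal sketch of scanning every agent log, gzipped or not, for ovs-ofctl
failures. It assumes the rotated files sit next to the one quoted above and
that each failure is logged on a single line containing both "Stderr:" and
"ovs-ofctl:"; the function names are made up for illustration.

```python
import glob
import gzip


def is_ofctl_failure(line):
    """True for agent log lines that report an ovs-ofctl error on stderr."""
    return "Stderr:" in line and "ovs-ofctl:" in line


def scan_log(path):
    """Return the ovs-ofctl failure lines found in one (optionally gzipped) log."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if is_ofctl_failure(line)]


if __name__ == "__main__":
    # Assumed location: the directory of the log file quoted in this message.
    for log_path in sorted(glob.glob("/var/log/neutron/openvswitch-agent.log*")):
        for hit in scan_log(log_path):
            print("%s: %s" % (log_path, hit))
```

Run as root (or as a user that can read /var/log/neutron) it prints one line
per failure, prefixed with the file it came from.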

v1.10 was installed during the upgrade, but since it's tied to a kernel
module, the compute nodes need a reboot before it takes effect.
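
For checking nodes against the 1.10 minimum, a minimal sketch of a numeric
version comparison follows. The sample string mimics the first line printed
by `ovs-vsctl --version`, which is an assumption about the output format;
the helper names are hypothetical. Comparing the versions as plain strings
would get this wrong, because "1.4" sorts after "1.10".

```python
import re

# Minimum Open vSwitch version for Havana, as stated in this message.
MIN_OVS = (1, 10)


def parse_ovs_version(text):
    """Extract (major, minor) from text such as 'ovs-vsctl (Open vSwitch) 1.4.0'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return int(match.group(1)), int(match.group(2))


def meets_minimum(text, minimum=MIN_OVS):
    """True if the version found in text is at least the required minimum."""
    return parse_ovs_version(text) >= minimum


if __name__ == "__main__":
    sample = "ovs-vsctl (Open vSwitch) 1.4.0"  # assumed output format
    print(sample, "->", "ok" if meets_minimum(sample) else "too old")
```

The same check can be applied to whatever version the loaded kernel module
reports, since the packaged version and the running module can differ until
the node is rebooted.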

On Fri, Jan 31, 2014 at 3:30 AM, Ruzicka, Marek
<marek.ruzicka at t-systems.sk> wrote:
> Hi Jon,
>
> By any chance, do you have any kind of asymmetric routing in place?
> This is definitely a long shot, since I have no idea about your setup, but we have experienced similar issues ourselves.
> In our case it was a problem with asymmetric routing and rather dumb Linux defaults when it comes to ARP settings.
>
> Check your current settings, and if they differ, try these:
>
> net.ipv4.conf.all.arp_announce=1
> net.ipv4.conf.default.arp_announce=1
> net.ipv4.conf.all.arp_notify=1
> net.ipv4.conf.default.arp_notify=1
> net.ipv4.conf.all.rp_filter=0
> net.ipv4.conf.default.rp_filter=0
>
> Just shooting from the hip, so sorry if I'm completely wrong here.
>
> Marek
>
> -----Original Message-----
> From: Jonathan Proulx [mailto:jon at jonproulx.com]
> Sent: 30 January 2014 19:11
> To: Robert Collins
> Cc: openstack at lists.openstack.org
> Subject: Re: [Openstack] [Neutron] asymetric DHCP brokenness on tenant GRE networks
>
> Still can't quite sort this out but I am circling in on where the problem is.
>
> To recap: bootpc and ARP requests from instances using GRE tenant networks are not making it onto the physical network.  I suspect this applies to all broadcast traffic.  If the IP is configured statically and the ARP cache is populated (by pinging from the other end, the network controller in this case), I can communicate over the link until the ARP cache times out...
>
> By fiddling with OVS port mirroring I've been able to determine where the packets disappear from my expected path (and verified that packets are visible at these points when traffic is passing):
>
>
> tap (has packets) -> patch-tun (has packets) -> patch-int (still there) -> gre-<N> (no packets) -> eth0 (no packets)
> \__________________________________________/    \_____________________________________________/    (GRE wrapped)
>                    br-int                                           br-tun                         IP of tunnel endpoint
>
>
> That will probably get mangled by line wrapping, but packets make it to the tunnel bridge, br-tun, on the patch-int interface and do not make it onto the gre-<N> interface.  This is consistent across multiple GRE networks, including newly created ones.  The provider VLAN networks most of our instances use function normally (on a much different path).  GRE definitely used to work with Grizzly, though I'm not sure whether it broke on upgrade or since then, as the GRE networks are not widely used.
>
> so my basic question remains WTF?
>
> -Jon
>



