[Openstack-operators] ML2/OVS odd GRE brokenness

Jonathan D. Proulx jon at csail.mit.edu
Wed Nov 9 14:48:55 UTC 2016


On Tue, Nov 08, 2016 at 06:27:34PM -0500, George Mihaiescu wrote:
:Hi Jonathan,
:
:The openvswitch-agent is out of sync on compute 4, try restarting it.

Nope.

Restarted the agent on 4 and, when that failed, also on 1; they still
can't talk.

I had previously migrated 4 to a new host to test that theory.

Also, if I start 5 (also on a new hypervisor, separate from all the
others), 4 and 5 can talk to each other, but neither 4 nor 5 can talk
to 1.

-Jon

:
:
:
:> On Nov 8, 2016, at 17:43, Jonathan Proulx <jon at csail.mit.edu> wrote:
:> 
:> 
:> I have an odd issue that seems to just be affecting one private
:> network for one tenant, though I saw a similar thing on a different
:> project network recently which I 'fixed' by rebooting the hypervisor.
:> Since this has now (maybe) happened twice I figure I should try to
:> understand what it is.
:> 
:> Given the following four VMs on 4 different hypervisors
:> 
:> vm1 on Hypervisor1
:> vm2 on Hypervisor2
:> vm3 on Hypervisor3
:> -------------------
:> vm4 on Hypervisor4
:> 
:> 
:> vm1 through vm3 talk fine among themselves, but none can talk to vm4.
:> 
:> Examining ping traffic from vm1 to vm4, I can see ARP requests
:> and responses at vm4 and GRE-encapsulated ARP responses on
:> Hypervisor1's physical interface.
:> 
:> They look the same to me (same encap ID) coming in as the working VMs'
:> traffic, but they never make it to the qvo device, which is before
:> iptables sec_group rules are applied at the tap device.
:> 
:> Attempting to tear down and recreate this results in the same split:
:> the first 3 work and the last one doesn't (possibly because the
:> scheduler puts them in the same place? haven't checked).
:> 
:> ovs-vsctl -- set Bridge br-int mirrors=@m  -- --id=@snooper2 get Port snooper2  -- --id=@gre-801e0347 get Port gre-801e0347 -- --id=@m create Mirror name=mymirror select-dst-port=@gre-801e0347 select-src-port=@gre-801e0347 output-port=@snooper2
:> 
:> tcpdump -i snooper2 
:> 
:> This only sees ARP requests but no responses. What's broken if I can
:> see GRE-encapsulated ARP responses on the physical interface but not
:> on the gre-<hex> interface? And why is it not broken for all tunnel
:> endpoints?
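
The mirror setup above is easier to read and to undo when wrapped in a
small dry-run helper. The following is only a sketch: mirror_cmd and
unmirror_cmd are hypothetical function names, the @src/@out labels are
illustrative replacements for the IDs used above, and the output port
(snooper2 here) is assumed to exist on the bridge already. The functions
print the ovs-vsctl commands rather than running them, so they can be
reviewed first.

```shell
#!/bin/sh
# Dry-run sketch: print the ovs-vsctl commands instead of executing them.
# mirror_cmd and unmirror_cmd are hypothetical helpers, not part of OVS.

# Print the command that mirrors both directions of one port.
# $1 = bridge, $2 = port to watch, $3 = port that receives the copies
mirror_cmd() {
    printf '%s\n' "ovs-vsctl -- set Bridge $1 mirrors=@m -- --id=@out get Port $3 -- --id=@src get Port $2 -- --id=@m create Mirror name=mymirror select-dst-port=@src select-src-port=@src output-port=@out"
}

# Print the command that removes every mirror from the bridge again.
# $1 = bridge
unmirror_cmd() {
    printf '%s\n' "ovs-vsctl clear Bridge $1 mirrors"
}

mirror_cmd br-int gre-801e0347 snooper2
unmirror_cmd br-int
```

Note that clearing the mirrors column removes all mirrors on that bridge,
not just the one created here, so check for existing mirrors before
running the printed cleanup command.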
:> 
:> Oddly if I boot a 5th VM on a 5th hypervisor it can talk to 4 but not 1-3 ...
:> 
:> hypervisors are Ubuntu 14.04 running Mitaka from cloud archive w/
:> xenial-lts kernels (4.4.0)
:> 
:> -Jon
:> 
:> -- 
:> 


