[Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

Kevin Benton kevin at benton.pub
Wed May 31 17:15:17 UTC 2017


No prob. Thanks for replying.

On May 31, 2017 10:11 AM, "Gustavo Randich" <gustavo.randich at gmail.com>
wrote:

> Hi Kevin, I confirm that applying the patch fixes the problem.
>
> Sorry for the inconvenience.
>
>
> On Tue, May 30, 2017 at 9:36 PM, Kevin Benton <kevin at benton.pub> wrote:
>
>> Do you have that patch already in your environment? If not, can you
>> confirm it fixes the issue?
>>
>> On Tue, May 30, 2017 at 9:49 AM, Gustavo Randich <
>> gustavo.randich at gmail.com> wrote:
>>
>>> While dumping OVS flows as you suggested, we finally found the cause of
>>> the problem: our br-ex OVS bridge lacked the secure fail mode configuration.
>>>
>>> Maybe the issue is related to this:
>>> https://bugs.launchpad.net/neutron/+bug/1607787
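For anyone hitting the same symptom: the fail mode that turned out to be missing on our br-ex can be checked and set per bridge with ovs-vsctl. A sketch, assuming the bridge names from our deployment:

```shell
# Show the current fail mode; empty output means the default
# ("standalone"), in which OVS drops its OpenFlow flows and falls
# back to acting as a plain learning switch once the controller
# connection times out.
ovs-vsctl get-fail-mode br-ex

# "secure" keeps the installed flows in place while the controller
# (here, the local neutron-openvswitch-agent) is unreachable.
ovs-vsctl set-fail-mode br-ex secure
```

These are configuration commands against the local OVSDB; the agent normally sets secure mode itself on the bridges it manages, which is what the patch referenced above addresses.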
>>>
>>> Thank you
>>>
>>>
>>> On Fri, May 26, 2017 at 6:03 AM, Kevin Benton <kevin at benton.pub> wrote:
>>>
>>>> Sorry about the long delay.
>>>>
>>>> Can you dump the OVS flows before and after the outage? This will let
>>>> us know whether the flows Neutron set up are getting wiped out.
>>>>
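If it helps anyone else reproduce this, one way to capture the before/after state Kevin is asking for, as a sketch (bridge names assumed from a standard Neutron OVS deployment; br-tun may need `-O OpenFlow13` on some setups):

```shell
# Capture the OpenFlow flows on each Neutron bridge before the outage
for br in br-ex br-int br-tun; do
    ovs-ofctl dump-flows "$br" > "/tmp/${br}-flows-before.txt"
done

# ...after the outage, capture again and diff, to see whether the
# flows Neutron installed were wiped out
for br in br-ex br-int br-tun; do
    ovs-ofctl dump-flows "$br" > "/tmp/${br}-flows-after.txt"
    diff -u "/tmp/${br}-flows-before.txt" "/tmp/${br}-flows-after.txt"
done
```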
>>>> On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich <
>>>> gustavo.randich at gmail.com> wrote:
>>>>
>>>>> Hi Kevin, here is some information about this issue:
>>>>>
>>>>> - if the network outage lasts less than ~1 minute, connectivity to
>>>>> the host and its instances is restored automatically without problems
>>>>>
>>>>> - otherwise:
>>>>>
>>>>>     - upon the outage, "ovs-vsctl show" reports "is_connected: true"
>>>>> on all bridges (br-ex / br-int / br-tun)
>>>>>
>>>>>     - after about a minute, "ovs-vsctl show" ceases to show
>>>>> "is_connected: true" on every bridge
>>>>>
>>>>>     - upon restoring the physical interface (fixing the outage):
>>>>>
>>>>>         - "ovs-vsctl show" again reports "is_connected: true" on all
>>>>> bridges (br-ex / br-int / br-tun)
>>>>>
>>>>>         - access to the host and VMs is NOT restored, although the
>>>>> host sporadically answers some pings (~1 out of 20)
>>>>>
>>>>> - to restore connectivity, we:
>>>>>
>>>>>     - execute "ifdown br-ex; ifup br-ex" -> access to the host is
>>>>> restored, but not to the VMs
>>>>>
>>>>>     - restart neutron-openvswitch-agent -> access to the VMs is
>>>>> restored
>>>>>
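The two recovery steps above as a single sketch (assumes Ubuntu-style ifdown/ifup scripts and a systemd unit named neutron-openvswitch-agent; both vary by distribution):

```shell
# Bounce the external bridge interface to recover host connectivity
ifdown br-ex && ifup br-ex

# Restart the OVS agent so it re-installs the Neutron flows,
# recovering VM connectivity
systemctl restart neutron-openvswitch-agent
```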
>>>>> Thank you!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton <kevin at benton.pub>
>>>>> wrote:
>>>>>
>>>>>> With the network down, does ovs-vsctl show that it is connected to
>>>>>> the controller?
>>>>>>
>>>>>> On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich <
>>>>>> gustavo.randich at gmail.com> wrote:
>>>>>>
>>>>>>> Exactly, we access via a tagged interface, which is part of br-ex
>>>>>>>
>>>>>>> # ip a show vlan171
>>>>>>> 16: vlan171: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc
>>>>>>> noqueue state UNKNOWN group default qlen 1
>>>>>>>     link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
>>>>>>>     inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>>>>>>>        valid_lft forever preferred_lft forever
>>>>>>>     inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>>>>>>>        valid_lft forever preferred_lft forever
>>>>>>>
>>>>>>> # ovs-vsctl show
>>>>>>>     ...
>>>>>>>     Bridge br-ex
>>>>>>>         Controller "tcp:127.0.0.1:6633"
>>>>>>>             is_connected: true
>>>>>>>         Port "vlan171"
>>>>>>>             tag: 171
>>>>>>>             Interface "vlan171"
>>>>>>>                 type: internal
>>>>>>>     ...
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton <kevin at benton.pub>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Ok, that's likely not the issue then. I assume the way you access
>>>>>>>> each host is via an IP assigned to an OVS bridge or an interface that
>>>>>>>> somehow depends on OVS?
>>>>>>>>
>>>>>>>> On Apr 28, 2017 12:04, "Gustavo Randich" <gustavo.randich at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Kevin, we are using the default listen address of loopback
>>>>>>>>> interface:
>>>>>>>>>
>>>>>>>>> # grep -r of_listen_address /etc/neutron
>>>>>>>>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address
>>>>>>>>> = 127.0.0.1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>         tcp/127.0.0.1:6640 -> ovsdb-server
>>>>>>>>> /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info
>>>>>>>>> --remote=punix:/var/run/openvswitch/db.sock
>>>>>>>>> --private-key=db:Open_vSwitch,SSL,private_key
>>>>>>>>> --certificate=db:Open_vSwitch,SSL,certificate
>>>>>>>>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>>>>>>>>> --log-file=/var/log/openvswitch/ovsdb-server.log
>>>>>>>>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton <kevin at benton.pub>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Are you using an of_listen_address value of an interface being
>>>>>>>>>> brought down?
>>>>>>>>>>
>>>>>>>>>> On Apr 25, 2017 17:34, "Gustavo Randich" <
>>>>>>>>>> gustavo.randich at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN /
>>>>>>>>>>> l2_population)
>>>>>>>>>>>
>>>>>>>>>>> This sounds very strange (to me): recently, after a switch
>>>>>>>>>>> outage, we lost connectivity to all our Mitaka hosts. We had to log in
>>>>>>>>>>> via iLO, host by host, and restart the networking service to regain
>>>>>>>>>>> access, then restart neutron-openvswitch-agent to regain access to the VMs.
>>>>>>>>>>>
>>>>>>>>>>> At first glance we thought it was a problem with the NIC linux
>>>>>>>>>>> driver of the hosts not detecting link state correctly.
>>>>>>>>>>>
>>>>>>>>>>> Then we reproduced the issue simply by bringing the physical
>>>>>>>>>>> interfaces down for around 5 minutes and then up again. Same issue.
>>>>>>>>>>>
>>>>>>>>>>> And then... we found that if, instead of the native (ryu)
>>>>>>>>>>> OpenFlow interface in the Neutron Open vSwitch agent, we used
>>>>>>>>>>> ovs-ofctl, the problem disappeared.
>>>>>>>>>>>
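For reference, the OpenFlow driver is selected in the agent configuration; the workaround described above looks roughly like this (a sketch assuming the stock openvswitch_agent.ini layout and option group):

```ini
# /etc/neutron/plugins/ml2/openvswitch_agent.ini
[ovs]
# switch from the native (ryu) driver to shelling out to ovs-ofctl
of_interface = ovs-ofctl
```

A restart of neutron-openvswitch-agent is needed for the change to take effect.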
>>>>>>>>>>> Any clue?
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>>>>>>>>> Post to     : openstack at lists.openstack.org
>>>>>>>>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
