[Openstack-operators] Mitaka to Newton networking issues

Kevin Benton kevin at benton.pub
Tue Dec 6 14:07:45 UTC 2016


There was a bug, for which the fixes only recently merged, where removing a
router on the L3 agent was done in the wrong order, which caused problems
cleaning up the interfaces with Linux Bridge + L3HA.
https://bugs.launchpad.net/neutron/+bug/1629159

There may be an orphaned veth pair left in a deleted namespace from the
same router when it was removed from the L3 agent.

For each L3 agent, can you shut down the L3 agent, run the netns cleanup
script, ensure all keepalived processes are dead, and then start the agent
again?
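
The steps above could be sketched roughly as follows. This is only a dry-run helper under stated assumptions: the service name `neutron-l3-agent`, the use of `service ... stop/start`, and the config file paths are common defaults that are not confirmed anywhere in this thread, and the function names are hypothetical. It prints the commands rather than running them, and `keepalived_pids()` only parses `ps -eo pid,comm` output so you can confirm that nothing is left before starting the agent again. Depending on the state of the namespaces, the cleanup script may also need its `--force` option; that is left out here.

```python
import shlex

# Assumed names: the service name and config paths below are common
# defaults, not taken from this thread; adjust them for your deployment.
L3_SERVICE = "neutron-l3-agent"
CLEANUP_CMD = [
    "neutron-netns-cleanup",
    "--config-file", "/etc/neutron/neutron.conf",
    "--config-file", "/etc/neutron/l3_agent.ini",
]


def keepalived_pids(ps_output):
    """Return PIDs of keepalived processes found in `ps -eo pid,comm` output."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and blank lines; match the command name exactly.
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip() == "keepalived":
            pids.append(int(parts[0]))
    return pids


def restart_plan(service=L3_SERVICE):
    """Return the commands for one L3 agent host, in the order they should run."""
    return [
        ["service", service, "stop"],
        CLEANUP_CMD,
        ["ps", "-eo", "pid,comm"],  # feed this output to keepalived_pids()
        ["service", service, "start"],
    ]


if __name__ == "__main__":
    # Dry run: print the plan instead of executing anything.
    for cmd in restart_plan():
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

If `keepalived_pids()` still returns PIDs after the agent is stopped and the cleanup has run, those processes would need to be killed before the agent is started again.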

On Tue, Dec 6, 2016 at 4:59 AM, Grant Morley <grant at absolutedevops.io>
wrote:

> They both appear to be "ACTIVE" which is what I would expect:
>
> root@management-1-utility-container-f1222d05:~# neutron port-show 8cd027f1-9f8c-4077-9c8a-92abc62fadd4
> +-----------------------+--------------------------------------------------------------------------------------+
> | Field                 | Value                                                                                |
> +-----------------------+--------------------------------------------------------------------------------------+
> | admin_state_up        | True                                                                                 |
> | allowed_address_pairs |                                                                                      |
> | binding:host_id       | network-1-neutron-agents-container-11d47568                                          |
> | binding:profile       | {}                                                                                   |
> | binding:vif_details   | {"port_filter": true}                                                                |
> | binding:vif_type      | bridge                                                                               |
> | binding:vnic_type     | normal                                                                               |
> | created_at            | 2016-12-05T10:58:01Z                                                                 |
> | description           |                                                                                      |
> | device_id             | a8a10308-d62f-420f-99cf-f3727ef2b784                                                 |
> | device_owner          | network:router_ha_interface                                                          |
> | extra_dhcp_opts       |                                                                                      |
> | fixed_ips             | {"subnet_id": "6495d542-4b78-40df-84af-31500aaa0bf8", "ip_address": "169.254.192.5"} |
> | id                    | 8cd027f1-9f8c-4077-9c8a-92abc62fadd4                                                 |
> | mac_address           | fa:16:3e:58:a1:a4                                                                    |
> | name                  | HA port tenant e0ffdeb1e910469d9e625b95f2fa6c54                                      |
> | network_id            | 2b04fc3a-5c0d-4f55-996f-8888d8bd1e1d                                                 |
> | port_security_enabled | False                                                                                |
> | project_id            |                                                                                      |
> | revision_number       | 23                                                                                   |
> | security_groups       |                                                                                      |
> | status                | ACTIVE                                                                               |
> | tenant_id             |                                                                                      |
> | updated_at            | 2016-12-06T10:18:00Z                                                                 |
> +-----------------------+--------------------------------------------------------------------------------------+
> root@management-1-utility-container-f1222d05:~# neutron port-show bda1f324-3178-46e5-8638-0f454ba09cab
> +-----------------------+--------------------------------------------------------------------------------------+
> | Field                 | Value                                                                                |
> +-----------------------+--------------------------------------------------------------------------------------+
> | admin_state_up        | True                                                                                 |
> | allowed_address_pairs |                                                                                      |
> | binding:host_id       | network-2-neutron-agents-container-40906bfc                                          |
> | binding:profile       | {}                                                                                   |
> | binding:vif_details   | {"port_filter": true}                                                                |
> | binding:vif_type      | bridge                                                                               |
> | binding:vnic_type     | normal                                                                               |
> | created_at            | 2016-12-05T10:58:01Z                                                                 |
> | description           |                                                                                      |
> | device_id             | a8a10308-d62f-420f-99cf-f3727ef2b784                                                 |
> | device_owner          | network:router_ha_interface                                                          |
> | extra_dhcp_opts       |                                                                                      |
> | fixed_ips             | {"subnet_id": "6495d542-4b78-40df-84af-31500aaa0bf8", "ip_address": "169.254.192.1"} |
> | id                    | bda1f324-3178-46e5-8638-0f454ba09cab                                                 |
> | mac_address           | fa:16:3e:c3:8a:14                                                                    |
> | name                  | HA port tenant e0ffdeb1e910469d9e625b95f2fa6c54                                      |
> | network_id            | 2b04fc3a-5c0d-4f55-996f-8888d8bd1e1d                                                 |
> | port_security_enabled | False                                                                                |
> | project_id            |                                                                                      |
> | revision_number       | 15                                                                                   |
> | security_groups       |                                                                                      |
> | status                | ACTIVE                                                                               |
> | tenant_id             |                                                                                      |
> | updated_at            | 2016-12-05T14:35:16Z                                                                 |
> +-----------------------+--------------------------------------------------------------------------------------+
>
>
>
> On 06/12/16 12:53, Kevin Benton wrote:
>
> Can you do a 'neutron port-show' for both of those HA ports to check their
> status field?
>
> On Tue, Dec 6, 2016 at 2:29 AM, Grant Morley <grant at absolutedevops.io>
> wrote:
>
>> Hi Kevin & Neil,
>>
>> Many thanks for the reply. I have attached a screenshot showing that we
>> cannot ping between the L3 HA nodes in the router namespaces. This was
>> previously working fine with Mitaka, and has only stopped working since the
>> upgrade to Newton.
>>
>> From the packet captures and TCP dumps, the traffic doesn't seem to be
>> even leaving the namespace.
>>
>> In the attachment, the left-hand side shows the state of keepalived,
>> with both HA agents as master, and the right-hand side is the ping attempt.
>>
>> Regards,
>> On 06/12/16 10:14, Kevin Benton wrote:
>>
>> Yes, that is a misleading warning. The agent first tries to load the
>> interface driver as an alias, which produces the stevedore warning you
>> see, and then falls back to loading it by the class path, which is what
>> you have configured. We will need to see if there is a way to suppress
>> that warning when the load by alias fails.
>>
>> If you switch your interface driver to just 'linuxbridge', that should get
>> rid of the warning.
>>
>>
>> For both L3 HA nodes becoming master, we need a little more info to
>> figure out the root cause. Can you try switching into the router namespace
>> on one of the L3 HA nodes and see if you can ping the other router instance
>> across the L3 HA network for that router?
>>
>> On Mon, Dec 5, 2016 at 7:54 AM, Neil Jerram <neil at tigera.io> wrote:
>>
>>> I have also recently been seeing 'Could not load
>>> <whatever>InterfaceDriver' warnings from the DHCP agent, and haven't yet
>>> understood that - although I'm pretty sure that my interface driver is
>>> being loaded really - or else none of my networking function would work at
>>> all.
>>>
>>> So it's possible that that part of your report is benign, and just a
>>> misleading warning.  That said, I am still worried about it too, and would
>>> like to understand it properly.
>>>
>>> I'm not aware of seeing the other symptoms you mentioned.
>>>
>>>      Neil
>>>
>>>
>>> On Mon, Dec 5, 2016 at 3:14 PM Grant Morley <grant at absolutedevops.io> wrote:
>>>
>>>> Hi All,
>>>>
>>>> We have just upgraded from Mitaka to Newton. We are running OSA and we
>>>> seem to have come across some weird networking issues since the upgrade.
>>>> Basically network access to instances is very intermittent and seems to
>>>> randomly stop working.
>>>>
>>>> We are running neutron in HA and it appears that both of the neutron
>>>> nodes are now trying to be master and are both trying to bring up the
>>>> gateway IP addresses which would be causing conflicts.
>>>>
>>>> We are also seeing a lot of the following in the "neutron-dhcp-agent"
>>>> log files:
>>>>
>>>> 2016-12-05 14:42:24.837 2020 WARNING stevedore.named
>>>> [req-1955d0a1-1453-4c65-a93a-54e8ea39b230
>>>> 1ac995c0729142289f7237222f335806 3cc95dbe91c84e3e8ebbb9893ee54d20 - -
>>>> -] Could not load neutron.agent.linux.interface.BridgeInterfaceDriver
>>>> 2016-12-05 14:42:42.803 2020 INFO neutron.agent.dhcp.agent
>>>> [req-fad7d2bb-9d3c-4192-868a-0164b382aecf
>>>> 1ac995c0729142289f7237222f335806 3cc95dbe91c84e3e8ebbb9893ee54d20 - -
>>>> -] Trigger reload_allocations for port admin_state_up=True,
>>>> allowed_address_pairs=[], binding:host_id=, binding:profile=,
>>>> binding:vif_details=, binding:vif_type=unbound, binding:vnic_type=normal,
>>>> created_at=2016-12-05T14:42:42Z, description=,
>>>> device_id=8752effa-2ff2-4ce1-be70-e9f2243612cb,
>>>> device_owner=network:floatingip, extra_dhcp_opts=[],
>>>> fixed_ips=[{u'subnet_id': u'4ca7db2d-544a-4a97-b5a4-3cbf2467a4b7',
>>>> u'ip_address': u'XXX.XXX.XXX.XXX'}], id=b3cf223d-8e76-484a-a649-d8a7dd435124,
>>>> mac_address=fa:16:3e:ff:0d:50, name=, network_id=af5db886-0178-4f8d-9189-f55f773b37fa,
>>>> port_security_enabled=False, project_id=, revision_number=4,
>>>> security_groups=[], status=N/A, tenant_id=, updated_at=2016-12-05T14:42:42Z
>>>>
>>>> I am a bit concerned about neutron not being able to load the Bridge
>>>> interface driver.
>>>>
>>>> Has anyone else come across this at all or have any pointers? This was
>>>> working fine in Mitaka; we have only had these issues since the upgrade
>>>> to Newton.
>>>>
>>>> I am able to provide more logs if they are needed.
>>>>
>>>> Regards,
>>>> --
>>>> Grant Morley
>>>> Cloud Lead
>>>> Absolute DevOps Ltd
>>>> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
>>>> www.absolutedevops.io
>>>> grant at absolutedevops.io 0845 874 0580
>>>> _______________________________________________
>>>> OpenStack-operators mailing list
>>>> OpenStack-operators at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>
>>>
>>>
>>>
>>
>> --
>> Grant Morley
>> Cloud Lead
>> Absolute DevOps Ltd
>> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
>> www.absolutedevops.io
>> grant at absolutedevops.io 0845 874 0580
>>
>
>
> --
> Grant Morley
> Cloud Lead
> Absolute DevOps Ltd
> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
> www.absolutedevops.io
> grant at absolutedevops.io 0845 874 0580
>

