[Openstack-operators] Mitaka to Newton networking issues
Grant Morley
grant at absolutedevops.io
Thu Dec 8 09:50:39 UTC 2016
Hi Kevin,
Sorry for the late reply. We tried that and were still seeing the same
issues, so I don't think that bug is quite the same as what we were
seeing.
Unfortunately we have had to roll back to Mitaka as we had a tight
deadline and not being able to create networks / have HA was fairly
critical. Interestingly, now we are back on Mitaka, everything is
working fine.
I will try to get a testing environment set up to see whether I get the
same results as we saw when we upgraded from Mitaka to Newton. I am not
sure whether it is something to do with our specific setup, but we have
followed the OSA guidelines, and as everything was working on Liberty
and Mitaka I assume we have it all set up correctly.
I will keep you posted on our findings, as we may be onto another bug.
Regards,
On 06/12/16 14:07, Kevin Benton wrote:
> There was a bug, for which the fixes merged only recently, where
> removing a router on the L3 agent was done in the wrong order, which
> resulted in issues cleaning up the interfaces with Linux Bridge + L3HA.
> https://bugs.launchpad.net/neutron/+bug/1629159
>
> It could be the case that there is an orphaned veth pair in a deleted
> namespace from the same router when it was removed from the L3 agent.
>
> For each L3 agent, can you shut down the L3 agent, run the netns
> cleanup script, ensure all keepalived processes are dead, and then
> start the agent again?
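
The sequence suggested above can be sketched as an ordered list of commands. This is only an illustration under assumptions: the service name `neutron-l3-agent`, the use of `service` to stop and start it, and the two config file paths are guesses about a typical deployment and are not taken from this thread. The script defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumed service name and config paths; adjust them for your deployment.
L3_SERVICE = "neutron-l3-agent"
CONFIG_FILES = ["/etc/neutron/neutron.conf", "/etc/neutron/l3_agent.ini"]


def cleanup_steps():
    """Return the commands in the order described in the message above."""
    netns_cleanup = ["neutron-netns-cleanup"]
    for path in CONFIG_FILES:
        netns_cleanup += ["--config-file", path]
    return [
        ["service", L3_SERVICE, "stop"],
        netns_cleanup,
        # pgrep exits non-zero when nothing matches, which is the state
        # we want to see before the agent is started again.
        ["pgrep", "-x", "keepalived"],
        ["service", L3_SERVICE, "start"],
    ]


def run(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for argv in steps:
        print(" ".join(argv))
        if dry_run:
            continue
        result = subprocess.run(argv)
        if argv[0] == "pgrep":
            if result.returncode == 0:
                raise RuntimeError("keepalived is still running")
        elif result.returncode != 0:
            raise RuntimeError("command failed: " + " ".join(argv))


if __name__ == "__main__":
    run(cleanup_steps())
```

Stopping at the first failure keeps the agent from being restarted while keepalived processes are still alive. Whether the cleanup script needs further options depends on the release, so check `neutron-netns-cleanup --help` before running it for real.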
>
> On Tue, Dec 6, 2016 at 4:59 AM, Grant Morley <grant at absolutedevops.io> wrote:
>
> They both appear to be "ACTIVE" which is what I would expect:
>
> root@management-1-utility-container-f1222d05:~# neutron port-show 8cd027f1-9f8c-4077-9c8a-92abc62fadd4
> +-----------------------+--------------------------------------------------------------------------------------+
> | Field | Value |
> +-----------------------+--------------------------------------------------------------------------------------+
> | admin_state_up | True |
> | allowed_address_pairs | |
> | binding:host_id | network-1-neutron-agents-container-11d47568 |
> | binding:profile | {} |
> | binding:vif_details | {"port_filter": true} |
> | binding:vif_type | bridge |
> | binding:vnic_type | normal |
> | created_at | 2016-12-05T10:58:01Z |
> | description | |
> | device_id | a8a10308-d62f-420f-99cf-f3727ef2b784 |
> | device_owner | network:router_ha_interface |
> | extra_dhcp_opts | |
> | fixed_ips | {"subnet_id": "6495d542-4b78-40df-84af-31500aaa0bf8", "ip_address": "169.254.192.5"} |
> | id | 8cd027f1-9f8c-4077-9c8a-92abc62fadd4 |
> | mac_address | fa:16:3e:58:a1:a4 |
> | name | HA port tenant e0ffdeb1e910469d9e625b95f2fa6c54 |
> | network_id | 2b04fc3a-5c0d-4f55-996f-8888d8bd1e1d |
> | port_security_enabled | False |
> | project_id | |
> | revision_number | 23 |
> | security_groups | |
> | status | ACTIVE |
> | tenant_id | |
> | updated_at | 2016-12-06T10:18:00Z |
> +-----------------------+--------------------------------------------------------------------------------------+
> root@management-1-utility-container-f1222d05:~# neutron port-show bda1f324-3178-46e5-8638-0f454ba09cab
> +-----------------------+--------------------------------------------------------------------------------------+
> | Field | Value |
> +-----------------------+--------------------------------------------------------------------------------------+
> | admin_state_up | True |
> | allowed_address_pairs | |
> | binding:host_id | network-2-neutron-agents-container-40906bfc |
> | binding:profile | {} |
> | binding:vif_details | {"port_filter": true} |
> | binding:vif_type | bridge |
> | binding:vnic_type | normal |
> | created_at | 2016-12-05T10:58:01Z |
> | description | |
> | device_id | a8a10308-d62f-420f-99cf-f3727ef2b784 |
> | device_owner | network:router_ha_interface |
> | extra_dhcp_opts | |
> | fixed_ips | {"subnet_id": "6495d542-4b78-40df-84af-31500aaa0bf8", "ip_address": "169.254.192.1"} |
> | id | bda1f324-3178-46e5-8638-0f454ba09cab |
> | mac_address | fa:16:3e:c3:8a:14 |
> | name | HA port tenant e0ffdeb1e910469d9e625b95f2fa6c54 |
> | network_id | 2b04fc3a-5c0d-4f55-996f-8888d8bd1e1d |
> | port_security_enabled | False |
> | project_id | |
> | revision_number | 15 |
> | security_groups | |
> | status | ACTIVE |
> | tenant_id | |
> | updated_at | 2016-12-05T14:35:16Z |
> +-----------------------+--------------------------------------------------------------------------------------+
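
Comparing two `port-show` tables by eye is error-prone, so a small parser can help. The following is a minimal sketch, not part of any Neutron tooling: the function name `parse_port_show` is hypothetical, and it assumes each table row sits on a single line (rows wrapped by a mail client need rejoining first).

```python
def parse_port_show(text):
    """Parse `neutron port-show` table output into a dict of field -> value."""
    fields = {}
    for raw in text.splitlines():
        # Drop mail quote markers and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skips table borders, shell prompts and blank lines
        name, _, value = line[1:-1].partition("|")
        name, value = name.strip(), value.strip()
        if name and name != "Field":
            fields[name] = value
    return fields


# Shortened sample in the same layout as the output above.
sample = """
> +--------+--------+
> | Field | Value |
> +--------+--------+
> | device_owner | network:router_ha_interface |
> | status | ACTIVE |
> +--------+--------+
"""

port = parse_port_show(sample)
print(port["status"])
```

As this thread shows, both HA ports reporting ACTIVE does not by itself prove that traffic passes between the router namespaces, so a check like this only rules out one possible cause.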
>
>
>
> On 06/12/16 12:53, Kevin Benton wrote:
>> Can you do a 'neutron port-show' for both of those HA ports to
>> check their status field?
>>
>> On Tue, Dec 6, 2016 at 2:29 AM, Grant Morley
>> <grant at absolutedevops.io> wrote:
>>
>> Hi Kevin & Neil,
>>
>> Many thanks for the reply. I have attached a screenshot
>> showing that we cannot ping between the L3 HA nodes in the
>> router namespaces. This was previously working fine with
>> Mitaka, and has only stopped working since the upgrade to Newton.
>>
>> From the packet captures and TCP dumps, the traffic doesn't
>> even seem to be leaving the namespace.
>>
>> In the attachment, the left-hand side shows the state of
>> keepalived, with both HA agents as master, and the right-hand
>> side is the ping attempt.
>>
>> Regards,
>>
>> On 06/12/16 10:14, Kevin Benton wrote:
>>> Yes, that is a misleading warning. What is happening is that
>>> it's trying to load the interface driver as an alias first,
>>> which results in a stevedore warning that you see and then
>>> it falls back to loading it by the class path, which is what
>>> you have configured. We will need to see if there is a way
>>> we can suppress that warning somehow when we make the call
>>> to load by an alias and it fails.
>>>
>>> If you switch your interface to just 'linuxbridge', that
>>> should get rid of the warning.
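
The alias-then-class-path fallback described above can be illustrated with a small standalone sketch. It is not Neutron's or stevedore's actual code: the `ALIASES` table, `import_class` and `load_driver` are hypothetical stand-ins, and the alias points at a standard-library class only so that the example runs anywhere.

```python
import importlib
import logging

LOG = logging.getLogger("driver_loader_sketch")

# Hypothetical alias table standing in for the entry points a plugin
# manager would consult. The target is a stdlib class for illustration.
ALIASES = {"linuxbridge": "collections.OrderedDict"}


def import_class(class_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ImportError("not an alias or a class path: " + class_path)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def load_driver(name):
    """Try name as an alias first, then fall back to a full class path."""
    if name in ALIASES:
        return import_class(ALIASES[name])
    # The alias lookup failed. A warning is logged here even though the
    # class-path fallback below may still succeed, which is why the
    # message can be misleading.
    LOG.warning("Could not load %s", name)
    return import_class(name)
```

In this sketch, `load_driver("linuxbridge")` resolves through the alias table without logging anything, while `load_driver("collections.OrderedDict")` logs the warning and then loads the same class through the fallback.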
>>>
>>>
>>> For both L3 HA nodes becoming master, we need a little more
>>> info to figure out the root cause. Can you try switching
>>> into the router namespace on one of the L3 HA nodes and see
>>> if you can ping the other router instance across the L3 HA
>>> network for that router?
>>>
>>> On Mon, Dec 5, 2016 at 7:54 AM, Neil Jerram <neil at tigera.io> wrote:
>>>
>>> I have also recently been seeing 'Could not load
>>> <whatever>InterfaceDriver' warnings from the DHCP agent,
>>> and haven't yet understood that - although I'm pretty
>>> sure that my interface driver is being loaded really -
>>> or else none of my networking function would work at all.
>>>
>>> So it's possible that that part of your report is
>>> benign, and just a misleading warning. That said, I am
>>> still worried about it too, and would like to understand
>>> it properly.
>>>
>>> I'm not aware of seeing the other symptoms you mentioned.
>>>
>>> Neil
>>>
>>>
>>> On Mon, Dec 5, 2016 at 3:14 PM Grant Morley
>>> <grant at absolutedevops.io> wrote:
>>>
>>> Hi All,
>>>
>>> We have just upgraded from Mitaka to Newton. We are
>>> running OSA and we seem to have come across some
>>> weird networking issues since the upgrade. Basically
>>> network access to instances is very intermittent and
>>> seems to randomly stop working.
>>>
>>> We are running neutron in HA and it appears that
>>> both of the neutron nodes are now trying to be
>>> master and are both trying to bring up the gateway
>>> IP addresses which would be causing conflicts.
>>>
>>> We are also seeing a lot of the following in the
>>> "neutron-dhcp-agent" log files:
>>>
>>> 2016-12-05 14:42:24.837 2020 WARNING stevedore.named
>>> [req-1955d0a1-1453-4c65-a93a-54e8ea39b230
>>> 1ac995c0729142289f7237222f335806
>>> 3cc95dbe91c84e3e8ebbb9893ee54d20 - - -] Could not
>>> load neutron.agent.linux.interface.BridgeInterfaceDriver
>>> 2016-12-05 14:42:42.803 2020 INFO
>>> neutron.agent.dhcp.agent
>>> [req-fad7d2bb-9d3c-4192-868a-0164b382aecf
>>> 1ac995c0729142289f7237222f335806
>>> 3cc95dbe91c84e3e8ebbb9893ee54d20 - - -] Trigger
>>> reload_allocations for port admin_state_up=True,
>>> allowed_address_pairs=[], binding:host_id=,
>>> binding:profile=, binding:vif_details=,
>>> binding:vif_type=unbound, binding:vnic_type=normal,
>>> created_at=2016-12-05T14:42:42Z, description=,
>>> device_id=8752effa-2ff2-4ce1-be70-e9f2243612cb,
>>> device_owner=network:floatingip, extra_dhcp_opts=[],
>>> fixed_ips=[{u'subnet_id':
>>> u'4ca7db2d-544a-4a97-b5a4-3cbf2467a4b7',
>>> u'ip_address': u'XXX.XXX.XXX.XXX'}],
>>> id=b3cf223d-8e76-484a-a649-d8a7dd435124,
>>> mac_address=fa:16:3e:ff:0d:50, name=,
>>> network_id=af5db886-0178-4f8d-9189-f55f773b37fa,
>>> port_security_enabled=False, project_id=,
>>> revision_number=4, security_groups=[], status=N/A,
>>> tenant_id=, updated_at=2016-12-05T14:42:42Z
>>>
>>> I am a bit concerned about neutron not being able to
>>> load the Bridge interface driver.
>>>
>>> Has anyone else come across this at all or have any
>>> pointers? This was working fine in Mitaka; we have only had
>>> these issues since the upgrade to Newton.
>>>
>>> I am able to provide more logs if they are needed.
>>>
>>> Regards,
>>>
>>> --
>>> Grant Morley
>>> Cloud Lead
>>> Absolute DevOps Ltd
>>> Units H, J & K, Gateway 1000, Whittle Way,
>>> Stevenage, Herts, SG1 2FP
>>> www.absolutedevops.io <http://www.absolutedevops.io>
>>> grant at absolutedevops.io
>>> <mailto:grant at absolutedevops.io> 0845 874 0580
>>> _______________________________________________
>>> OpenStack-operators mailing list
>>> OpenStack-operators at lists.openstack.org
>>> <mailto:OpenStack-operators at lists.openstack.org>
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>> <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators>
>>>
>>>
>>
>
--
Grant Morley
Cloud Lead
Absolute DevOps Ltd
Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
www.absolutedevops.io <http://www.absolutedevops.io/>
grant at absolutedevops.io <mailto:grant at absolutedevops.i> 0845 874 0580