[neutron] Dadfailed of ipv6 metadata IP in qdhcp namespace and disappearing dhcp namespaces

Kamil Madáč kamil.madac at slovenskoit.sk
Mon Jan 3 08:58:42 UTC 2022


Hi Brian,

Thank you very much for pointing to those bugs; that is exactly what we are experiencing in our deployment. I will follow up in those bugs then.

Kamil
________________________________
From: Brian Haley <haleyb.dev at gmail.com>
Sent: Monday, January 3, 2022 2:35 AM
To: Kamil Madáč <kamil.madac at slovenskoit.sk>; openstack-discuss <openstack-discuss at lists.openstack.org>
Subject: Re: [neutron] Dadfailed of ipv6 metadata IP in qdhcp namespace and disappearing dhcp namespaces

Hi,

On 1/2/22 10:51 AM, Kamil Madáč wrote:
> Hello,
>
> In our small cloud environment, we started to see weird behavior during
> the last 2 months. DHCP namespaces started to disappear randomly, which
> caused VMs to lose connectivity once their DHCP lease expired.
> After investigating, I found the following issue/bug:
>
>  1. The IPv6 metadata address of the tap interface in some qdhcp-xxxx namespaces
>     is stuck in the "dadfailed tentative" state (I do not know why yet; a
>     quick check for affected namespaces is shown after this list)

This issue was reported about a month ago:

https://bugs.launchpad.net/neutron/+bug/1953165

And Bence marked it a duplicate of:

https://bugs.launchpad.net/neutron/+bug/1930414

Based on the title - "Traffic leaked from dhcp port before vlan tag is
applied" - it seems to be a bug in a flow.

I would follow up in that second bug.
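
One thing you could also check on an affected network is whether the
dhcp tap port currently has its vlan tag in OVS - the tap name below is
taken from your output further down, so adjust it to the port you are
looking at. Note the leak in that bug happens in the window before the
tag is set, so a populated tag now does not rule it out:

  # show the local vlan tag on the dhcp port in br-int ([] means no tag)
  ovs-vsctl get Port tap1797d9b1-e1 tag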

Thanks,

-Brian

>  3. root@cloud01:~# ip netns exec
>     qdhcp-3094b264-829b-4381-9ca2-59b3a3fc1ea1 ip a
>     1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>     group default qlen 1000
>          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>          inet 127.0.0.1/8 scope host lo
>             valid_lft forever preferred_lft forever
>          inet6 ::1/128 scope host
>             valid_lft forever preferred_lft forever
>     2585: tap1797d9b1-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
>     qdisc noqueue state UNKNOWN group default qlen 1000
>          link/ether fa:16:3e:77:64:0d brd ff:ff:ff:ff:ff:ff
>          inet 169.254.169.254/32 brd 169.254.169.254 scope global
>     tap1797d9b1-e1
>             valid_lft forever preferred_lft forever
>          inet 192.168.0.2/24 brd 192.168.0.255 scope global tap1797d9b1-e1
>             valid_lft forever preferred_lft forever
>          inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
>             valid_lft forever preferred_lft forever
>          inet6 fe80::f816:3eff:fe77:640d/64 scope link
>             valid_lft forever preferred_lft forever
>  4.
>
>  5. This blocked the dhcp agent from finishing the sync_state function, and
>     the NetworkCache was not updated with the subnets of such a neutron network
>  6. During creation of a VM attached to such a network, the agent does not
>     detect any subnets (see point 2), so it thinks
>     (in reload_allocations()) that no DHCP is needed and deletes the
>     qdhcp-xxxx namespace; no DHCP and no metadata work on that
>     network from that moment, and after 24h we see connectivity issues.
>  7. A restart of the DHCP agent recreates the missing qdhcp-xxxx namespaces,
>     but the NetworkCache in the dhcp agent is again empty, so creating a VM
>     deletes the qdhcp-xxxx namespace again 🙁
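>
> A quick way to spot the affected namespaces (run on the network node,
> or inside the neutron_dhcp_agent container in our kolla deployment;
> this is just a rough loop) is:
>
>     for ns in $(ip netns list | awk '/^qdhcp-/{print $1}'); do
>         # flag any qdhcp namespace holding an address stuck in dadfailed
>         ip netns exec "$ns" ip -6 addr show | grep -q dadfailed && echo "$ns"
>     done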
>
> The workaround is to remove the dhcp agent from that network and add it
> again (see the commands below). Interestingly, sometimes I need to do it
> multiple times, because in a few cases the tap interface in the new qdhcp
> namespace ends up in the dadfailed tentative state again. After a year in
> production we have 20 networks out of 60 in such a state.
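>
> For reference, the remove/re-add cycle is just the standard agent
> scheduling commands (the agent and network IDs below are placeholders):
>
>     openstack network agent list --agent-type dhcp
>     openstack network agent remove network --dhcp <dhcp-agent-id> <network-id>
>     openstack network agent add network --dhcp <dhcp-agent-id> <network-id>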
>
> We are using a kolla-ansible deployment on Ubuntu 20.04, kernel
> 5.4.0-65-generic. The OpenStack version is Victoria and neutron is at
> version 17.2.2.dev70.
>
>
> Is that a bug in neutron, or is it a misconfiguration of the OS on our side?
>
> I'm locally testing a patch which disables IPv6 DAD in the qdhcp-xxxx
> namespace (net.ipv6.conf.default.accept_dad = 1), but I'm not sure it is a
> good solution when it comes to other neutron features.
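>
> For the record, this is how I check what a namespace actually ended up
> with (per the kernel ip-sysctl documentation, accept_dad=0 disables DAD,
> 1 is the default, and 2 additionally shuts down IPv6 after a failed DAD
> on the MAC-based link-local address); the network ID is a placeholder:
>
>     ip netns exec qdhcp-<network-id> sysctl net.ipv6.conf.all.accept_dad
>     ip netns exec qdhcp-<network-id> sysctl net.ipv6.conf.default.accept_dad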
>
>
> Kamil Madáč
> /Slovensko IT a.s./
>