[Openstack] dnsmasq and ipv6 causing high load on network nodes with many fluctuates between D and S states.
Sridhar Gaddam
sgaddam at redhat.com
Wed Nov 18 15:34:06 UTC 2015
On 11/18/2015 08:28 PM, Brian Haley wrote:
> On 11/18/2015 07:33 AM, kevin parrikar wrote:
>> I am trying Juno *without* DVR, and there are around 500 subnets on
>> each network node.
>>
>> The network nodes have 500+ dnsmasq processes, which are configured to
>> hand out IPv6 and IPv4 addresses. Some of these fluctuate between the
>> D and S states, while others stay in the S state continuously.
>> Probably because too many processes are in the D state with
>> "rtnetlink_rcv" in "wchan", the nodes show a very high load average.
>>
>>
>> Functions on the stack while in the D state:
>>
>> [<ffffffff816360a9>] rtnetlink_rcv+0x19/0x30
>> [<ffffffff81653e25>] netlink_unicast+0xd5/0x1b0
>> [<ffffffff8165420e>] netlink_sendmsg+0x30e/0x680
>> [<ffffffff8160e32b>] sock_sendmsg+0x8b/0xc0
>> [<ffffffff8160e871>] SYSC_sendto+0x121/0x1c0
>> [<ffffffff8160f25e>] SyS_sendto+0xe/0x10
>> [<ffffffff817342dd>] system_call_fastpath+0x1a/0x1f
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> But within a couple of seconds it changes to the S state with the
>> following on the stack; all idle processes show this same stack.
>>
>> [<ffffffff811d20d9>] poll_schedule_timeout+0x49/0x70
>> [<ffffffff811d2ac6>] do_select+0x5b6/0x780
>> [<ffffffff811d2e5c>] core_sys_select+0x1cc/0x2e0
>> [<ffffffff811d301b>] SyS_select+0xab/0x100
>> [<ffffffff817342dd>] system_call_fastpath+0x1a/0x1f
>> [<ffffffffffffffff>] 0xffffffffffffffff
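
To put numbers on how many dnsmasq processes are in each state, and what they are waiting on, something like the following could be run on a network node. It is only a rough sketch, not taken from the original report: the helper names `parse_stat` and `dnsmasq_states` are made up for illustration, and it assumes the usual Linux `/proc/<pid>/stat` and `/proc/<pid>/wchan` files are readable.

```python
import glob


def parse_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren].strip())
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def dnsmasq_states(proc_root="/proc"):
    """Return {pid: (state, wchan)} for every process named dnsmasq."""
    result = {}
    for stat_path in glob.glob(proc_root + "/[0-9]*/stat"):
        try:
            with open(stat_path) as f:
                pid, comm, state = parse_stat(f.read())
            if comm != "dnsmasq":
                continue
            # Sibling file of stat: the kernel symbol the task sleeps in.
            with open(stat_path[:-len("stat")] + "wchan") as f:
                wchan = f.read().strip()
        except (IOError, OSError):
            continue  # process exited while we were reading it
        result[pid] = (state, wchan)
    return result
```

Feeding the result into `collections.Counter(dnsmasq_states().values())` gives a count per (state, wchan) pair, which makes it easy to see how many processes are blocked in "rtnetlink_rcv" at a given moment.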
>>
>> The interface on which dnsmasq is listening shows "tentative
>> dadfailed" on its "scope global" IPv6 address for every dnsmasq
>> process that is in the D state.
>>
>> I don't understand why a duplicate IPv6 address is being detected
>> when VLAN tags separate each subnet. For IPv4, no duplicate IP issue
>> is reported for the same subnet.
>
> DAD failure is fatal for the address, as it won't be tried again
> without bouncing the interface or tweaking the disable_ipv6 sysctl on
> the interface. The "tentative" flag is just showing it is still in
> that state (i.e. waiting for DAD to complete). Figuring out why this
> is happening will most likely solve your problem, but there isn't
> enough here to go on.
>
> You don't see an IPv4 'duplicate' message because IPv4 typically
> doesn't perform duplicate address detection at all, so you could have
> a duplicate there and just not know it.
>
> I'd say to try Kilo or Liberty, but it's unclear if this is a Neutron
> issue or something more basic with your setup.
>
Agree with Brian.
Is the problem seen even with a single (or say two) IPv6 subnets? Or is
it seen only at scale?
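
To help answer that, here is a minimal sketch for listing which interfaces carry a "dadfailed" address, by parsing the text output of `ip -6 addr show`. The function names `find_dadfailed` and `scan_namespace` are hypothetical, and the parsing assumes the usual iproute2 layout (an "N: name:" header line followed by indented "inet6" lines). Since Neutron runs each dnsmasq inside a qdhcp-<network-id> namespace, the command would need to be run once per namespace.

```python
import re
import subprocess

# Header lines look like "12: tap0@if3: <BROADCAST,...> mtu 1500 ...".
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)")


def find_dadfailed(ip_output):
    """Return (interface, address) pairs whose inet6 line has dadfailed."""
    failed = []
    iface = None
    for line in ip_output.splitlines():
        match = IFACE_RE.match(line)
        if match:
            iface = match.group(1)
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "inet6" and "dadfailed" in fields:
            failed.append((iface, fields[1]))
    return failed


def scan_namespace(namespace=None):
    """Run `ip -6 addr show`, optionally inside a network namespace."""
    cmd = ["ip", "-6", "addr", "show"]
    if namespace:
        cmd = ["ip", "netns", "exec", namespace] + cmd
    output = subprocess.check_output(cmd, universal_newlines=True)
    return find_dadfailed(output)
```

Running `scan_namespace()` over the qdhcp namespaces with one or two IPv6 subnets, and then again at full scale, would show whether the DAD failures appear only as the subnet count grows.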
> -Brian
>
>
>> I cannot understand why the dnsmasq processes attached to interfaces
>> that have "tentative dadfailed" continuously change between the S and
>> D states while the others are idle.
>>
>> Attached is the strace taken while dnsmasq is in the D state.
>>
>> Any idea about this issue?
>>
>>
>> _______________________________________________
>> Mailing list:
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to : openstack at lists.openstack.org
>> Unsubscribe :
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>