[Openstack] dnsmasq and IPv6 causing high load on network nodes, with many processes fluctuating between D and S states

Sridhar Gaddam sgaddam at redhat.com
Wed Nov 18 15:34:06 UTC 2015



On 11/18/2015 08:28 PM, Brian Haley wrote:
> On 11/18/2015 07:33 AM, kevin parrikar wrote:
>> I am trying Juno *without* DVR, and there are around 500 subnets on
>> each network node.
>>
>> The network nodes have 500+ dnsmasq processes, which are configured to
>> give out IPv6 and IPv4 addresses. Some of these fluctuate between the
>> D and S states, while others stay continuously in the S state. The
>> load average is very high, probably because too many processes are in
>> the D state with "rtnetlink_rcv" in "wchan".
>>
>>
>> Functions on the stack while in D state:
>>
>> [<ffffffff816360a9>] rtnetlink_rcv+0x19/0x30
>> [<ffffffff81653e25>] netlink_unicast+0xd5/0x1b0
>> [<ffffffff8165420e>] netlink_sendmsg+0x30e/0x680
>> [<ffffffff8160e32b>] sock_sendmsg+0x8b/0xc0
>> [<ffffffff8160e871>] SYSC_sendto+0x121/0x1c0
>> [<ffffffff8160f25e>] SyS_sendto+0xe/0x10
>> [<ffffffff817342dd>] system_call_fastpath+0x1a/0x1f
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> But within a couple of seconds it changes to S state with the
>> following on the stack; all idle processes have this on their stack.
>>
>> [<ffffffff811d20d9>] poll_schedule_timeout+0x49/0x70
>> [<ffffffff811d2ac6>] do_select+0x5b6/0x780
>> [<ffffffff811d2e5c>] core_sys_select+0x1cc/0x2e0
>> [<ffffffff811d301b>] SyS_select+0xab/0x100
>> [<ffffffff817342dd>] system_call_fastpath+0x1a/0x1f
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> The interface on which dnsmasq is listening shows "tentative
>> dadfailed" on its "scope global" IPv6 address for all the dnsmasq
>> processes that are in D state.
>>
>> I don't understand why the duplicate IPv6 address issue occurs when
>> there are VLAN tags separating each subnet. For IPv4, no duplicate IP
>> issue shows up on the same subnet.
>
> DAD failure is fatal for the address, as it won't be tried again 
> without bouncing the interface or tweaking the disable_ipv6 sysctl on 
> the interface. The "tentative" flag is just showing it is still in 
> that state (i.e. waiting for DAD to complete).  Figuring out why this 
> is happening will most likely solve your problem, but there isn't 
> enough here to go on.
>
> You don't see an IPv4 'duplicate' message because IPv4 typically 
> doesn't perform that check, so you could have a duplicate there and 
> just not know it.
>
> I'd say to try Kilo or Liberty, but it's unclear if this is a Neutron 
> issue or something more basic with your setup.
>
Agree with Brian.
Is the problem seen even with just one (or, say, two) IPv6 subnets? Or 
is it seen only at scale?
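
To help compare the single-subnet and at-scale cases, here is a minimal
Python sketch that lists which interfaces carry addresses flagged
"dadfailed". It only parses text in the one-line format printed by
"ip -6 -o addr show". The function name find_dadfailed, the SAMPLE text,
and the tap interface name and addresses in it are all made up for
illustration and are not taken from the affected nodes.

```python
from collections import defaultdict


def find_dadfailed(ip_output):
    """Map interface name -> list of IPv6 addresses flagged dadfailed.

    Expects text in the one-line-per-address format of
    `ip -6 -o addr show`.
    """
    failed = defaultdict(list)
    for line in ip_output.splitlines():
        tokens = line.split()
        # Skip lines that are not IPv6 addresses or did not fail DAD.
        if "inet6" not in tokens or "dadfailed" not in tokens:
            continue
        ifname = tokens[1].rstrip(":")
        addr = tokens[tokens.index("inet6") + 1]
        failed[ifname].append(addr)
    return dict(failed)


# Hypothetical sample output; interface name and addresses are invented.
SAMPLE = r"""
12: tap0a1b2c3d-4e    inet6 2001:db8:1::2/64 scope global tentative dadfailed \       valid_lft forever preferred_lft forever
12: tap0a1b2c3d-4e    inet6 fe80::f816:3eff:fe00:1/64 scope link \       valid_lft forever preferred_lft forever
"""

for ifname, addrs in find_dadfailed(SAMPLE).items():
    print(ifname, ", ".join(addrs))
```

On a network node, the input text would presumably come from running
"ip -6 -o addr show" inside each DHCP namespace (assuming the usual
qdhcp-<network-id> naming), so the count of affected interfaces can be
compared as subnets are added.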
> -Brian
>
>
>> I am not able to understand why the dnsmasq processes attached to
>> interfaces that have "tentative dadfailed" continuously change
>> between S and D state while the others are idle.
>>
>> Attached is the strace output taken while dnsmasq is in D state.
>>
>> Any idea about this issue?
>>
>>
>> _______________________________________________
>> Mailing list: 
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to     : openstack at lists.openstack.org
>> Unsubscribe : 
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>
>
>