[Openstack] DHCP problem in grizzly

Édouard Thuleau thuleau at gmail.com
Wed Aug 7 14:46:19 UTC 2013


I think Sylvain and I have found a problem that could explain this
issue:

When the load on the DHCP agent is too heavy (updating the dnsmasq host
file and sending lease updates), its state report to the Neutron server is
delayed. The Neutron server then considers the agent down and doesn't send
it the port creation notification, so the dnsmasq host file isn't updated
to serve that port's IP.
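
To illustrate the failure mode, here is a minimal Python sketch of a
heartbeat timeout check. It is not the actual Neutron code: the function
name `is_agent_down` and the numeric values are assumptions chosen only for
illustration. The delay used is the one from the loopingcall warning quoted
in this message.

```python
from datetime import datetime, timedelta


def is_agent_down(last_heartbeat, now, agent_down_time):
    """Return True if the last state report is older than agent_down_time seconds."""
    return (now - last_heartbeat) > timedelta(seconds=agent_down_time)


# Hypothetical values for illustration, not taken from a real deployment.
report_interval = 4   # seconds between state reports on the agent
agent_down_time = 5   # seconds before the server treats the agent as down

last_report = datetime(2013, 8, 7, 13, 21, 40)

# Case 1: the next report is due exactly one interval later.
on_time = last_report + timedelta(seconds=report_interval)
# Case 2: the report task outlasted its interval by 2.375859 sec.
delayed = on_time + timedelta(seconds=2.375859)

print(is_agent_down(last_report, on_time, agent_down_time))   # False
print(is_agent_down(last_report, delayed, agent_down_time))   # True
```

With these assumed values, a delay of a couple of seconds is enough for the
server to see the agent as down at the moment a port is created.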

Do you see this message in the agent log file?
2013-08-07 13:21:46  WARNING [quantum.openstack.common.loopingcall] task
run outlasted interval by 2.375859 sec
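
A quick way to check how often and by how much the report task is late is to
extract the delays from the agent log. The sketch below assumes the warning
appears on a single line in the real log file (it is wrapped here only by the
mail client); the helper name `outlasted_delays` is hypothetical, and the
second sample line is made-up filler.

```python
import re

# Matches the delay in the loopingcall warning shown above.
OUTLASTED_RE = re.compile(r"task run outlasted interval by (\d+(?:\.\d+)?) sec")


def outlasted_delays(lines):
    """Return the delay in seconds from each matching log line."""
    delays = []
    for line in lines:
        match = OUTLASTED_RE.search(line)
        if match:
            delays.append(float(match.group(1)))
    return delays


sample = [
    "2013-08-07 13:21:46  WARNING [quantum.openstack.common.loopingcall] "
    "task run outlasted interval by 2.375859 sec",
    "an unrelated line that should be ignored",  # made-up filler
]
print(outlasted_delays(sample))  # [2.375859]
```

The function accepts any iterable of lines, so an open file object for the
agent log can be passed directly; the log path depends on your installation.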

You can increase the 'report_interval' option on the agent and the
'agent_down_time' option on the Neutron server side.
This problem should be fixed by this blueprint:
https://blueprints.launchpad.net/neutron/+spec/remove-dhcp-lease
Meanwhile, I think we should add a warning log in the Neutron server code
to flag when it cannot notify any DHCP agent of a port creation, and
backport that to the Grizzly release.

What do you think?

I added this comment to the bug:
https://bugs.launchpad.net/neutron/+bug/1185916

Édouard.


On Fri, Aug 2, 2013 at 11:45 AM, Chu Duc Minh <chu.ducminh at gmail.com> wrote:

> After I deleted 2 instances (10.2.1.10 & 10.2.1.12),
> the dnsmasq hosts file is:
> fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
> fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
> *fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12* *<-- still exists,
> problem?!*
>
> fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
>
>
> BR,
>
>
> On Fri, Aug 2, 2013 at 4:27 PM, Chu Duc Minh <chu.ducminh at gmail.com> wrote:
>
>> Hi, I have the same problem when I create -> terminate -> create instances.
>> The problem only occurs when the new instances get the same IPs as the
>> deleted instances.
>>
>> I checked the dnsmasq hosts file,
>> /var/lib/quantum/dhcp/dbc59888-e2be-4b31-b579-0a4575159bb1/host,
>> and sometimes it is not updated.
>>
>> I think this problem may not be related only to dnsmasq; it may also be
>> related to the firewall rules (generated by Quantum) on the compute node,
>> because I see some dropped DHCP packets:
>> Aug  2 14:08:11 thor-compute-03 kernel: [95971.005423] IN=qbr23c67719-14
>> OUT=qbr23c67719-14 PHYSIN=qvb23c67719-14 PHYSOUT=tap23c67719-14
>> MAC=ff:ff:ff:ff:ff:ff:fa:16:3e:34:72:05:08:00 SRC=0.0.0.0
>> DST=255.255.255.255 LEN=328 TOS=0x10 PREC=0x00 TTL=128 ID=0 *PROTO=UDP
>> SPT=68 DPT=67* LEN=308
>> (DHCP Discover packet?)
>> It was dropped in the chain quantum-openvswi-sg-fallback, so the instance
>> can't get an IP, although in the Dashboard I see that the instance got one.
>>
>> I tried many times and hit a strange case: a duplicate IP in the dnsmasq
>> hosts file:
>> fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
>> fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
>> *fa:16:3e:78:b5:2f,10-2-1-10.openstacklocal,10.2.1.10*
>> fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
>> fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12
>> *fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10*
>>
>> My newest instance is *10.2.1.10*, and I can't ping it. In the boot log of
>> this instance, I found:
>>
>> cloudinitnonet waiting 120 seconds for a network device.
>> cloudinitnonet gave up waiting for a network device.
>> ciinfo: lo    : 1 127.0.0.1       255.0.0.0       .
>> ciinfo: eth0  : 1 .               .               fa:16:3e:c7:ea:0c
>> route_info failed
>>
>> Restarting the instance didn't make it work, but restarting
>> quantum-dhcp-agent on the Quantum node did.
>> After the restart, the content of the dnsmasq hosts file is:
>> fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
>> fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
>> fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12
>> fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
>> *fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10*
>>
>> I think it's a serious problem; I hope someone can fix it soon. :)
>>
>> Best Regards,
>>
>>
>> On Tue, Jul 2, 2013 at 8:01 PM, James Page <james.page at ubuntu.com> wrote:
>>
>>> On 20/05/13 07:51, Heinonen, Johanna (NSN - FI/Espoo) wrote:
>>>
>>>> Hi,
>>>> I have installed grizzly with quantum and ovs-plugin. It seems that
>>>> grizzly allocates the third address of each subnet for dhcp. (In folsom
>>>> it was the second address). This means that the VMs will get addresses
>>>>
>>>
>>> This sounds a lot like
>>> https://bugs.launchpad.net/ubuntu/+source/quantum/+bug/1189909;
>>> I'll raise a task for dnsmasq as well.
>>>
>>> Cheers
>>>
>>> James
>>>
>>> --
>>> James Page
>>> Ubuntu Core Developer
>>> Debian Maintainer
>>> james.page at ubuntu.com
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~openstack
>>> Post to     : openstack at lists.launchpad.net
>>> Unsubscribe : https://launchpad.net/~openstack
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>
>>
>