<div dir="ltr">Sylvain and I think we have found a problem that could explain this trouble:<div><br></div><div><div>When the DHCP agent is under heavy load (updating the dnsmasq host file and sending lease updates), its state report to the Neutron server is delayed. The Neutron server then considers the agent down and doesn't send it the port creation notification, so the dnsmasq host file isn't updated to serve that port's IP.</div>
<div><br></div><div>Do you see this message in the agent log file?</div><div>2013-08-07 13:21:46 WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 2.375859 sec</div><div><br></div><div>If so, you can increase the 'report_interval' flag on the agent side and the 'agent_down_time' flag on the Neutron server side.</div>
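<div><br></div><div>To illustrate why a delayed report matters, here is a minimal Python sketch of the liveness check as described above: the server treats the agent as down once its last state report is older than 'agent_down_time'. This is a simplified model, not the actual Neutron code; the function name and the values of 4 and 5 seconds are assumptions for illustration, and only the 2.375859 sec overrun comes from the log line.</div>

```python
from datetime import datetime, timedelta


def is_agent_down(last_heartbeat, now, agent_down_time):
    """Return True when the last state report is older than agent_down_time seconds.

    Simplified model of the server-side check; not the actual Neutron code.
    """
    return now - last_heartbeat > timedelta(seconds=agent_down_time)


# Illustrative values only (assumptions, not taken from a real deployment).
report_interval = 4   # seconds between state reports sent by the agent
agent_down_time = 5   # seconds after which the server treats the agent as down
overrun = 2.375859    # delay reported by the loopingcall warning in the log

last_report = datetime(2013, 8, 7, 13, 21, 40)
next_report_arrives = last_report + timedelta(seconds=report_interval + overrun)

# Just before the delayed report arrives, the server already sees the agent as down.
print(is_agent_down(last_report, next_report_arrives, agent_down_time))
# With a larger agent_down_time the same delay is tolerated.
print(is_agent_down(last_report, next_report_arrives, 15))
```

<div>In this model, a port created during the window where the check returns True would not be sent to the agent.</div>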
<div>This problem should be fixed by this blueprint: <a href="https://blueprints.launchpad.net/neutron/+spec/remove-dhcp-lease" target="_blank">https://blueprints.launchpad.net/neutron/+spec/remove-dhcp-lease</a></div><div>
Meanwhile, I think we should add a warning log in the Neutron server code for the case where it cannot notify any DHCP agent of a port creation, and backport that to the Grizzly release.</div>
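<div><br></div><div>A rough sketch of what such a warning could look like, in plain Python. The function name, the 'host' and 'alive' keys, and the message text are all hypothetical; this is not the Neutron notifier code, only the shape of the check being proposed.</div>

```python
import logging

LOG = logging.getLogger(__name__)


def agents_to_notify(agents, network_id):
    """Return the agents considered alive; log a warning when there are none.

    `agents` is a list of dicts with hypothetical keys 'host' and 'alive'.
    """
    alive = [agent for agent in agents if agent["alive"]]
    if not alive:
        # Without this message the missed notification is silent on the server side.
        LOG.warning(
            "No active DHCP agent for network %s; port notification not sent",
            network_id,
        )
    return alive


logging.basicConfig(level=logging.WARNING)
agents = [{"host": "network-node-1", "alive": False}]
agents_to_notify(agents, "dbc59888-e2be-4b31-b579-0a4575159bb1")
```

<div>With a message like this in the server log, an operator could tell that a port creation was never delivered to any agent instead of having to infer it from a stale dnsmasq host file.</div>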
<div><br></div><div>What do you think?</div><div><br></div></div><div>I added this comment to the bug <a href="https://bugs.launchpad.net/neutron/+bug/1185916" target="_blank">https://bugs.launchpad.net/neutron/+bug/1185916</a></div>
<div><br></div><div>Édouard.</div>
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Aug 2, 2013 at 11:45 AM, Chu Duc Minh <span dir="ltr"><<a href="mailto:chu.ducminh@gmail.com" target="_blank">chu.ducminh@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>After I deleted 2 instances (10.2.1.10 and 10.2.1.12),<br></div>the dnsmasq host file is:<br>
<span style="font-family:courier new,monospace"><div class="im">fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1<br>
fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11<br></div><b>fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12</b> <b><span style="color:rgb(255,0,0)"><span style="font-family:arial,helvetica,sans-serif"><-- still exists, problem?!</span></span></b><div class="im">
<br>
fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9</div></span><br><br></div>BR,<br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Aug 2, 2013 at 4:27 PM, Chu Duc Minh <span dir="ltr"><<a href="mailto:chu.ducminh@gmail.com" target="_blank">chu.ducminh@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>Hi, I have the same problem when I create -> terminate -> create instances.<br></div>This problem only occurs when the new instances get the same IP as the deleted instances.<br>
</div><div><br>I checked the dnsmasq host file /var/lib/quantum/dhcp/dbc59888-e2be-4b31-b579-0a4575159bb1/host;<br>
</div><div>sometimes it is not updated.<br></div><div><br>I think this problem may not be related only to dnsmasq; it may also be related to the firewall rules (generated by Quantum) on the compute node, because I see some dropped DHCP packets:<br>
<span style="font-family:courier new,monospace">Aug 2 14:08:11 thor-compute-03 kernel: [95971.005423] IN=qbr23c67719-14 OUT=qbr23c67719-14 PHYSIN=qvb23c67719-14 PHYSOUT=tap23c67719-14 MAC=ff:ff:ff:ff:ff:ff:fa:16:3e:34:72:05:08:00 SRC=0.0.0.0 DST=255.255.255.255 LEN=328 TOS=0x10 PREC=0x00 TTL=128 ID=0 <b>PROTO=UDP SPT=68 DPT=67</b> LEN=308 <br>
</span></div><div><span style="font-family:courier new,monospace">(DHCP Discover packet?)<br></span></div><div><span style="font-family:courier new,monospace"></span></div><div>It was dropped in the chain quantum-openvswi-sg-fallback, so the instance can't get an IP, although in the Dashboard I see that the instance got an IP.<br>
<br></div><div>I tried many times and hit a strange case, a duplicate IP in the dnsmasq host file: <br><span style="font-family:courier new,monospace">fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1<br>fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11<br>
<b>fa:16:3e:78:b5:2f,10-2-1-10.openstacklocal,10.2.1.10</b><br>fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9<br>fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12<br><b>fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10</b></span><br>
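<div><br></div><div>A small Python sketch for spotting this kind of duplicate, assuming each host file entry has the MAC,hostname,IP form shown above. The function name is made up for illustration; the sample entries are the ones from the listing.</div>

```python
from collections import defaultdict


def find_duplicate_ips(hosts_text):
    """Map each IP that appears on more than one line to its MAC addresses."""
    macs_by_ip = defaultdict(list)
    for line in hosts_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each entry is assumed to have the form: MAC,hostname,IP
        mac, _hostname, ip = line.split(",")
        macs_by_ip[ip].append(mac)
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) != 1}


hosts_text = """\
fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
fa:16:3e:78:b5:2f,10-2-1-10.openstacklocal,10.2.1.10
fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12
fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10
"""

print(find_duplicate_ips(hosts_text))
```

<div>For the listing above, the only IP reported is 10.2.1.10, with both the old and the new MAC address.</div>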
<br></div><div>My newest instance is <span style="font-family:courier new,monospace"><b>10.2.1.10</b><font face="arial,helvetica,sans-serif">, and I can't ping it. In the boot log of this instance, I found:<br></font></span><pre>
cloudinitnonet waiting 120 seconds for a network device.
cloudinitnonet gave up waiting for a network device.
ciinfo: lo : 1 127.0.0.1 255.0.0.0 .
ciinfo: eth0 : 1 . . fa:16:3e:c7:ea:0c
route_info failed</pre></div><div>Restarting the instance didn't fix it, but restarting quantum-dhcp-agent on the Quantum node did.<br></div><div>After the restart, the content of the dnsmasq host file is:<br><span style="font-family:courier new,monospace">fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1<br>
fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11<br>fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12<br>fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9<br><b>fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10</b></span><br>
<br></div><div>I think it's a serious problem; I hope someone can fix it soon. :)<br><br>Best Regards,<br></div></div><div><div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jul 2, 2013 at 8:01 PM, James Page <span dir="ltr"><<a href="mailto:james.page@ubuntu.com" target="_blank">james.page@ubuntu.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On 20/05/13 07:51, Heinonen, Johanna (NSN - FI/Espoo) wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
I have installed grizzly with quantum and ovs-plugin. It seems that<br>
grizzly allocates the third address of each subnet for dhcp. (In folsom<br>
it was the second address). This means that the VMs will get addresses<br>
</blockquote>
<br></div>
This sounds a lot like <a href="https://bugs.launchpad.net/ubuntu/+source/quantum/+bug/1189909" target="_blank">https://bugs.launchpad.net/ubuntu/+source/quantum/+bug/1189909</a>; I'll raise a task for dnsmasq as well.<br>
<br>
Cheers<span><font color="#888888"><br>
<br>
James<br>
<br>
-- <br>
James Page<br>
Ubuntu Core Developer<br>
Debian Maintainer<br>
<a href="mailto:james.page@ubuntu.com" target="_blank">james.page@ubuntu.com</a></font></span><div><div><br>
<br>
_______________________________________________<br>
Mailing list: <a href="https://launchpad.net/~openstack" target="_blank">https://launchpad.net/~openstack</a><br>
Post to : <a href="mailto:openstack@lists.launchpad.net" target="_blank">openstack@lists.launchpad.net</a><br>
Unsubscribe : <a href="https://launchpad.net/~openstack" target="_blank">https://launchpad.net/~openstack</a><br>
More help : <a href="https://help.launchpad.net/ListHelp" target="_blank">https://help.launchpad.net/ListHelp</a><br>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br></div>