<div style="font-family: Helvetica; font-size: 14px;"><span style="font-size: 10pt;">Miguel Ángel Ajo</span></div><div><div><br></div></div>
<p style="color: #A0A0A8;">On Wednesday, 28 January 2015 at 09:50, Kevin Benton wrote:</p>
<blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px;">
<span><div><div><div dir="ltr">Hi,<div><br></div><div>Approximately a year and a half ago, the default DHCP lease time in Neutron was increased from 120 seconds to 86400 seconds.[1] This was done with the goal of reducing DHCP traffic with very little discussion (based on what I can see in the review and bug report). While it does indeed reduce DHCP traffic, I don't think any bug reports were filed showing that a 120 second lease time resulted in too much traffic or that a jump all of the way to 86400 seconds was required instead of a value in the same order of magnitude.</div><div><br></div><div>Why does this matter? </div><div><br></div><div>Neutron ports can be updated with a new IP address from the same subnet or another subnet on the same network. The port update will result in anti-spoofing iptables rule changes that immediately stop the old IP address from working on the host. This means the host is unreachable for 0-12 hours based on the current default lease time without manual intervention[2] (assuming half-lease length DHCP renewal attempts).</div><div><br></div><div>Why is this on the mailing list?</div><div><br></div><div>In an attempt to make the VMs usable in a much shorter timeframe following a Neutron port address change, I submitted a patch to reduce the default DHCP lease time to 8 minutes.[3] However, this was upsetting to several people,[4] so it was suggested I bring this discussion to the mailing list. The following are the high-level concerns followed by my responses:</div><div><ul><li>8 minutes is arbitrary</li><ul><li>Yes, but it's no more arbitrary than 1440 minutes. I picked it as an interval because it is still 4 times larger than the last short value, but it still allows VMs to regain connectivity in &lt;5 minutes in the event their IP is changed. 
If someone has a good suggestion for another interval based on known dnsmasq QPS limits or some other quantitative reason, please chime in here.</li></ul><li>other datacenters use long lease times</li><ul><li>This is true, but it's not really a valid comparison. In most regular datacenters, updating a static DHCP lease has no effect on the data plane so it doesn't matter that the client doesn't react for hours/days (even with DHCP snooping enabled). However, in Neutron's case, the security groups are immediately updated so all traffic using the old address is blocked.</li></ul><li>dhcp traffic is scary because it's broadcast</li><ul><li>ARP traffic is also broadcast and many clients will expire entries every 5-10 minutes and re-ARP. L2population may be used to prevent ARP propagation, so the comparison between DHCP and ARP isn't always relevant here.</li></ul></ul></div><div><br></div></div></div></div></span></blockquote><div><span style="font-size: 14px;">From what I’ve seen, at least on Linux, the first DHCP request will be broadcast. 
Then all lease renewals are unicast, unless the original</span></div><div><span style="font-size: 14px;">DHCP server can’t be contacted, in which case the DHCP client falls back to broadcast to find another server to renew its lease.</span></div><div><span style="font-size: 14px;"><br></span></div><div><span style="font-size: 14px;">So only the initial boot of an instance should generate broadcast traffic.</span></div><div><br></div><div><span style="font-size: 14px;">Your proposal seems reasonable to me.</span></div><div><br></div><div><span style="font-size: 14px;">In this context, please see this ongoing work [5], especially the comments here [6], where we’re discussing optimization </span></div><div><span style="font-size: 14px;">due to a theoretical 120-second limit for </span><span style="font-size: 14px;">renewals at scale. We made some calculations of CPU usage for the current default; I </span></div><div><span style="font-size: 14px;">will recalculate those for the newly proposed default of 8 </span><span style="font-size: 14px;">minutes.</span></div><div><span style="font-size: 14px;"><br></span></div><div><span style="font-size: 14px;">TL;DR: 
</span></div><div><span style="font-size: 14px;">That patch fixes an issue seen when dnsmasq is restarted: old leases can’t be renewed, so we end up with a storm of requests.</span></div><div><span style="font-size: 14px;">To fix that, we need to provide dnsmasq with a script that initializes the leases table. Initially the script was written in Python,</span></div><div><span style="font-size: 14px;">but that means the script is called for: init (once), lease (once per instance), and renew (once per lease renewal interval, for every instance).</span></div><div><span style="font-size: 14px;">We should therefore minimize the cost of the script as much as possible, or contribute a change to dnsmasq so that the script is not called</span></div><div><span style="font-size: 14px;">for lease renewals when some flag is set.</span></div><div> </div><blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px;"><span><div><div><div dir="ltr"><div></div><div>Please reply back with your opinions/anecdotes/data related to short DHCP lease times.</div><div><br></div><div>Cheers</div><div><br></div><div>1. <a href="https://github.com/openstack/neutron/commit/d9832282cf656b162c51afdefb830dacab72defe">https://github.com/openstack/neutron/commit/d9832282cf656b162c51afdefb830dacab72defe</a><br clear="all"><div>2. Manual intervention could be an instance reboot, a dhcp client invocation via the console, or a delayed invocation right before the update. (all significantly more difficult to script than a simple update of a port's IP via the API).</div><div>3. <a href="https://review.openstack.org/#/c/150595/">https://review.openstack.org/#/c/150595/</a></div><div>4. <a href="http://i.imgur.com/xtvatkP.jpg">http://i.imgur.com/xtvatkP.jpg</a></div></div></div></div></div></span></blockquote><div><span style="font-size: 14px;">5. 
</span><a href="https://review.openstack.org/#/c/108272/">https://review.openstack.org/#/c/108272/</a></div><div><span style="font-size: 14px;">6. </span><a href="https://review.openstack.org/#/c/108272/8/neutron/agent/linux/dhcp.py">https://review.openstack.org/#/c/108272/8/neutron/agent/linux/dhcp.py</a> </div><blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px;"><span><div><div><div dir="ltr"><div><div><br></div>-- <br><div><div>Kevin Benton</div></div>
</div></div>
</div><div><div>__________________________________________________________________________</div><div>OpenStack Development Mailing List (not for usage questions)</div></div></div></span>
</blockquote>
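<div><span style="font-size: 14px;">As a rough sketch of the arithmetic behind the "0-12 hours" figure in the quoted message: assuming the client only attempts a renewal at half the lease time (the assumption stated in the thread), the worst case is an address change that lands right after a renewal. The function name below is made up for illustration.</span></div>

```python
def worst_case_outage_seconds(lease_seconds):
    """Longest wait until the next renewal attempt, assuming the client
    renews at half the lease time (the assumption stated in the thread)."""
    return lease_seconds / 2


# Lease times mentioned in the thread: old default, proposed default, current default.
for lease in (120, 480, 86400):
    outage = worst_case_outage_seconds(lease)
    print(f"lease={lease}s: up to {outage / 60:g} minutes unreachable")
```

<div><span style="font-size: 14px;">This ignores retries and any manual intervention, so treat it as an upper bound under that one assumption rather than a measurement.</span></div>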
<div>
<br>
</div>
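<div><span style="font-size: 14px;">Since the lease script is called on every renewal, a back-of-the-envelope estimate of how often it would run can help compare lease times. This is only a sketch: it assumes each instance renews once every half lease and that renewals are spread evenly over time, and the instance count is a hypothetical number, not a figure from the review.</span></div>

```python
def renew_calls_per_second(instances, lease_seconds):
    """Average rate of 'renew' script invocations, assuming each instance
    renews once every half lease and renewals are spread evenly over time."""
    renew_interval = lease_seconds / 2
    return instances / renew_interval


HYPOTHETICAL_INSTANCES = 1000  # illustrative only, not a figure from the thread

for lease in (120, 480, 86400):
    rate = renew_calls_per_second(HYPOTHETICAL_INSTANCES, lease)
    print(f"lease={lease}s: about {rate:.3f} script calls per second")
```

<div><span style="font-size: 14px;">Multiplying the rate by a measured per-call cost of the script would give a CPU estimate; that per-call cost has to come from real measurements.</span></div>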