[openstack-dev] [neutron] high dhcp lease times in neutron deployments considered harmful (or not???)

Salvatore Orlando sorlando at nicira.com
Wed Jan 28 16:52:32 UTC 2015


The patch Kevin points out increased the lease to 24 hours (which I agree
is as arbitrary as 2 minutes, 8 minutes, or 1 century) because it
introduced the use of the DHCPRELEASE message in the agent, which is
supported by dnsmasq (to the best of my knowledge) and is functionally
similar to FORCERENEW.

This should have provided resiliency against changes of IP address from the
Neutron API, as the agent would send a DHCPRELEASE message as soon as the
notification was received. When we reviewed the patch we verified that a
number of clients supported this message (to my shame I must admit I did
not consider Windows clients, however).

It seems like the problem perhaps is that DHCPRELEASE is actually not
working as expected, or not working at all?

Salvatore

On 28 January 2015 at 14:55, Ihar Hrachyshka <ihrachys at redhat.com> wrote:

>  On 01/28/2015 09:50 AM, Kevin Benton wrote:
>
> Hi,
>
>  Approximately a year and a half ago, the default DHCP lease time in
> Neutron was increased from 120 seconds to 86400 seconds.[1] This was done
> with the goal of reducing DHCP traffic with very little discussion (based
> on what I can see in the review and bug report). While it does indeed
> reduce DHCP traffic, I don't think any bug reports were filed showing that
> a 120 second lease time resulted in too much traffic or that a jump all of
> the way to 86400 seconds was required instead of a value in the same order
> of magnitude.
>
>
> I guess that would be a good case for FORCERENEW DHCP extension [1] though
> after digging thru dnsmasq code a bit, I doubt it supports the extension
> (though e.g. systemd dhcp client/server from networkd module do). Le sigh.
>
> [1]: https://tools.ietf.org/html/rfc3203
>
>
>  Why does this matter?
>
>  Neutron ports can be updated with a new IP address from the same subnet
> or another subnet on the same network. The port update will result in
> anti-spoofing iptables rule changes that immediately stop the old IP
> address from working on the host. This means the host is unreachable for
> 0-12 hours based on the current default lease time without manual
> intervention[2] (assuming half-lease length DHCP renewal attempts).
>
>  Why is this on the mailing list?
>
>  In an attempt to make the VMs usable in a much shorter timeframe
> following a Neutron port address change, I submitted a patch to reduce the
> default DHCP lease time to 8 minutes.[3] However, this was upsetting to
> several people,[4] so it was suggested I bring this discussion to the
> mailing list. The following are the high-level concerns followed by my
> responses:
>
>    - 8 minutes is arbitrary
>       - Yes, but it's no more arbitrary than 1440 minutes. I picked it as
>       an interval because it is still 4 times larger than the last short value,
>       but it still allows VMs to regain connectivity in <5 minutes in the event
>       their IP is changed. If someone has a good suggestion for another interval
>       based on known dnsmasq QPS limits or some other quantitative reason, please
>       chime in here.
>
>
I think there is little to no point in arguing about an optimal default
lease time, simply because there isn't one. If you want to move that to
8 minutes, that's fine by me.
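
For reference, the connectivity gap behind these numbers can be estimated
with a short sketch. It assumes clients first attempt renewal at half the
lease time (the assumption Kevin states, and the default T1 in RFC 2131)
and that the renewal then succeeds; the function name is made up for
illustration and is not part of Neutron.

```python
def worst_case_outage_seconds(lease_seconds, renew_fraction=0.5):
    """Upper bound on how long a client keeps using a stale address.

    Assumption: the client only notices the change at its next renewal
    attempt, which happens renew_fraction * lease_seconds after the last
    successful lease (0.5 is the RFC 2131 default for T1).
    """
    return lease_seconds * renew_fraction


# Lease values discussed in this thread: old default, proposed, current.
for lease in (120, 480, 86400):
    gap = worst_case_outage_seconds(lease)
    print(f"lease={lease:>6}s  worst-case gap={gap / 60:.0f} min")
```

The 86400 second default gives the 12 hour upper bound mentioned in the
thread, and 480 seconds gives 4 minutes, consistent with the "<5 minutes"
claim.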

>
>     - other datacenters use long lease times
>       - This is true, but it's not really a valid comparison. In most
>       regular datacenters, updating a static DHCP lease has no effect on the data
>       plane so it doesn't matter that the client doesn't react for hours/days
>       (even with DHCP snooping enabled). However, in Neutron's case, the security
>       groups are immediately updated so all traffic using the old address is
>       blocked.
>
Kevin's comment here is totally reasonable, but it implies that the devised
mechanism based on DHCPRELEASE is not working!


>
>     - dhcp traffic is scary because it's broadcast
>       - ARP traffic is also broadcast and many clients will expire
>       entries every 5-10 minutes and re-ARP. L2population may be used to prevent
>       ARP propagation, so the comparison between DHCP and ARP isn't always
>       relevant here.
>
>
I think this is a bit of a moot point. What's the impact of DHCP traffic,
even the DHCPDISCOVER broadcast, on the overall traffic on a network? It's
not as if a DHCP packet were a train of several hundred Ethernet frames,
is it?
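
A rough way to put a number on this is sketched below. It assumes every
port renews at half the lease time and that each renewal is one
DHCPREQUEST/DHCPACK pair; the port count of 1000 is a made-up example, not
a measured figure, and the function name is hypothetical.

```python
def renewal_messages_per_second(ports, lease_seconds,
                                renew_fraction=0.5,
                                messages_per_renewal=2):
    """Steady-state DHCP renewal load seen by one DHCP server.

    Assumptions: every port renews once per renew_fraction * lease_seconds,
    and a renewal is a single DHCPREQUEST answered by a single DHCPACK.
    """
    renew_interval = lease_seconds * renew_fraction
    return ports * messages_per_renewal / renew_interval


# Hypothetical network with 1000 ports, for the lease values in the thread.
for lease in (120, 480, 86400):
    rate = renewal_messages_per_second(1000, lease)
    print(f"lease={lease:>6}s  about {rate:.2f} DHCP messages/s")
```

Whether a rate in this range matters depends on what dnsmasq can sustain,
which is the quantitative data Kevin asks for.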


>
>
>
>  Please reply back with your opinions/anecdotes/data related to short
> DHCP lease times.
>
>  Cheers
>
>  1.
> https://github.com/openstack/neutron/commit/d9832282cf656b162c51afdefb830dacab72defe
> 2. Manual intervention could be an instance reboot, a dhcp client
> invocation via the console, or a delayed invocation right before the
> update. (all significantly more difficult to script than a simple update of
> a port's IP via the API).
> 3. https://review.openstack.org/#/c/150595/
> 4. http://i.imgur.com/xtvatkP.jpg
>
>  --
>  Kevin Benton
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>