[openstack-dev] [neutron] - dnsmasq 'dhcp-authoritative' option broke multiple DHCP servers

Brian Haley brian.haley at hp.com
Tue May 26 18:05:45 UTC 2015


On 05/26/2015 01:12 PM, Salvatore Orlando wrote:
> From the bug Kevin reported, it seems multiple DHCP agents per network have been
> completely broken by the fix for bug #1345947, so a revert of patch [1] (and
> stable backports) should probably be the first thing to do, if nothing else
> because the original bug is not nearly as severe as the one it
> introduced.
> Before doing this however, I am wondering why the various instances of dnsmasq
> end up returning NAKs. I expect all instances to have the same hosts file, so
> they should be able to respond to DHCPDISCOVER/DHCPREQUEST correctly. Is the
> dnsmasq log telling us exactly why the authoritative setting is preventing us
> from doing so? (This is more of a curiosity on my side.)
>
> [1] https://review.openstack.org/#/c/152080/

In the original case, the DHCPREQUEST is for a renew, which is handled
differently from an initial request.  If the server does not have a lease entry
(which it won't after a restart), then it will NAK, which normally just causes
the client to go back to the INIT state and retry.
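
For reference, the client-side behaviour described above follows the RFC 2131
state machine. A minimal sketch of a subset of it, assuming simplified event
names (TRANSITIONS and next_state are illustrative names, not taken from
dnsmasq or Neutron):

```python
# Simplified subset of the RFC 2131 DHCP client state machine.
# Event names are made up for illustration; timers and REBINDING are omitted.
TRANSITIONS = {
    ("INIT", "send DHCPDISCOVER"): "SELECTING",
    ("SELECTING", "send DHCPREQUEST"): "REQUESTING",
    ("REQUESTING", "DHCPACK"): "BOUND",
    ("REQUESTING", "DHCPNAK"): "INIT",
    ("BOUND", "T1 expired"): "RENEWING",
    ("RENEWING", "DHCPACK"): "BOUND",
    # A NAK during renewal throws the client back to INIT,
    # so it has to start over with a fresh DHCPDISCOVER.
    ("RENEWING", "DHCPNAK"): "INIT",
}


def next_state(state, event):
    """Return the next client state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

A renewing client that receives a NAK therefore ends up in INIT rather than
keeping its address, which is why a stray NAK from a second server matters.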

I had asked on the dnsmasq list about this [1], and the multiple-server question
was the wildcard; my testing didn't show the error described in the new bug,
though.  I guess the first proposed fix of re-populating the lease information
doesn't seem like such a bad idea any more.  I will reply to my original query
with the tcpdump information, since I'm confused as to why the second DHCP agent
stepped in with a NAK at all after originally offering the same address as the
first DHCP agent [2].
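
To make the "re-populate the lease information" idea concrete, here is a rough
sketch of building initial lease entries from a network's ports. It assumes the
dnsmasq lease file line layout "<expiry epoch> <mac> <ip> <hostname>
<client-id>", with "*" standing in for an unknown field. Port and
build_lease_lines are hypothetical names for illustration only, not the code
from either proposed patch:

```python
import time
from collections import namedtuple

# Hypothetical, minimal stand-in for the port data the agent already knows.
Port = namedtuple("Port", ["mac", "ip", "hostname"])


def build_lease_lines(ports, lease_seconds, now=None):
    """Build one lease-file line per port.

    Assumed line layout: "<expiry> <mac> <ip> <hostname> <client-id>",
    using "*" when the hostname or client-id is unknown.
    """
    now = int(time.time()) if now is None else int(now)
    expiry = now + lease_seconds
    return [
        "%d %s %s %s *" % (expiry, p.mac, p.ip, p.hostname or "*")
        for p in ports
    ]
```

The same lines could either be written to the lease file before dnsmasq starts
or printed by a lease-init script; which of the two is preferable is exactly
what is being discussed in this thread.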

I would agree the best thing to do is revert the stable backports while we work 
on fixing this in the master branch.

-Brian

[1] http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2015q1/009171.html
[2] https://launchpadlibrarian.net/207180476/dhcp_neutron_bug.html


> On 26 May 2015 at 06:57, Ihar Hrachyshka <ihrachys at redhat.com> wrote:
>
>     On 05/26/2015 04:35 AM, Kevin Benton wrote:
>     > Hi,
>     >
>     > A recent change[1] to pass '--dhcp-authoritative' to dnsmasq has
>     > caused DHCPNAK messages when multiple agents are scheduled to a
>     > network [2].
>     >
>     > This was back-ported to Icehouse and Juno so we need a fix that is
>     > compatible with both of them.
>     >
>     > I have two fixes for this so far and a third alternative if we
>     > don't like those.
>     >
>     > The first is hacky, but it's only a few-line change.[3] It adds an
>     > iptables rule that just stops the DHCPNAKs from making it to the
>     > client. This is clean to back-port but it doesn't protect clients
>     > that have filtering disabled (e.g. bare metal).
>     >
>     > The second persists the DHCP leases to a database.[4] The downside
>     > to this was always that being rescheduled to another agent would
>     > mean no entries in the lease file. This approach adds a work-around
>     > to generate an initial fake lease file based on all of the ports in
>     > the network.
>     >
>     > A third approach that I don't have a patch pushed for yet is very
>     > similar to the second. When dnsmasq is in the leasefile-ro mode, it
>     > will call the script passed to --dhcp-script to get a list of
>     > leases to start with. This script would be built with the same
>     > logic as the second one. The only difference from the second
>     > approach is that dnsmasq wouldn't persist leases to a database.
>     >
>
>     Actually, that approach was initially taken for bug 1345947, but then
>     the patch was abandoned to be replaced with a simpler
>     --dhcp-authoritative approach that ended up with unexpected NAKs for
>     multi-agent setups.
>
>     See: https://review.openstack.org/#/c/108272/12
>
>     Maybe we actually want to restore the work and merge it after
>     conflicts are resolved and --dhcp-authoritative option is killed; the
>     patch was almost merged when --dhcp-authoritative suggestion emerged,
>     so most of the nitpicking work should be complete now (though at the same
>     time, I totally trust our community to find another pile of nits to
>     work on for the next few weeks!)
>
>
> That was my thought as well.
> However, we should check whether that patch is OK to backport. For instance, I
> see that it appears to add a script:
>
> [2]
> https://review.openstack.org/#/c/108272/12/bin/neutron-dhcp-agent-dnsmasq-lease-init
>
>
>     ===
>
>     Speaking of regression testing... Are full stack tests already
>     powerful enough for us to invoke multiple DHCP agents and test the
>     scenario?
>
>     Ihar
>
>     __________________________________________________________________________
>     OpenStack Development Mailing List (not for usage questions)
>     Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>     <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



