[openstack-dev] [neutron] - dnsmasq 'dhcp-authoritative' option broke multiple DHCP servers
Brian Haley
brian.haley at hp.com
Tue May 26 18:05:45 UTC 2015
On 05/26/2015 01:12 PM, Salvatore Orlando wrote:
> From the bug Kevin reported it seems multiple dhcp agents per network have been
> completely broken by the fix for bug #1345947, so a revert of patch [1] (and
> stable backports) should probably be the first thing to do - if nothing else
> because the original bug is not nearly as severe as the one this fix
> introduced.
> Before doing this however, I am wondering why the various instances of dnsmasq
> end up returning NAKs. I expect all instances to have the same hosts file, so
> they should be able to respond to DHCPDISCOVER/DHCPREQUEST correctly. Is the
> dnsmasq log telling us exactly why the authoritative setting is preventing us
> from doing so? (this is more of a curiosity on my side)
>
> [1] https://review.openstack.org/#/c/152080/
In the original case, the DHCPREQUEST is a renewal, which is handled differently
from an initial request. If the server does not have a lease entry (which it
won't after a restart), then it will NAK, which normally just causes the client
to retry from the INIT state.
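To make the renew case concrete, here is a toy model of the behaviour described
above. It is only a sketch and not dnsmasq's actual logic: the function names,
the in-memory lease table and the example MAC/IP values are all made up for
illustration.

```python
from enum import Enum


class Reply(Enum):
    ACK = "DHCPACK"
    NAK = "DHCPNAK"


def answer_renew(leases, mac, requested_ip):
    """Toy server: ACK a renewal only if a matching lease entry exists."""
    if leases.get(mac) == requested_ip:
        return Reply.ACK
    # No lease entry (for example after a restart): refuse the renewal.
    return Reply.NAK


def next_client_state(reply):
    """Toy client: a NAK during renewal sends it back to INIT."""
    return "BOUND" if reply is Reply.ACK else "INIT"


# Hypothetical lease table (MAC -> IP) before and after a server restart.
leases_before_restart = {"fa:16:3e:00:00:01": "10.0.0.5"}
leases_after_restart = {}

for leases in (leases_before_restart, leases_after_restart):
    reply = answer_renew(leases, "fa:16:3e:00:00:01", "10.0.0.5")
    print(reply.value, "->", next_client_state(reply))
```

With a single server the NAK is mostly harmless, since the client just starts
over from INIT and gets the same address again.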
I had asked about this on the dnsmasq list [1], and the multiple-server question
was the wildcard, although my testing didn't show the error described in the new
bug. I guess the first proposed fix of re-populating the lease information
doesn't seem like such a bad idea any more. I will reply to my original query
with the tcpdump information, since I'm confused as to why the second dhcp
agent stepped in with a NAK at all after originally offering the same address as
the first dhcp agent [2].
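For reference, re-populating the lease information could look roughly like the
sketch below. It is not the code from the abandoned review. It assumes the
dnsmasq lease file layout as I understand it from the dnsmasq documentation
(expiry epoch, MAC, IP, hostname, client-id, with '*' for an unknown value), and
the (mac, ip, hostname) port tuples, function name and default lease duration
are hypothetical.

```python
import time


def build_lease_lines(ports, lease_duration=86400, now=None):
    """Return one lease-file-style line per (mac, ip, hostname) port."""
    now = int(time.time()) if now is None else int(now)
    expiry = now + lease_duration
    lines = []
    for mac, ip, hostname in ports:
        # Fields: expiry epoch, MAC, IP, hostname, client-id.
        # '*' stands in for a hostname or client-id that is not known.
        lines.append(f"{expiry} {mac} {ip} {hostname or '*'} *")
    return lines


# Hypothetical ports on one network.
ports = [
    ("fa:16:3e:00:00:01", "10.0.0.5", "host-10-0-0-5"),
    ("fa:16:3e:00:00:02", "10.0.0.6", None),
]

for line in build_lease_lines(ports):
    print(line)
```

The same lines could either be written out as an initial lease file or printed
by a --dhcp-script helper when dnsmasq runs in leasefile-ro mode.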
I would agree the best thing to do is revert the stable backports while we work
on fixing this in the master branch.
-Brian
[1] http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2015q1/009171.html
[2] https://launchpadlibrarian.net/207180476/dhcp_neutron_bug.html
> On 26 May 2015 at 06:57, Ihar Hrachyshka <ihrachys at redhat.com> wrote:
>
>
> On 05/26/2015 04:35 AM, Kevin Benton wrote:
> > Hi,
> >
> > A recent change[1] to pass '--dhcp-authoritative' to dnsmasq has
> > caused DHCPNAK messages when multiple agents are scheduled to a
> > network [2].
> >
> > This was back-ported to Icehouse and Juno so we need a fix that is
> > compatible with both of them.
> >
> > I have two fixes for this so far and a third alternative if we
> > don't like those.
> >
> > The first is hacky, but it's only a few-line change.[3] It adds an
> > iptables rule that just stops the DHCPNAKs from making it to the
> > client. This is clean to back-port but it doesn't protect clients
> > that have filtering disabled (e.g. bare metal).
> >
> > The second persists the DHCP leases to a database.[4] The downside
> > to this was always that being rescheduled to another agent would
> > mean no entries in the lease file. This approach adds a work-around
> > to generate an initial fake lease file based on all of the ports in
> > the network.
> >
> > A third approach that I don't have a patch pushed for yet is very
> > similar to the second. When dnsmasq is in the leasefile-ro mode, it
> > will call the script passed to --dhcp-script to get a list of
> > leases to start with. This script would be built with the same
> > logic as the second one. The only difference from the second
> > approach is that dnsmasq wouldn't persist leases to a database.
> >
>
> Actually, that approach was initially taken for bug 1345947, but then
> the patch was abandoned to be replaced with a simpler
> --dhcp-authoritative approach that ended up with unexpected NAKs for
> multi-agent setups.
>
> See: https://review.openstack.org/#/c/108272/12
>
> Maybe we actually want to restore the work and merge it after
> conflicts are resolved and --dhcp-authoritative option is killed; the
> patch was almost merged when --dhcp-authoritative suggestion emerged,
> so most of nitpicking work should be complete now (though at the same
> time, I totally trust our community to find another pile of nits to
> work on for the next few weeks!)
>
>
> That was my thought as well.
> However, we should check whether that patch is ok to backport. For instance, I
> see that it appears to add a script:
>
> [2]
> https://review.openstack.org/#/c/108272/12/bin/neutron-dhcp-agent-dnsmasq-lease-init
>
>
> ===
>
> Speaking of regression testing... Are full stack tests already
> powerful enough for us to invoke multiple DHCP agents and test the
> scenario?
>
> Ihar
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev