[openstack-dev] [neutron] - dnsmasq 'dhcp-authoritative' option broke multiple DHCP servers

Carl Baldwin carl at ecbaldwin.net
Tue May 26 19:37:59 UTC 2015


On Tue, May 26, 2015 at 11:05 AM, Brian Haley <brian.haley at hp.com> wrote:
> On 05/26/2015 01:12 PM, Salvatore Orlando wrote:
>>
>>  From the bug Kevin reported it seems multiple dhcp agents per network
>> have been completely broken by the fix for bug #1345947, so a revert of
>> patch [1] (and stable backports) should probably be the first thing to
>> do - if nothing else because the original bug has not nearly the same
>> level of severity of the one it introduced.

As long as we confirm that the severity of this bug is accurately
represented in the bug report, then this is the first thing we should
do.  However, see below.  We tried this and did not encounter the
error in at least one experiment.  Are we sure that this is broken
everywhere multiple servers are used?  I'm checking internally to
confirm that we have run this successfully.

>> Before doing this however, I am wondering why the various instances of
>> dnsmasq end up returning NAKs. I expect all instances to have the same
>> hosts file, so they should be able to respond to
>> DHCPDISCOVER/DHCPREQUEST correctly. Is the dnsmasq log telling us
>> exactly why the authoritative setting is preventing us from doing so?
>> (this is more of a curiosity in my side)
>>
>> [1] https://review.openstack.org/#/c/152080/

I also think we should understand this problem better before we act.
The specifics of the bug are still a bit unclear to me.

> In the original case, the DHCPREQUEST is for a renew, which is different
> than for an initial request.  If the server does not have a lease entry
> (which it won't after a restart), then it will NAK, which normally just
> causes the client to retry at INIT state.
>
> I had asked on the dnsmasq list about this [1], and the multiple server
> question was the wildcard, my testing didn't see the error described in the
> new bug though.  I guess the first proposed fix of re-populating the lease
> information doesn't seem like such a bad idea any more, but I will reply to
> my original query with the tcpdump information since I'm confused as to why
> the second dhcp agent stepped-in with a NAK at all after originally offering
> the same address as the first dhcp agent [2].
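
As I read Brian's description of the original case, the renew path
looks roughly like the toy model below.  This is only a sketch to pin
down the reasoning, not dnsmasq code: the function names, the MAC and
the IP address are all made up for illustration, and it ignores the
authoritative setting and the multiple-server case entirely.

```python
# Toy model only: not dnsmasq code. All names and addresses are hypothetical.
ACK, NAK = "DHCPACK", "DHCPNAK"


def answer_renew(leases, mac, ip):
    """Server side: ACK a renewing DHCPREQUEST only if a matching lease is known."""
    return ACK if leases.get(mac) == ip else NAK


def client_next_state(reply):
    """Client side: a NAK received while renewing sends the client back to INIT."""
    return "BOUND" if reply == ACK else "INIT"


mac, ip = "fa:16:3e:00:00:01", "10.0.0.5"
leases_before_restart = {mac: ip}
leases_after_restart = {}  # lease entries are gone after a restart

state_before = client_next_state(answer_renew(leases_before_restart, mac, ip))
state_after = client_next_state(answer_renew(leases_after_restart, mac, ip))
```

In this model the client stays bound while the server still has the
lease entry, and drops back to INIT once the entry has been lost.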

I remember being concerned about the multiple dnsmasq case.  I also
remember trying it and thinking that it worked as expected.

> I would agree the best thing to do is revert the stable backports while we
> work on fixing this in the master branch.

I think we can propose the reverts, but until we confirm the severity
of this bug, I don't want them to merge.

Carl


