[Openstack] instances losing IP address while running, due to No DHCPOFFER

Christian Parpart trapni at gmail.com
Fri Jun 15 23:19:08 UTC 2012


Hey all,

it has now happened twice more, both times today, the last at 22:00 UTC,
with the following in the nova-network node's syslog:

root@gw1:/var/log# grep 'dnsmasq.*10889' daemon.log
Jun 15 17:39:32 cesar1 dnsmasq[10889]: started, version v2.62-7-g4ce4f37 cachesize 150
Jun 15 17:39:32 cesar1 dnsmasq[10889]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack
Jun 15 17:39:32 cesar1 dnsmasq-dhcp[10889]: DHCP, static leases only on 10.10.40.3, lease time 3d
Jun 15 17:39:32 cesar1 dnsmasq[10889]: reading /etc/resolv.conf
Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 4.2.2.1#53
Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 178.63.26.173#53
Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 192.168.2.122#53
Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 192.168.2.121#53
Jun 15 17:39:32 cesar1 dnsmasq[10889]: read /etc/hosts - 519 addresses
Jun 15 17:39:32 cesar1 dnsmasq-dhcp[10889]: read /var/lib/nova/networks/nova-br100.conf
Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPREQUEST(br100) 10.10.40.16 fa:16:3e:3d:ff:f3
Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPACK(br100) 10.10.40.16 fa:16:3e:3d:ff:f3 redis-appdata1

It seems that this one VM was the only one that sent a DHCP request over
the past 5 hours, and that first one was answered with a DHCPACK, and that is it.
That is also the time at which the host behind that IP (redis-appdata1) stopped
functioning.
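
To check which instances actually exchanged DHCP messages with dnsmasq, the
log can be tallied per MAC address. The following is only a minimal sketch in
Python: it assumes every syslog entry sits on a single (unwrapped) line in the
dnsmasq-dhcp format shown above. The names DHCP_LINE and tally_dhcp are made
up for illustration and are not part of nova or dnsmasq.

```python
import re
from collections import Counter

# Matches dnsmasq-dhcp syslog entries such as:
#   ... dnsmasq-dhcp[10889]: DHCPACK(br100) 10.10.40.16 fa:16:3e:3d:ff:f3 redis-appdata1
# The IP address is optional because some message types may be logged without one.
DHCP_LINE = re.compile(
    r"dnsmasq-dhcp\[(?P<pid>\d+)\]: "
    r"(?P<msg>DHCP[A-Z]+)\((?P<iface>[^)]+)\) "
    r"(?:(?P<ip>\d+\.\d+\.\d+\.\d+) )?"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
)


def tally_dhcp(lines):
    """Count DHCP message types per MAC address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = DHCP_LINE.search(line)
        if match:
            counts[(match.group("mac"), match.group("msg"))] += 1
    return counts


# Sample input: three lines taken from the log excerpt above, unwrapped.
SAMPLE = [
    "Jun 15 17:39:32 cesar1 dnsmasq-dhcp[10889]: DHCP, static leases only on 10.10.40.3, lease time 3d",
    "Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPREQUEST(br100) 10.10.40.16 fa:16:3e:3d:ff:f3",
    "Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPACK(br100) 10.10.40.16 fa:16:3e:3d:ff:f3 redis-appdata1",
]

if __name__ == "__main__":
    for (mac, msg), n in sorted(tally_dhcp(SAMPLE).items()):
        print(mac, msg, n)
```

A MAC address that shows a DHCPREQUEST or DHCPDISCOVER without a matching
DHCPACK or DHCPOFFER would be a candidate for closer inspection.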

However, I have in fact already updated dnsmasq on our gateway node to the
latest trunk of the dnsmasq git repository, killed dnsmasq, and restarted
nova-network (which auto-starts dnsmasq per device).

I had really hoped that the bug addressed by this fix was the cause of the
downtime, but apparently there MIGHT be another factor.

There is unfortunately nothing to read in the VM's syslog.
What else could cause the VM to forget its IP?
Can this also be caused by send_arp_for_ha=True?

Regards,
Christian.

On Fri, Jun 15, 2012 at 2:50 AM, Nathanael Burton <
nathanael.i.burton at gmail.com> wrote:

> FWIW I haven't run across the dnsmasq bug in our environment using EPEL
> packages.
>
> Nate
> On Jun 14, 2012 7:20 PM, "Vishvananda Ishaya" <vishvananda at gmail.com>
> wrote:
>
>> Are you running in VLAN mode? If so, you probably need to update to a new
>> version of dnsmasq.  See this message for reference:
>>
>> http://osdir.com/ml/openstack-cloud-computing/2012-05/msg00785.html
>>
>> Vish
>>
>> On Jun 14, 2012, at 1:41 PM, Christian Parpart wrote:
>>
>> Hey all,
>>
>> I feel really sad saying this: now that we have had quite a few
>> instances in production for at least about 5 days, I have encountered the
>> second instance losing its
>> IP address due to "No DHCPOFFER" (according to syslog in the instance).
>>
>> I checked the logs on the central nova-network and gateway node and found
>> that dnsmasq still replies to requests from all the other instances. It even
>> got the request from the instance in question and even sent an OFFER, as far
>> as I can tell by now (I'm investigating / posting logs asap). But while it
>> seems that dnsmasq sends an offer, the instance says it didn't receive one
>> - wtf?
>>
>> Please tell me what I can do to actually *fix* this issue, since this is
>> a very serious problem.
>>
>> One option I see (as a workaround) is to let newly created instances retrieve
>> their IP via DHCP, but then reconfigure /etc/network/interfaces to continue
>> with a static networking setup. However, I'd rather get the DHCP issue itself
>> fixed.
>>
>> I'm very open to any kind of helpful comments :)
>>
>> So long,
>> Christian.
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack at lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>>