[Openstack] Need help - Compute Node restarted - Random Instances don't get an IP anymore

Martinx - ジェームズ thiagocmartinsc at gmail.com
Mon Nov 18 21:09:25 UTC 2013


I did nothing besides a Compute Node reboot... Everything is back to normal
now, a few hours after restarting it...

It is easy to reproduce this: every time I reboot a Compute Node, some
instances don't get their IPs... I need to wait hours for it to get back to
normal, without any intervention...

Unfortunately, Havana has a LOT of network problems. I believe that all of
them are related to the "Per-Tenant Router with Private Networks" topology.
It is almost useless (this topology is definitely not production-ready). I
mean, I'm using it here, but I'm "living on the edge" with Havana... One
more problematic step and my cloud is dead...

Nothing unusual appeared in the Network Node DHCP log during this outage. I
can enable DEBUG logging and restart the Compute Node again to see if I
catch something, but it will interfere with my running tenants...

Tks!
Thiago


On 18 November 2013 16:50, Razique Mahroua <razique.mahroua at gmail.com> wrote:

> Now, that's interesting. You didn't even restart a service?
> Did you find anything in the dhcp-agent logs?
>
> On 18 Nov 2013, at 10:46, Martinx - ジェームズ wrote:
>
> > Thank you Razique!
> >
> > Out of nowhere, all instances got their IPs automatically again, without
> > even restarting them... I have no idea what happened.
> >
> > But this is very weird: every time I restart a compute node, these
> > network problems appear... No idea about the source of this problem...
> >
> > Tks again!
> >
> > Best,
> > Thiago
> >
> >
> > On 18 November 2013 16:36, Razique Mahroua <razique.mahroua at gmail.com> wrote:
> >
> >> Check the dhcp-agent logs, especially when you force a DHCP renew on
> >> these instances.
> >> Meanwhile, use tcpdump with:
> >> tcpdump -i ROUTER-INTERFACE -vvv -s 1500 '((port 67 or port 68) and
> >> (udp[8:1] = 0x1))'
> >>
> >> If you want to check the DHCP packets for a particular instance, get its
> >> MAC and match on it (udp[38:4] compares the last four bytes of the MAC):
> >> tcpdump -i ROUTER-INTERFACE -vvv -s 1500 '((port 67 or port 68) and
> >> (udp[38:4] = 0xMAC-ADDR))'
> >>
> >> Razique
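
As a side note on the second filter above: tcpdump's byte accessor reads at
most 4 bytes, and udp[38:4] lands on the last four bytes of the client
hardware address in a DHCP packet (the chaddr field starts 28 bytes into the
BOOTP payload, which follows the 8-byte UDP header). The sketch below builds
that filter from a colon-separated MAC. The helper names are hypothetical and
not part of any OpenStack or tcpdump tooling, and a 6-byte Ethernet address is
assumed. The MAC used is the one from the console output further down.

```python
import re


def mac_to_tcpdump_word(mac: str) -> str:
    """Return the last four bytes of a MAC as a hex literal for udp[38:4].

    Hypothetical helper: assumes a 6-byte, colon-separated Ethernet MAC.
    """
    octets = mac.strip().lower().split(":")
    if len(octets) != 6 or not all(re.fullmatch(r"[0-9a-f]{2}", o) for o in octets):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    # chaddr starts at udp[36]; udp[38:4] therefore covers MAC bytes 2..5.
    return "0x" + "".join(octets[2:])


def dhcp_filter_for_mac(mac: str) -> str:
    """Build the capture filter quoted in the thread for one instance's MAC."""
    return f"((port 67 or port 68) and (udp[38:4] = {mac_to_tcpdump_word(mac)}))"


if __name__ == "__main__":
    # MAC taken from the "ci-info: eth0" line in the console log.
    print(dhcp_filter_for_mac("fa:16:3e:a2:71:74"))
```

The printed expression can be pasted as the filter argument of the tcpdump
command above, in place of the 0xMAC-ADDR placeholder.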
> >>
> >> On 18 Nov 2013, at 10:10, Martinx - ジェームズ wrote:
> >>
> >> Okay... I'm calm... :-P
> >>
> >> This is the second time I'm seeing this with Havana.
> >>
> >> The Compute Node reboots and lots of instances don't get their IPs
> >> anymore, look:
> >> ------------------------------
> >>
> >> cloud-init start-local running: Mon, 18 Nov 2013 16:34:06 +0000. up 18.19 seconds
> >> no instance data found in start-local
> >> cloud-init-nonet waiting 120 seconds for a network device.
> >> cloud-init-nonet gave up waiting for a network device.
> >> ci-info: lo : 1 127.0.0.1 255.0.0.0 .
> >> ci-info: eth0 : 1 . . fa:16:3e:a2:71:74
> >> route_info failed
> >> Waiting for network configuration...
> >> Waiting up to 60 more seconds for network configuration...
> >> Booting system without full network configuration...
> >>
> >> New instances that I launch right now get their IPs normally.
> >>
> >> The only way to put the website online again now is: take a snapshot of
> >> an instance without an IP, launch a new instance based on that image, and
> >> voilà! The instance gets its IP again... But this is not viable.
> >>
> >> I appreciate any help!
> >>
> >> Tks!
> >> Thiago
> >>
> >> On 18 November 2013 15:35, Razique Mahroua <razique.mahroua at gmail.com> wrote:
> >>
> >> Hey Martin :)
> >> On 18 Nov 2013, at 8:40, Martinx - ジェームズ wrote:
> >>
> >> Guys,
> >>
> >> My Havana (Ubuntu based) Compute Node was restarted and lots of
> >> instances do not get an IP anymore.
> >>
> >> Tips?!
> >>
> >> Stay calm
> >>
> >> It is random, I mean, some instances on this same compute node are
> >> normal, while others have no IP.
> >>
> >> Are you referring to the private IP pool or the public one? When you say
> >> "don't have", do you mean the IPs don't get allocated, or that the
> >> instances don't retrieve them (DHCP?)?
> >>
> >> I really need help here because my client's web site is completely
> >> offline now...
> >>
> >> I'm using Per-Tenant router with private networks + VXLAN.
> >>
> >> Tks!
> >> Thiago
> >>
> >>
>