[OpenStack-docs] [training-labs] Networking problems

Roger Luethi rl at patchworkscience.org
Sun Nov 29 15:40:12 UTC 2015


On Sun, 29 Nov 2015 21:40:10 +0900, Bernd Bausch wrote:
> Trying to run the training labs from
> http://git.openstack.org/cgit/openstack/training-labs, I hit two problems.
> 
> First, the showstopper: I get 
> 
> ERROR: port is in limbo and won't recover:
> {"_date":1448791388943,"Port":{"aa26a099-6644-4a3a-b0e3-ec3de8659598":{"tag"
> :4095}},"_comment":"ovs-vsctl: /usr/bin/ovs-vsctl --timeout=10 --oneline
> --format=json -- set Port qr-2bd0cc52-07
> tag=4095","Open_vSwitch":{"c67ee198-2b9d-4dc4-a0ad-013cd3a39f24":{"next_cfg"
> :15}}}
> {"_date":1448791389765,"Port":{"209d3eee-9c5d-426f-87c0-0efeffce9ebb":{"tag"
> :4095}},"_comment":"ovs-vsctl: /usr/bin/ovs-vsctl --timeout=10 --oneline
> --format=json -- set Port qg-9c2afbd1-15
> tag=4095","Open_vSwitch":{"c67ee198-2b9d-4dc4-a0ad-013cd3a39f24":{"next_cfg"
> :16}}}
> 
> in the 038_00_setup_neutron_network.auto log file. It seems that
> installation of the network node failed. Where do I start debugging this?

The script that failed is scripts/ubuntu/setup_neutron_network.sh.
The code that emits the error message starts at line 202 of that script.

Starting at line 33, you find a comment that we added when we hit this
error with Juno. We got the same problem again with Kilo (but for a
different reason) -- see the comment on line 190.

Both times, the cause was a race condition: the script was moving too fast
for the system. If the interfaces or the bridge are not ready when the
neutron-l3-agent comes up, it marks them as down permanently, and you would
have to fix the database by hand (restarting the services or rebooting the
VMs won't help). I guess we could add a function to fix the database if
the race keeps hitting us, but I'd rather fix the race.
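
One way to address a race like this, instead of sleeping for a fixed time,
is to poll until a readiness check succeeds. The following is only a minimal
sketch: the helper name wait_until is hypothetical (it is not part of the
training-labs scripts), and the bridge name br-ex in the usage comment is an
assumption. Which check actually indicates "ready" for the l3-agent would
still have to be determined.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of training-labs: retry a command once per
# second until it succeeds or the timeout (in seconds) expires.
wait_until() {
    local timeout=$1
    shift
    local waited=0
    until "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Usage (bridge name is an assumption): wait up to 30 seconds for the
# bridge to exist before starting the agent.
# wait_until 30 ip link show br-ex
```

Unlike a fixed sleep, this returns as soon as the check passes and fails
loudly if it never does.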

You can slow the script down by adding sleep calls in a few places, just to
see if that helps. If you want to investigate within the VM, add a
"wait_for_file" line wherever you want the script to pause. You can then log
into the VM and work for as long as you need to. Delete
/tmp/remove_to_continue and the script will continue.
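
For illustration, a pause of this kind can be built from a marker file and a
polling loop. The sketch below is a hypothetical stand-in, not the actual
wait_for_file implementation in training-labs, which may differ. The function
name pause_until_removed is made up; only the default path
/tmp/remove_to_continue comes from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a pause hook: create a marker file, then block
# until someone removes it.
pause_until_removed() {
    local marker=${1:-/tmp/remove_to_continue}
    touch "$marker"
    echo "Paused. Remove $marker to continue." >&2
    while [ -e "$marker" ]; do
        sleep 1
    done
}
```

Calling pause_until_removed at a point of interest blocks there until the
marker file is deleted from another shell.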

> Second, a problem that I can work around by myself but that defeats the
> intention of convenient and easy installation. I am unable to ssh into
> any of the three nodes because their eth0 IP addresses aren't reachable. On
> my system, they all have the same address 10.0.2.15/24, and I find no trace
> of this network in iptables or the routing table. The DHCP server where this
> address comes from is 10.0.2.2. I don't know Virtualbox well enough to find
> out how it was configured.
> Perhaps the KVM network configuration (two networks 10.0.0.0/24 and
> 10.0.1.0/24) on my system gets into Virtualbox's way. 
> Here again - where should I start looking?

osbash configures port forwarding for you. In order to log into the
controller node, use:
$ ssh -p 2230 osbash@localhost

For the network node, use port 2231; for the compute node, port 2232.
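
If you log in often, the port mapping above can be wrapped in a small shell
function. This is just a convenience sketch: the function name node_ssh_port
is hypothetical, and the ports are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a node name to the forwarded SSH port on
# localhost (ports as listed above).
node_ssh_port() {
    case "$1" in
        controller) echo 2230 ;;
        network)    echo 2231 ;;
        compute)    echo 2232 ;;
        *) echo "unknown node: $1" >&2; return 1 ;;
    esac
}

# Usage:
# ssh -p "$(node_ssh_port network)" osbash@localhost
```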

Alternatively (starting with VirtualBox 5), you can just open a console on
any running VM (in the VirtualBox GUI's VM context menu, select "Show").

Roger


