[Openstack] Trouble connecting to a new VM

Tyler Couto tcouto at certain.com
Fri Nov 20 23:36:31 UTC 2015


Perfect! Thank you! That port’s state is binding_failed. I think what has
happened is that this port was created and then I changed the hostname of
the controller/network node. Now this port is trying to bind to a host
that doesn’t exist anymore. This is not the first issue I’ve had since
the hostname change, so I might just reinstall everything to avoid any
more landmines.
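
(For anyone hitting the same thing: the stale host shows up in the
port’s binding fields, e.g.

  neutron port-show <port-id> | grep binding

should show binding:host_id pointing at the old hostname, with
binding:vif_type set to binding_failed.)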

I first tried to fix everything by updating the hostname in the Neutron
database via SQL. I couldn’t get the port to change from binding_failed
to something more reasonable, so I also ran neutron-netns-cleanup in the
hope that it might rebind and automagically fix things. It actually gets
rid of all your network namespaces and a lot of your OVS ports. I’m
reinstalling now, haha.
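
For the record, the updates were along these lines (table names are from
the ML2 schema on my install; back up the database before trying this):

  mysql neutron -e "UPDATE agents SET host='<new-host>' WHERE host='<old-host>';"
  mysql neutron -e "UPDATE ml2_port_bindings SET host='<new-host>' WHERE host='<old-host>';"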

Thanks!
Tyler

On 11/20/15, 10:50 AM, "James Denton" <james.denton at rackspace.com> wrote:

>Hi Tyler,
>
>Unfortunately, you won’t be able to perform packet captures on the
>bridges themselves using tcpdump. Instead, consider performing the
>captures on the interfaces in the provider bridge (like enp4s0f0),
>and/or the interface that has the local VTEP address (tunnel IP)
>configured on it. You should see GRE-encapsulated traffic, which you
>will need to peek into.
>
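>As a concrete sketch, GRE is IP protocol 47, so something like this on
>the interface carrying the tunnel IP (interface name taken from your
>output below) should show the encapsulated traffic:
>
>  tcpdump -i enp4s0f0 -nne ip proto 47
>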
>Looking at the older bridge output you provided, I’m wondering if
>something is wrong with the DHCP port. See comments within:
>
>>>>> Bridge br-tun
>>>>> fail_mode: secure
>>>>>   Port "gre-ac10183d"
>>>>>     Interface "gre-ac10183d"
>>>>>       type: gre
>>>>>       options: {df_default="true", in_key=flow, local_ip="172.16.24.60", out_key=flow, remote_ip="172.16.24.61"}
>>>>>   Port gre-mirror
>>>>>     Interface gre-mirror
>>>>>   Port br-tun
>>>>>     Interface br-tun
>>>>>       type: internal
>>>>>   Port patch-int
>>>>>     Interface patch-int
>>>>>       type: patch
>>>>>       options: {peer=patch-tun}
>>>>> 
>
>>>>> Bridge br-ex
>>>>>   Port "enp4s0f0"
>>>>>     Interface "enp4s0f0"
>
>^^ One “physical” interface (enp4s0f0)
>
>>>>>   Port phy-br-ex
>>>>>     Interface phy-br-ex
>>>>>       type: patch
>>>>>       options: {peer=int-br-ex}
>>>>>   Port br-ex
>>>>>     Interface br-ex
>>>>>       type: internal
>>>>>   Port "enp4s0f1"
>>>>>     Interface "enp4s0f1"
>
>^^ Another “physical” interface (enp4s0f1). Be careful with that; you
>may create a bridging loop.
>
>>>>> Bridge br-int
>>>>> fail_mode: secure
>>>>>   Port "qr-a81f0614-0e"
>>>>>     tag: 2
>>>>>     Interface "qr-a81f0614-0e"
>>>>>       type: internal
>>>>>   Port "qg-289ea4d2-29"
>>>>>     tag: 5
>>>>>     Interface "qg-289ea4d2-29"
>>>>>       type: internal
>
>^^ The qg and qr ports are connected to the router namespace
>
>>>>>   Port br-int
>>>>>     Interface br-int
>>>>>       type: internal
>>>>>   Port patch-tun
>>>>>     Interface patch-tun
>>>>>       type: patch
>>>>>       options: {peer=patch-int}
>>>>>   Port int-br-ex
>>>>>     Interface int-br-ex
>>>>>       type: patch
>>>>>       options: {peer=phy-br-ex}
>
>>>>>   Port "tap468d3ee4-c0"
>>>>>     tag: 4095
>>>>>     Interface "tap468d3ee4-c0"
>>>>>       type: internal
>
>^^ This port *may* be for the DHCP namespace. A tag of 4095 means that
>the agent was unable to find a corresponding Neutron port or some other
>failure occurred. Take a look inside the DHCP namespace and see if the
>interface name corresponds with this one (the 11-char ID would be the
>same). Perform a ‘neutron port-list | grep <11-char id>’ and see what is
>returned. If a port is returned, do a ‘neutron port-show <full ID>’ and
>see if the state is ‘binding_failed’. If so, there may be some
>misconfiguration keeping the DHCP port from being created. You can try
>unscheduling the network from the DHCP agent, deleting the port, and
>rescheduling the network to see if that fixes things. If that port
>*doesn’t* correspond to the DHCP namespace I would be highly surprised.
>
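>Roughly, with placeholder IDs (the namespace is named qdhcp- plus the
>network ID):
>
>  ip netns list | grep qdhcp
>  ip netns exec qdhcp-<network-id> ip addr show
>  neutron port-list | grep 468d3ee4-c0
>  neutron port-show <full-port-id>
>
>And to unschedule/reschedule the network:
>
>  neutron dhcp-agent-list-hosting-net <network-id>
>  neutron dhcp-agent-network-remove <dhcp-agent-id> <network-id>
>  neutron port-delete <full-port-id>
>  neutron dhcp-agent-network-add <dhcp-agent-id> <network-id>
>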
>For starters, try creating another GRE network, boot some instances, and
>see if the same failure occurs. Use ‘ovs-vsctl show’ to see if a new port
>was added on the controller/network node and if so, what the tag is.
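>
>Something along these lines would do it (names are just examples):
>
>  neutron net-create testnet
>  neutron subnet-create testnet 10.99.0.0/24 --name testsubnet
>  nova boot --flavor m1.tiny --image cirros --nic net-id=<testnet-id> testvm
>  ovs-vsctl show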
>
>James
>
>> On Nov 20, 2015, at 11:25 AM, Tyler Couto <tcouto at certain.com> wrote:
>> 
>> One more hint: The problem seems to be on the controller side. I added
>> the compute role to the controller/network node and booted a VM on it,
>> and I still can’t get an IP address from the DHCP server.
>> Andreas, in regard to your comment about qrxxx and qgxxx (I think you
>> mean qgxxx, because I don’t have a port named qqxxx): this was set up
>> by the openstack software, and I did not modify it. Furthermore, since
>> these bridges all have patches between them, shouldn’t they really act
>> like a single bridge? I assumed the separation was just for ease of
>> understanding and better organization.
>> I also wasn’t able to get VNC working on the compute host. I assume
>> this is probably because the VM doesn’t have an IP address other than
>> the loopback.
>> 
>> Tyler
>> 
>> 
>> 
>> On 11/18/15, 3:56 PM, "Tyler Couto" <tcouto at certain.com> wrote:
>> 
>>> Ok, we’ve figured out that my VM is not getting an IP address. Here’s
>>> the DHCP part of the console.log:
>>> Starting network...
>>> udhcpc (v1.20.1) started
>>> Sending discover...
>>> Sending discover...
>>> Sending discover...
>>> Usage: /sbin/cirros-dhcpc <up|down>
>>> No lease, failing
>>> 
>>> 
>>> I looked at the dnsmasq logs after a VM reboot, and I also straced
>>> the dnsmasq process during a VM reboot. Both show that dnsmasq isn’t
>>> doing anything when I reboot the machine. It should be giving out an
>>> IP address to my VM, right?
>>> 
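>>> For reference, the strace was along these lines, attached to the PID
>>> of the dnsmasq instance serving this network:
>>> 
>>>   strace -f -e trace=network -p <dnsmasq-pid>
>>> 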
>>> I’ve read that GRE doesn’t work on kernels below 3.11, and I’m running
>>> CentOS 7 with 3.10, but I’ve also read otherwise.
>>> 
>>> I’m trying to see if this is a problem with the GRE tunnel, but I’m
>>> getting very confusing results. I’ll try to explain it. I have four
>>> tcpdumps running.
>>> On the compute node I have the following:
>>> 1. tcpdump -i br-int
>>> 2. tcpdump -i br-tun
>>> 3. tcpdump -i gre-mirror1 # <— This is a mirror of the gre port on
>>> br-tun
>>> 
>>> On the controller/network node I have the following:
>>> 1. tcpdump -i gre-mirror2 # <— Also a gre port mirror on br-tun of
>>> controller node
>>> 
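>>> (For anyone following along: the gre-mirror ports were set up with
>>> the standard ovs-vsctl mirror commands, roughly like this on the
>>> compute node:
>>> 
>>>   ovs-vsctl add-port br-tun gre-mirror1 \
>>>     -- set interface gre-mirror1 type=internal
>>>   ovs-vsctl -- set bridge br-tun mirrors=@m \
>>>     -- --id=@p get port gre-ac10183c \
>>>     -- --id=@out get port gre-mirror1 \
>>>     -- --id=@m create mirror name=m1 select-src-port=@p \
>>>        select-dst-port=@p output-port=@out
>>> 
>>> and the same on the controller with its gre port and gre-mirror2.)
>>> 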
>>> I’ve done a few things with this setup. I’ll try to explain a couple of
>>> them and tell you where I see traffic.
>>> 1. ping -I br-tun 192.168.1.1 # <— It shouldn’t matter where I send it
>>> right?
>>> - - I see identical ARP traffic on br-tun and gre-mirror1 (compute
>>> node), but no traffic on br-int and gre-mirror2
>>> - - 15:03:49.994644 ARP, Request who-has 192.168.1.1 tell
>>> openstack102.example.com, length 28
>>> 2. nova reboot demo-instance1
>>> - - I see identical BOOTPC/BOOTPS traffic on br-int and gre-mirror2
>>> (controller/network node), but no traffic on br-tun or gre-mirror1
>>> - - 15:26:06.583855 IP 0.0.0.0.bootpc > 255.255.255.255.bootps:
>>> BOOTP/DHCP, Request from fa:16:3e:1d:9a:9d (oui Unknown), length 290
>>> 
>>> The first test suggests that the gre tunnel is broken, and there’s
>>> something wrong with the patch between br-tun and br-int.
>>> The second test seems to show that the gre tunnel is working well.
>>> 
>>> What am I missing here? Is something terribly wrong with this test?
>>> 
>>> Thanks,
>>> Tyler
>>> 
>>> On 11/17/15, 12:58 PM, "James Denton" <james.denton at rackspace.com>
>>>wrote:
>>> 
>>>> Hi Tyler,
>>>> 
>>>> You might try verifying that the instance properly received its IP
>>>> address. You can try using ‘nova console-log <id>’ to view the
>>>> console log of the instance. Look for the cloud-init info. Also,
>>>> take a look at the syslog of the network node to see if the DHCP
>>>> request made it and was acknowledged. If it looks like it got its
>>>> IP, try hitting the instance from within the DHCP or router
>>>> namespace to see if you can hit the fixed IP from something in the
>>>> same network before trying to hit the floating IP. You may also want
>>>> to run some packet captures on the respective qbr bridge and
>>>> physical interfaces while doing these tests to see if/where traffic
>>>> is getting dropped.
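>>>> 
>>>> Concretely, something like this (IDs are placeholders):
>>>> 
>>>>   nova console-log <instance-id>
>>>>   ip netns exec qdhcp-<network-id> ping -c 3 <fixed-ip>
>>>>   ip netns exec qrouter-<router-id> ping -c 3 <fixed-ip>
>>>>   tcpdump -i qbr<port-id-prefix> -ne port 67 or port 68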
>>>> 
>>>> James
>>>> 
>>>>> On Nov 17, 2015, at 11:31 AM, Tyler Couto <tcouto at certain.com> wrote:
>>>>> 
>>>>> Thanks Andreas. My security groups do allow icmp traffic.
>>>>> 
>>>>> 
>>>>> +---------+----------------------------------------------------------------------+
>>>>> | name    | security_group_rules                                                 |
>>>>> +---------+----------------------------------------------------------------------+
>>>>> | default | egress, IPv4                                                         |
>>>>> |         | egress, IPv6                                                         |
>>>>> |         | ingress, IPv4, 22/tcp, remote_ip_prefix: 0.0.0.0/0                   |
>>>>> |         | ingress, IPv4, icmp, remote_ip_prefix: 0.0.0.0/0                     |
>>>>> |         | ingress, IPv4, remote_group_id: d404679b-aeed-4d2f-bea9-2c7d19ff3fb1 |
>>>>> |         | ingress, IPv6, remote_group_id: d404679b-aeed-4d2f-bea9-2c7d19ff3fb1 |
>>>>> +---------+----------------------------------------------------------------------+
>>>>> 
>>>>> I can’t access my VM’s console, so I do not know whether I can ping
>>>>> from my VM. I figured this might be a related issue. I receive this
>>>>> error when trying to access the noVNC console:
>>>>> Failed to connect to server (code: 1006)
>>>>> 
>>>>> 
>>>>> This is a two-node setup. I have one controller/neutron-network
>>>>> node. Here’s the output of ‘ovs-vsctl show’:
>>>>> 
>>>>> Bridge br-tun
>>>>> fail_mode: secure
>>>>>   Port "gre-ac10183d"
>>>>>     Interface "gre-ac10183d"
>>>>>       type: gre
>>>>>       options: {df_default="true", in_key=flow, local_ip="172.16.24.60", out_key=flow, remote_ip="172.16.24.61"}
>>>>>   Port gre-mirror
>>>>>     Interface gre-mirror
>>>>>   Port br-tun
>>>>>     Interface br-tun
>>>>>       type: internal
>>>>>   Port patch-int
>>>>>     Interface patch-int
>>>>>       type: patch
>>>>>       options: {peer=patch-tun}
>>>>> Bridge br-ex
>>>>>   Port "enp4s0f0"
>>>>>     Interface "enp4s0f0"
>>>>>   Port phy-br-ex
>>>>>     Interface phy-br-ex
>>>>>       type: patch
>>>>>       options: {peer=int-br-ex}
>>>>>   Port br-ex
>>>>>     Interface br-ex
>>>>>       type: internal
>>>>>   Port "enp4s0f1"
>>>>>     Interface "enp4s0f1"
>>>>> Bridge br-int
>>>>> fail_mode: secure
>>>>>   Port "qr-a81f0614-0e"
>>>>>     tag: 2
>>>>>     Interface "qr-a81f0614-0e"
>>>>>       type: internal
>>>>>   Port "qg-289ea4d2-29"
>>>>>     tag: 5
>>>>>     Interface "qg-289ea4d2-29"
>>>>>       type: internal
>>>>>   Port br-int
>>>>>     Interface br-int
>>>>>       type: internal
>>>>>   Port patch-tun
>>>>>     Interface patch-tun
>>>>>       type: patch
>>>>>       options: {peer=patch-int}
>>>>>   Port int-br-ex
>>>>>     Interface int-br-ex
>>>>>       type: patch
>>>>>       options: {peer=phy-br-ex}
>>>>>   Port "tap468d3ee4-c0"
>>>>>     tag: 4095
>>>>>     Interface "tap468d3ee4-c0"
>>>>>       type: internal
>>>>>   ovs_version: "2.3.1"
>>>>> 
>>>>> 
>>>>> I have one compute node. Here’s the output of ‘ovs-vsctl show’:
>>>>> 
>>>>> Bridge br-int
>>>>> fail_mode: secure
>>>>>   Port "qvoc6d01e4b-1d"
>>>>>     tag: 1
>>>>>     Interface "qvoc6d01e4b-1d"
>>>>>   Port br-int
>>>>>     Interface br-int
>>>>>       type: internal
>>>>>   Port patch-tun
>>>>>     Interface patch-tun
>>>>>       type: patch
>>>>>       options: {peer=patch-int}
>>>>> Bridge br-tun
>>>>> fail_mode: secure
>>>>>   Port br-tun
>>>>>     Interface br-tun
>>>>>       type: internal
>>>>>   Port patch-int
>>>>>     Interface patch-int
>>>>>       type: patch
>>>>>       options: {peer=patch-tun}
>>>>>   Port "gre-ac10183c"
>>>>>     Interface "gre-ac10183c"
>>>>>       type: gre
>>>>>       options: {df_default="true", in_key=flow, local_ip="172.16.24.61", out_key=flow, remote_ip="172.16.24.60"}
>>>>>   Port gre-mirror
>>>>>     Interface gre-mirror
>>>>>   Port "tap0"
>>>>>     Interface "tap0"
>>>>>   ovs_version: "2.3.1"
>>>>> 
>>>>> 
>>>>> I also have a laptop on the same network as the openstack machines.
>>>>> I can successfully ping the interface of the neutron router from my
>>>>> laptop.
>>>>> 
>>>>> As far as the physical interfaces go, I am only using one physical
>>>>> interface on each openstack machine. I know this is not the
>>>>> recommended setup, but since this is only a POC, I wanted to keep it
>>>>> simple.
>>>>> 
>>>>> -Tyler
>>>>> 
>>>>> 
>>>>> 
>>>>> On 11/17/15, 12:48 AM, "Andreas Scheuring"
>>>>> <scheuran at linux.vnet.ibm.com>
>>>>> wrote:
>>>>> 
>>>>>> Please check your Security Groups first.
>>>>> 
>>>>> 
>>> 
>> 


