[Openstack] Trouble connecting to a new VM

James Denton james.denton at rackspace.com
Fri Nov 20 18:50:39 UTC 2015


Hi Tyler,

Unfortunately, you won’t be able to perform packet captures on the bridges themselves using tcpdump. Instead, consider performing the captures on the interfaces in the provider bridge (like enp4s0f0) and/or on the interface that has the local VTEP address (tunnel IP) configured. You should see GRE-encapsulated traffic, which you’ll need to look inside to find the tenant traffic.
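For example, something along these lines (the interface names and VTEP address are taken from the ‘ovs-vsctl show’ output later in this thread; adjust for your nodes):

```shell
# Capture GRE-encapsulated traffic on the physical interface itself,
# not on the OVS bridge. 'proto gre' matches IP protocol 47;
# -n skips name resolution.
tcpdump -nni enp4s0f0 proto gre

# Or capture on whichever interface carries the local VTEP address
# (172.16.24.60 on the controller in this thread):
tcpdump -nni enp4s0f1 proto gre and host 172.16.24.60
```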

Looking at the older bridge output you provided, I’m wondering if something is wrong with the DHCP port. See comments within:

>>>> Bridge br-tun
>>>> fail_mode: secure
>>>>   Port "gre-ac10183d"
>>>>     Interface "gre-ac10183d"
>>>>       type: gre
>>>>       options: {df_default="true", in_key=flow,
>>>> local_ip="172.16.24.60",
>>>> out_key=flow, remote_ip="172.16.24.61"}
>>>>   Port gre-mirror
>>>>     Interface gre-mirror
>>>>   Port br-tun
>>>>     Interface br-tun
>>>>       type: internal
>>>>   Port patch-int
>>>>     Interface patch-int
>>>>       type: patch
>>>>       options: {peer=patch-tun}
>>>> 

>>>> Bridge br-ex
>>>>   Port "enp4s0f0"
>>>>     Interface "enp4s0f0"

^^ One “physical” interface (enp4s0f0)

>>>>   Port phy-br-ex
>>>>     Interface phy-br-ex
>>>>       type: patch
>>>>       options: {peer=int-br-ex}
>>>>   Port br-ex
>>>>     Interface br-ex
>>>>       type: internal
>>>>   Port "enp4s0f1"
>>>>     Interface "enp4s0f1"

^^ Another “physical” interface (enp4s0f1). Be careful with that; you may create a bridging loop.

>>>> Bridge br-int
>>>> fail_mode: secure
>>>>   Port "qr-a81f0614-0e"
>>>>     tag: 2
>>>>     Interface "qr-a81f0614-0e"
>>>>       type: internal
>>>>   Port "qg-289ea4d2-29"
>>>>     tag: 5
>>>>     Interface "qg-289ea4d2-29"
>>>>       type: internal

^^ The qg and qr ports are connected to the router namespace
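To confirm that, you can list the namespaces on the controller/network node and look for those interfaces inside the qrouter namespace (the router UUID below is a placeholder):

```shell
# Find the router namespace
ip netns | grep qrouter

# Inside it you should see the qr- and qg- interfaces with the router's
# internal and gateway IPs attached
ip netns exec qrouter-<router-id> ip addr
```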

>>>>   Port br-int
>>>>     Interface br-int
>>>>       type: internal
>>>>   Port patch-tun
>>>>     Interface patch-tun
>>>>       type: patch
>>>>       options: {peer=patch-int}
>>>>   Port int-br-ex
>>>>     Interface int-br-ex
>>>>       type: patch
>>>>       options: {peer=phy-br-ex}

>>>>   Port "tap468d3ee4-c0"
>>>>     tag: 4095
>>>>     Interface "tap468d3ee4-c0"
>>>>       type: internal

^^ this port *may* be for the DHCP namespace. A tag of 4095 means that the agent was unable to find a corresponding Neutron port, or some other failure occurred. Take a look inside the DHCP namespace and see if the interface name corresponds with this one (the 10-char ID would be the same). Perform a ‘neutron port-list | grep <10-char id>’ and see what is returned. If a port is returned, do a ‘neutron port-show <full ID>’ and see if the state is ‘binding failed’. If so, there may be some misconfiguration keeping the DHCP port from being created. You can try unscheduling the network from the DHCP agent, deleting the port, and rescheduling the network to see if that fixes things. If that port *doesn’t* correspond to the DHCP namespace I would be highly surprised.
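Roughly the sequence I have in mind, with the <...> values as placeholders you’d fill in from your environment:

```shell
# 1. Does the DHCP namespace contain an interface matching tap468d3ee4-c0?
ip netns | grep qdhcp
ip netns exec qdhcp-<network-id> ip addr

# 2. Look the port up by its 10-char ID and inspect the binding
neutron port-list | grep 468d3ee4-c0
neutron port-show <full-port-id>    # check binding:vif_type for binding_failed

# 3. If the binding failed, bounce the network off the DHCP agent
neutron dhcp-agent-network-remove <dhcp-agent-id> <network-id>
neutron port-delete <full-port-id>
neutron dhcp-agent-network-add <dhcp-agent-id> <network-id>
```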

For starters, try creating another GRE network, boot some instances, and see if the same failure occurs. Use ‘ovs-vsctl show’ to see if a new port was added on the controller/network node and if so, what the tag is.
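A quick way to spot stuck ports is to scan the ‘ovs-vsctl show’ output for the dead tag. Here’s a small sketch using a canned snippet in place of the live command, so you can see what the filter does:

```shell
# Stand-in for real 'ovs-vsctl show' output on a node
sample='    Port "tap468d3ee4-c0"
        tag: 4095
    Port "qr-a81f0614-0e"
        tag: 2'

# Print any port whose tag is 4095, the "dead" VLAN the OVS agent
# assigns when it cannot bind the port to a Neutron port
dead=$(printf '%s\n' "$sample" | awk '
  /Port/      { port = $2 }
  /tag: 4095/ { print port }')
echo "$dead"
# → "tap468d3ee4-c0"
```

On a real node you would pipe ‘ovs-vsctl show’ into the same awk filter instead of the canned sample.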

James

> On Nov 20, 2015, at 11:25 AM, Tyler Couto <tcouto at certain.com> wrote:
> 
> One more hint: The problem seems to be on the controller side. I added the
> compute role to the controller/network node and booted a VM on it, and I
> still can’t get an IP address from the DHCP server.
> Andreas, in regards to your comment about qrxxx and qgxxx (I think you
> mean qgxxx, because I don’t have a port named qqxxx), this was setup by
> the openstack software, and I did not modify it. Furthermore, since these
> bridges all have patches between them, shouldn’t they really act like a
> single bridge? I assumed the separation was just for ease of understanding
> and better organization.
> I also wasn’t able to get vnc working on the compute host. I assume this
> is probably because the VM doesn’t have an IP address other than the
> loopback.
> 
> Tyler
> 
> 
> 
> On 11/18/15, 3:56 PM, "Tyler Couto" <tcouto at certain.com> wrote:
> 
>> Ok, we’ve figured out that my VM is not getting an ip address. Here’s the
>> dhcp part of the console.log:
>> Starting network...
>> udhcpc (v1.20.1) started
>> Sending discover...
>> Sending discover...
>> Sending discover...
>> Usage: /sbin/cirros-dhcpc <up|down>
>> No lease, failing
>> 
>> 
>> I looked at the dnsmasq logs after a VM reboot, and I also straced the
>> dnsmasq process during a VM reboot. Both show that dnsmasq isn’t doing
>> anything when I reboot the machine. It should be giving out an ip address
>> to my VM right?
>> 
>> I’ve read that GRE doesn’t work on kernels below 3.11, and I’m running
>> CentOS 7 with 3.10, but I’ve also read otherwise.
>> 
>> I’m trying to see if this is a problem with the GRE tunnel, but I’m
>> getting very confusing results. I’ll try to explain it. I have four
>> tcpdumps running.
>> On the compute node I have the following:
>> 1. tcpdump -i br-int
>> 2. tcpdump -i br-tun
>> 3. tcpdump -i gre-mirror1 # <— This is a mirror of the gre port on br-tun
>> 
>> On the controller/network node I have the following:
>> 1. tcpdump -i gre-mirror2 # <— Also a gre port mirror on br-tun of
>> controller node
>> 
>> I’ve done a few things with this setup. I’ll try to explain a couple of
>> them and tell you where I see traffic.
>> 1. ping -I br-tun 192.168.1.1 # <— It shouldn’t matter where I send it
>> right?
>> - - I see identical ARP traffic on br-tun and gre-mirror1 (compute node),
>> but no traffic on br-int and gre-mirror2
>> - - 15:03:49.994644 ARP, Request who-has 192.168.1.1 tell
>> openstack102.example.com, length 28
>> 2. nova reboot demo-instance1
>> - - I see identical BOOTPC/BOOTPS traffic on br-int and gre-mirror2
>> (controller/network node), but no traffic on br-tun or gre-mirror1
>> - - 15:26:06.583855 IP 0.0.0.0.bootpc > 255.255.255.255.bootps:
>> BOOTP/DHCP, Request from fa:16:3e:1d:9a:9d (oui Unknown), length 290
>> 
>> The first test suggests that the gre tunnel is broken, and there’s
>> something wrong with the patch between br-tun and br-int.
>> The second test seems to show that the gre tunnel is working well.
>> 
>> What am I missing here? Is something terribly wrong with this test?
>> 
>> Thanks,
>> Tyler
>> 
>> On 11/17/15, 12:58 PM, "James Denton" <james.denton at rackspace.com> wrote:
>> 
>>> Hi Tyler,
>>> 
>>> You might try verifying that the instance properly received its IP
>>> address. You can try using ‘nova console-log <id>’ to view the console
>>> log of the instance. Look for the cloud-init info. Also, take a look at
>>> the syslog of the network node to see if the DHCP request made it and was
>>> acknowledged. If it looks like it got its IP, try hitting the instance
>>> from within the DHCP or router namespace to see if you can hit the fixed
>>> IP from something in the same network before trying to hit the floating
>>> IP. You may also want to run some packet captures on the respective qbr
>>> bridge and physical interfaces while doing these tests to see if/where
>>> traffic is getting dropped.
>>> 
>>> James
>>> 
>>>> On Nov 17, 2015, at 11:31 AM, Tyler Couto <tcouto at certain.com> wrote:
>>>> 
>>>> Thanks Andreas. My security groups do allow icmp traffic.
>>>> 
>>>> +---------+----------------------------------------------------------------------+
>>>> | name    | security_group_rules                                                 |
>>>> +---------+----------------------------------------------------------------------+
>>>> | default | egress, IPv4                                                         |
>>>> |         | egress, IPv6                                                         |
>>>> |         | ingress, IPv4, 22/tcp, remote_ip_prefix: 0.0.0.0/0                   |
>>>> |         | ingress, IPv4, icmp, remote_ip_prefix: 0.0.0.0/0                     |
>>>> |         | ingress, IPv4, remote_group_id: d404679b-aeed-4d2f-bea9-2c7d19ff3fb1 |
>>>> |         | ingress, IPv6, remote_group_id: d404679b-aeed-4d2f-bea9-2c7d19ff3fb1 |
>>>> +---------+----------------------------------------------------------------------+
>>>> 
>>>> I can’t access my VM’s console, so I do not know whether I can ping
>>>> from my VM. I figured this might be a related issue. I receive this
>>>> error when trying to access the noVNC console:
>>>> Failed to connect to server (code: 1006)
>>>> 
>>>> 
>>>> This is a two node setup. I have one controller/neutron-network node.
>>>> Here’s the output of ‘ovs-vsctl show’:
>>>> 
>>>> Bridge br-tun
>>>> fail_mode: secure
>>>>   Port "gre-ac10183d"
>>>>     Interface "gre-ac10183d"
>>>>       type: gre
>>>>       options: {df_default="true", in_key=flow,
>>>> local_ip="172.16.24.60",
>>>> out_key=flow, remote_ip="172.16.24.61"}
>>>>   Port gre-mirror
>>>>     Interface gre-mirror
>>>>   Port br-tun
>>>>     Interface br-tun
>>>>       type: internal
>>>>   Port patch-int
>>>>     Interface patch-int
>>>>       type: patch
>>>>       options: {peer=patch-tun}
>>>> Bridge br-ex
>>>>   Port "enp4s0f0"
>>>>     Interface "enp4s0f0"
>>>>   Port phy-br-ex
>>>>     Interface phy-br-ex
>>>>       type: patch
>>>>       options: {peer=int-br-ex}
>>>>   Port br-ex
>>>>     Interface br-ex
>>>>       type: internal
>>>>   Port "enp4s0f1"
>>>>     Interface "enp4s0f1"
>>>> Bridge br-int
>>>> fail_mode: secure
>>>>   Port "qr-a81f0614-0e"
>>>>     tag: 2
>>>>     Interface "qr-a81f0614-0e"
>>>>       type: internal
>>>>   Port "qg-289ea4d2-29"
>>>>     tag: 5
>>>>     Interface "qg-289ea4d2-29"
>>>>       type: internal
>>>>   Port br-int
>>>>     Interface br-int
>>>>       type: internal
>>>>   Port patch-tun
>>>>     Interface patch-tun
>>>>       type: patch
>>>>       options: {peer=patch-int}
>>>>   Port int-br-ex
>>>>     Interface int-br-ex
>>>>       type: patch
>>>>       options: {peer=phy-br-ex}
>>>>   Port "tap468d3ee4-c0"
>>>>     tag: 4095
>>>>     Interface "tap468d3ee4-c0"
>>>>       type: internal
>>>>   ovs_version: "2.3.1"
>>>> 
>>>> 
>>>> I have one compute node. Here’s the output of ‘ovs-vsctl show’:
>>>> 
>>>> Bridge br-int
>>>> fail_mode: secure
>>>>   Port "qvoc6d01e4b-1d"
>>>>     tag: 1
>>>>     Interface "qvoc6d01e4b-1d"
>>>>   Port br-int
>>>>     Interface br-int
>>>>       type: internal
>>>>   Port patch-tun
>>>>     Interface patch-tun
>>>>       type: patch
>>>>       options: {peer=patch-int}
>>>> Bridge br-tun
>>>> fail_mode: secure
>>>>   Port br-tun
>>>>     Interface br-tun
>>>>       type: internal
>>>>   Port patch-int
>>>>     Interface patch-int
>>>>       type: patch
>>>>       options: {peer=patch-tun}
>>>>   Port "gre-ac10183c"
>>>>     Interface "gre-ac10183c"
>>>>       type: gre
>>>>       options: {df_default="true", in_key=flow,
>>>> local_ip="172.16.24.61",
>>>> out_key=flow, remote_ip="172.16.24.60"}
>>>>   Port gre-mirror
>>>>     Interface gre-mirror
>>>>   Port "tap0"
>>>>     Interface "tap0"
>>>>   ovs_version: "2.3.1"
>>>> 
>>>> 
>>>> I also have a laptop on the same network as the openstack machines. I
>>>> can
>>>> successfully ping the interface of the neutron router from my laptop.
>>>> 
>>>> As far as the physical interfaces, I am only using one physical
>>>> interface
>>>> on each openstack machine. I know this is not the recommended setup,
>>>> but
>>>> since this is only a POC, I wanted to keep it simple.
>>>> 
>>>> -Tyler
>>>> 
>>>> 
>>>> 
>>>> On 11/17/15, 12:48 AM, "Andreas Scheuring"
>>>> <scheuran at linux.vnet.ibm.com>
>>>> wrote:
>>>> 
>>>>> Please check your Security Groups first.
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Mailing list:
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to     : openstack at lists.openstack.org
>>>> Unsubscribe :
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> 
> 
