[Openstack-operators] Floating IPs failing in dvr_snat mode with Mitaka
Jonathan Mills
jonmills at gmail.com
Wed Aug 10 19:18:32 UTC 2016
Hi all,
I’m running Mitaka on CentOS 7.2 with Neutron in dvr_snat mode.
# uname -msr
Linux 3.10.0-327.22.2.el7.x86_64 x86_64
I’m using vlans, not vxlans, but I don’t think that matters either way. So
basically, I have one NIC “eth2” which is in vlan trunk mode, and on my
switch side, I have every neutron-defined vlan trunked there. Whether it’s
a tenant network vlan, or an external vlan for floating IPs, it all comes
back to that same NIC.
So here’s a compute node “node1”. It has a successfully booted VM, which
has fixed IP 10.97.8.103 and floating IP 10.96.8.107. As seen from the
compute node:
# ip netns
fip-cbe55dc5-c4e4-4ec0-aa52-b4713f1279ee
qrouter-efc60192-97ad-49ef-bab7-cda42ca6bc29
snat-efc60192-97ad-49ef-bab7-cda42ca6bc29
# ip netns exec fip-cbe55dc5-c4e4-4ec0-aa52-b4713f1279ee ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: fpr-efc60192-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 32:06:67:df:53:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.109.47/31 scope global fpr-efc60192-9
       valid_lft forever preferred_lft forever
    inet6 fe80::3006:67ff:fedf:53c6/64 scope link
       valid_lft forever preferred_lft forever
19: fg-152dc56a-c1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:40:9f:5b brd ff:ff:ff:ff:ff:ff
    inet 10.96.8.101/23 brd 10.96.9.255 scope global fg-152dc56a-c1
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe40:9f5b/64 scope link
       valid_lft forever preferred_lft forever
# ip netns exec qrouter-efc60192-97ad-49ef-bab7-cda42ca6bc29 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: rfp-efc60192-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 72:49:e7:78:48:5d brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.109.46/31 scope global rfp-efc60192-9
       valid_lft forever preferred_lft forever
    inet 10.96.8.107/32 brd 10.96.8.107 scope global rfp-efc60192-9
       valid_lft forever preferred_lft forever
    inet6 fe80::7049:e7ff:fe78:485d/64 scope link
       valid_lft forever preferred_lft forever
17: qr-ffc302ba-82: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:8d:7c:62 brd ff:ff:ff:ff:ff:ff
    inet 10.97.8.1/23 brd 10.97.9.255 scope global qr-ffc302ba-82
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe8d:7c62/64 scope link
       valid_lft forever preferred_lft forever
So you can see that I have both the fip and qrouter namespaces, joined by
the ‘fpr’/‘rfp’ veth pair, which is a good indicator I didn’t totally flub
the dvr_snat neutron config. From within either namespace, I can ping the
floating IP 10.96.8.107, which makes sense. However, for the floating IP to
be useful, it needs to be reachable by any other system on its designated
vlan, and that is not the case. In my real-world use case, I would be
extending the vlan of this floating IP network over to my bastion host, to
allow users to ssh into their VMs via the floating IP. But I can’t reach
the floating IPs from anywhere outside those namespaces on the compute node.
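To make the addressing in the two listings easier to follow, here is a small Python sketch that only restates the numbers shown above; the variable names are made up for illustration. It checks that the floating IP falls inside the /23 configured on the fg interface, which would mean other hosts on that vlan try to ARP for 10.96.8.107 directly rather than going through a gateway. As far as I understand DVR, the fip namespace is then expected to answer those ARP requests on behalf of the address that actually sits on rfp in the qrouter namespace, but treat that part as an assumption rather than something the output proves.

```python
import ipaddress

# Addresses copied from the `ip addr` output above; names are illustrative only.
floating_ip = ipaddress.ip_address("10.96.8.107")   # /32 on rfp-efc60192-9 (qrouter namespace)
fixed_ip = ipaddress.ip_address("10.97.8.103")      # the VM's fixed IP
fg = ipaddress.ip_interface("10.96.8.101/23")       # fg-152dc56a-c1 (fip namespace)
qr = ipaddress.ip_interface("10.97.8.1/23")         # qr-ffc302ba-82 (qrouter namespace)
fpr = ipaddress.ip_interface("169.254.109.47/31")   # fip side of the veth pair
rfp = ipaddress.ip_interface("169.254.109.46/31")   # qrouter side of the veth pair

# The floating IP is on-link for anything attached to the external vlan.
floating_ip_on_external_subnet = floating_ip in fg.network

# The VM's fixed IP is on the tenant subnet served by the qr interface.
fixed_ip_on_tenant_subnet = fixed_ip in qr.network

# fpr and rfp share one point-to-point /31, so they can route to each other.
veth_ends_share_link = fpr.network == rfp.network

print("floating IP inside fg subnet:", floating_ip_on_external_subnet)
print("fixed IP inside qr subnet:   ", fixed_ip_on_tenant_subnet)
print("fpr/rfp on the same /31:     ", veth_ends_share_link)
```

All three checks hold for the addresses above, so the layout itself looks consistent; the open question is why nothing answers for 10.96.8.107 on the external vlan.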
One more clue, in the l3-agent log on the compute node in question:
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 10.96.8.107 on fg-152dc56a-c1 in namespace fip-cbe55dc5-c4e4-4ec0-aa52-b4713f1279ee
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib Traceback (most recent call last):
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 1040, in _arping
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib     ip_wrapper.netns.execute(arping_cmd, check_exit_code=True)
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 927, in execute
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib     log_fail_as_error=log_fail_as_error, **kwargs)
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 140, in execute
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib     raise RuntimeError(msg)
2016-08-03 11:14:09.665 6041 ERROR neutron.agent.linux.ip_lib RuntimeError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
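For readers unfamiliar with the message: "Cannot assign requested address" is the text for errno EADDRNOTAVAIL, which Linux returns when a process tries to bind a socket to an IP address that is not configured in its network namespace. Here 10.96.8.107 is configured on rfp in the qrouter namespace, while the arping runs in the fip namespace. The sketch below is not Neutron's or arping's code; it is a minimal, hypothetical reproduction of the same errno using an ordinary UDP socket. It assumes 192.0.2.1 (a documentation address from RFC 5737) is not configured on the machine and that net.ipv4.ip_nonlocal_bind is at its default of 0; with that sysctl set to 1 the bind would normally succeed. Whether that sysctl is what is going wrong in this setup is an assumption, not something the log proves.

```python
import errno
import socket


def try_bind(address):
    """Return None if a UDP socket can bind to `address`, else the errno."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, 0))  # port 0 lets the kernel pick any free port
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


# 192.0.2.1 is assumed NOT to be assigned to any local interface.
result = try_bind("192.0.2.1")
if result == errno.EADDRNOTAVAIL:
    print("bind: Cannot assign requested address")
elif result is None:
    print("bind succeeded (address is local, or non-local bind is allowed)")
else:
    print("bind failed with errno", result)
```

Running the same kind of check inside the fip namespace against 10.96.8.107 would show whether non-local binds are permitted there.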
After a little Googling, I think I may be seeing the same behavior as this
user:
https://bugs.centos.org/view.php?id=11238
I’m reaching out to see if anyone else has witnessed this, or has any sage
advice for me.
Jonathan