[Openstack-operators] Multiple floating IPs mapped to multiple vNICs (multi-homing)

Paul Browne pfb29 at cam.ac.uk
Thu Dec 1 15:23:41 UTC 2016


On 01/12/16 14:35, Saverio Proto wrote:

> Your policy routing looks good.
> The problem must be somewhere else, where you do the nat maybe ?
>
> Go in the network namespace where there is the neutron router with
> address 10.0.16.1
>
> If you tcpdump there what do you see ?
>
> to be 100% sure about the policy routing just go in the network node
> where you do the nat.
>
> ip netns exec qrouter-<uuid> wget -O /dev/null http://10.0.16.11/
>
> uuid is the uuid of the neutron router where you are natting
>
> I guess this will work.

Yes, this does seem to work as expected, in both namespaces;

# Determine the controller hosting the router for 10.0.0.11
[stack@osp-director-prod ~]$ neutron l3-agent-list-hosting-router 5f9d983c-3b51-4411-b921-1a523652d55f
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 37051003-636f-4fb7-b6d9-1ff9d9182e9d | clc-sby4f-n3.mgt.cluster | True           | :-)   | standby  |
| b14fcc29-67c0-4420-8abf-433dafde980d | clc-rb15-n1.mgt.cluster  | True           | :-)   | standby  |
| 37518a78-4b39-463a-8387-2866989bba06 | clc-ra15-n2.mgt.cluster  | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

# Change into the namespace and test the 10.0.0.11 web-server
[root@clc-ra15-n2 ~]# ip netns exec qrouter-5f9d983c-3b51-4411-b921-1a523652d55f wget -O /tmp/test http://10.0.0.11/; head -n 10 /tmp/test
--2016-12-01 14:54:09-- http://10.0.0.11/
Connecting to 10.0.0.11:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10701 (10K) [text/html]
Saving to: ‘/tmp/test’
2016-12-01 14:54:09 (318 MB/s) - ‘/tmp/test’ saved [10701/10701]


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
     <title>Apache2 Debian Default Page: It works</title>
     <style type="text/css" media="screen">
   * {
     margin: 0px 0px 0px 0px;
     padding: 0px 0px 0px 0px;


# Determine the controller hosting the router for 10.0.16.11
[stack@osp-director-prod ~]$ neutron l3-agent-list-hosting-router 4fcdbc75-f53b-4833-902f-689258cd82ea
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 37051003-636f-4fb7-b6d9-1ff9d9182e9d | clc-sby4f-n3.mgt.cluster | True           | :-)   | standby  |
| b14fcc29-67c0-4420-8abf-433dafde980d | clc-rb15-n1.mgt.cluster  | True           | :-)   | standby  |
| 37518a78-4b39-463a-8387-2866989bba06 | clc-ra15-n2.mgt.cluster  | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

# Change into the namespace and test the 10.0.16.11 web-server
[root@clc-ra15-n2 ~]# ip netns exec qrouter-4fcdbc75-f53b-4833-902f-689258cd82ea wget -O /tmp/test http://10.0.16.11/; head -n 10 /tmp/test
--2016-12-01 14:51:41-- http://10.0.16.11/
Connecting to 10.0.16.11:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10701 (10K) [text/html]
Saving to: ‘/tmp/test’
2016-12-01 14:51:41 (317 MB/s) - ‘/tmp/test’ saved [10701/10701]


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
     <title>Apache2 Debian Default Page: It works</title>
     <style type="text/css" media="screen">
   * {
     margin: 0px 0px 0px 0px;
     padding: 0px 0px 0px 0px;



> Oh, did you double check the security groups ?

Oh yes, I've fallen for that one too many times! Both Neutron ports 
share the same security groups, which allow SSH (TCP 22), ICMP and HTTP 
(TCP 80).




I think I may now have seen where I was going wrong, though. It's a 
peculiarity of our Red Hat OSP8 (so, Liberty) setup that we run VXLANs 
with a network_device_mtu of 1400, as the physical network MTU is set to 
1500.

We should probably have raised the physical network MTU to 1600 before 
deployment, allowing naive selection of a 1500 MTU by VXLAN-connected 
vNICs, and may do so in future.

In this case:

* The first vNIC was getting a dnsmasq-provided IP from the DHCP agent, and 
so getting the 1400 MTU.
* The second vNIC was instead taking a statically configured IP and so 
using the naive MTU of 1500, too large for the physical network once 
VXLAN overhead is added on top.
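
As a rough sketch of the arithmetic behind those two cases: assuming the 
usual 50 bytes of VXLAN-over-IPv4 encapsulation overhead (14 inner 
Ethernet + 8 VXLAN + 8 UDP + 20 outer IPv4; a standard figure, not 
something measured on this deployment), the two vNIC MTUs compare against 
the 1500-byte physical MTU as follows. The name outer_size is just an 
illustrative helper.

```shell
# Sketch only: header sizes are the standard VXLAN-over-IPv4 values,
# assumed here rather than measured on this deployment.
phys_mtu=1500
vxlan_overhead=$((14 + 8 + 8 + 20))   # inner Ethernet + VXLAN + UDP + outer IPv4

# Size of the encapsulated packet on the physical network for a given vNIC MTU
outer_size() {
    echo $(( $1 + vxlan_overhead ))
}

for vnic_mtu in 1400 1500; do
    outer=$(outer_size "$vnic_mtu")
    if [ "$outer" -le "$phys_mtu" ]; then
        echo "vNIC MTU $vnic_mtu -> $outer bytes encapsulated: fits in $phys_mtu"
    else
        echo "vNIC MTU $vnic_mtu -> $outer bytes encapsulated: exceeds $phys_mtu"
    fi
done
```

Only full-sized packets would run into this limit, which would be 
consistent with small ICMP and interactive SSH packets getting through 
while the larger HTTP replies did not.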

Once I revised the second vNIC to also take a DHCP IP, but keep the 
post-up routing rules on it, the web-server responds to remote clients 
on both interfaces as expected.
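
For anyone wanting to confirm an MTU problem like this directly, one 
option (not something that was run in this thread) is a ping with the 
don't-fragment bit set, sized to fill a packet of a given MTU. The sketch 
below only prints the commands rather than running them. The payload is the 
MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header; probe_payload 
is a hypothetical helper name, and probing the gateway 10.0.16.1 from the 
instance via eth1 is an assumption about where the test would be run.

```shell
# ICMP echo payload that fills an IPv4 packet of exactly the given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
probe_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Print (do not run) don't-fragment pings for the two MTUs discussed above.
# Target address and interface are assumptions taken from the earlier
# routing tables; adjust for the path being tested.
for mtu in 1400 1500; do
    echo "ping -M do -c 3 -s $(probe_payload "$mtu") -I eth1 10.0.16.1"
done
```

If the larger probe gets no reply while the smaller one does, something on 
the path cannot carry 1500-byte packets.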

Many thanks for your help in chasing this down, it's given me some extra 
tools in the Neutron troubleshooting tool-box!

Kind regards,
Paul Browne





> Saverio
>
> 2016-12-01 15:18 GMT+01:00 Paul Browne <pfb29 at cam.ac.uk>:
>> Hello Saverio,
>>
>> Many thanks for the reply, I'll answer your queries below;
>>
>> On 01/12/16 12:49, Saverio Proto wrote:
>>> Hello,
>>>
>>> while the problem is in place, you should share the output of
>>>
>>> ip rule show
>>> ip route show table 1
>>>
>>> It could be just a problem in your ruleset
>> Of course, here are those outputs:
>>
>> root@test1:~# ip rule show
>> 0:      from all lookup local
>> 32764:  from all to 10.0.16.11 lookup rt2
>> 32765:  from 10.0.16.11 lookup rt2
>> 32766:  from all lookup main
>> 32767:  from all lookup default
>>
>> root@test1:~# ip route show table 1
>> default via 10.0.16.1 dev eth1
>> 10.0.16.0/24 dev eth1  scope link  src 10.0.16.11
>>
>>
>>> and, which one is your webserver ? can you tcpdump to make sure reply
>>> packets get out on the NIC with src address 10.0.16.11 ?
>>>
>>> Saverio
>> The instance has its two vNICs with source addresses 10.0.0.11 & 10.0.16.11,
>> and the web-server is listening on both.
>>
>> The HTTP packets do seem to be getting out from 10.0.16.11 as source, but
>> are stopped elsewhere upstream.
>>
>> I've attached two pcaps showing HTTP reply packets, one from 10.0.0.11
>> (first vNIC; HTTP request and reply works to a remote client) and one from
>> 10.0.16.11 (second vNIC; HTTP request is sent, reply not received by remote
>> client). In the latter case, the server starts to make retransmissions to
>> the remote client.
>>
>> Kind regards,
>> Paul Browne
>>
>>
>>
>>
>>> 2016-12-01 13:08 GMT+01:00 Paul Browne <pfb29 at cam.ac.uk>:
>>>> Hello Operators,
>>>>
>>>> For reasons not yet amenable to persuasion otherwise, a customer of our
>>>> ML2+OVS classic implemented OpenStack would like to map two floating IPs
>>>> pulled from two separate external network floating IP pools, to two
>>>> different vNICs on his instances.
>>>>
>>>> The floating IP pools correspond to one pool routable from the external
>>>> Internet and another, RFC1918 pool routable from internal University
>>>> networks.
>>>>
>>>> The tenant private networks are arranged as two RFC1918 VXLANs, each with
>>>> a router to one of the two external networks.
>>>>
>>>> 10.0.0.0/24 -> route to -> 128.232.226.0/23
>>>>
>>>> 10.0.16.0/24 -> route to -> 172.24.46.0/23
>>>>
>>>>
>>>> Mapping two floating IPs to instances isn't possible in Horizon, but is
>>>> possible from the command line. This doesn't immediately work, however, as
>>>> the return traffic from the instance needs to be sent back through the
>>>> correct router gateway interface and not the instance default gateway.
>>>>
>>>> I'd initially thought this would be possible by placing a second routing
>>>> table on the instances to handle the return traffic;
>>>>
>>>> debian@test1:/etc/iproute2$ less rt_tables
>>>> #
>>>> # reserved values
>>>> #
>>>> 255     local
>>>> 254     main
>>>> 253     default
>>>> 0       unspec
>>>> #
>>>> # local
>>>> #
>>>> #1      inr.ruhep
>>>> 1 rt2
>>>>
>>>> debian@test1:/etc/network$ less interfaces
>>>> # The loopback network interface
>>>> auto lo
>>>> iface lo inet loopback
>>>>
>>>> # The first vNIC, eth0
>>>> auto eth0
>>>> iface eth0 inet dhcp
>>>>
>>>> # The second vNIC, eth1
>>>> auto eth1
>>>> iface eth1 inet static
>>>>           address 10.0.16.11
>>>>           netmask 255.255.255.0
>>>>           post-up ip route add 10.0.16.0/24 dev eth1 src 10.0.16.11 table rt2
>>>>           post-up ip route add default via 10.0.16.1 dev eth1 table rt2
>>>>           post-up ip rule add from 10.0.16.11/32 table rt2
>>>>           post-up ip rule add to 10.0.16.11/32 table rt2
>>>>
>>>> And this works well for SSH and ICMP, but curiously not for HTTP traffic.
>>>>
>>>>
>>>> Requests to a web-server listening on all vNICs are sent but replies not
>>>> received when the requests are sent to the second mapped floating IP
>>>> (HTTP requests and replies work as expected when sent to the first mapped
>>>> floating IP). The requests are logged in both cases however, so traffic is
>>>> making it to the instance in both cases.
>>>>
>>>> I'd say this is clearly an unusual (and possibly un-natural) arrangement,
>>>> but I was wondering whether anyone else on Operators had come across a
>>>> similar situation in trying to map floating IPs from two different
>>>> external networks to an instance?
>>>>
>>>> Kind regards,
>>>>
>>>> Paul Browne
>>>>
>>>> --
>>>> *******************
>>>> Paul Browne
>>>> Research Computing Platforms
>>>> University Information Services
>>>> Roger Needham Building
>>>> JJ Thompson Avenue
>>>> University of Cambridge
>>>> Cambridge
>>>> United Kingdom
>>>> E-Mail:pfb29 at cam.ac.uk
>>>> Tel: 0044-1223-46548
>>>> *******************
>>>>

