[Openstack-operators] OpenStack node redundancy

Dan Sneddon dsneddon at redhat.com
Tue Dec 22 01:45:40 UTC 2015


On 12/18/2015 09:16 AM, Serguei Bezverkhi (sbezverk) wrote:
> Hello team,
>
> I would appreciate it if you could share the approach you use for
> OpenStack node redundancy from a network failure perspective. For
> example, if an upstream switch fails and a node does not have a
> redundant link to a second upstream switch, the node gets isolated. I
> understand that OpenStack will deal with this situation by removing
> the node, but in my case there is a strict requirement to survive a
> single switch failure.
>
> Here is the solution I came up with. It is nothing new, but I would
> appreciate your critique, comments, and suggestions.
>
> Each redundant OpenStack node is connected to two upstream switches;
> the interfaces are teamed into a group and bound to the respective
> OpenStack bridges. So br-int would have the team interface team0 as
> its physical link, not eth0 or eth1. In this case, if one of the
> upstream links fails, connectivity is preserved.
>
> Thank you
>
> Serguei

The issue is: how do you connect a bond to two different switches? Not
all bonding modes work when the bond members are connected to different
switches. There are three approaches that I have used to create network
link redundancy across switches:

1) Virtual-chassis clustering of the switches, with Ethernet bonds
split between the physical chassis. In this case, it is functionally
equivalent to using a single switch. LACP may be used with either the
Linux bonding module or OVS. I used to recommend using OVS with LACP
bonds, but OVS seems to drop ARP and multicast packets under load. Now
I recommend Linux bonding in conjunction with LACP on the switches.

2) Use OVS bonding in "balance-slb" mode. In this mode, OVS assigns
each source MAC address and VLAN pair to a single bond member, so
outbound traffic from a given source on a given VLAN uses only one
member of the bond at a time. This mode does not require any
cooperation from the switches. The assignments are rebalanced at the
interval set by other_config:bond-rebalance-interval (in milliseconds;
set to zero to disable rebalancing).

3) Use active-backup bonds with the Linux bonding module or OVS. This
won't give you load-balancing, but this is probably the most robust
solution if uptime is more important than throughput.
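
To make the three options above concrete, here is a minimal Python
sketch that only assembles the relevant settings as strings; it does
not touch the system. The bridge, bond, and NIC names (br-ex, bond0,
eth0, eth1), the miimon and lacp_rate values, and both helper functions
are illustrative assumptions, not taken from a real deployment.

```python
# Sketch only: builds option strings and command lines, runs nothing.
# All device names and timer values below are hypothetical.

OVS_BOND_MODES = ("active-backup", "balance-slb", "balance-tcp")


def linux_bonding_options(mode):
    """Return a Linux bonding-module option string for the given mode."""
    if mode == "802.3ad":
        # Option 1: LACP; the switches must present a single LACP partner
        # (for example via virtual-chassis clustering).
        return "mode=802.3ad miimon=100 lacp_rate=fast"
    if mode == "active-backup":
        # Option 3: one active member, no switch-side configuration.
        return "mode=active-backup miimon=100"
    raise ValueError("mode not covered by this sketch: %s" % mode)


def ovs_bond_command(bridge, bond, nics, bond_mode, rebalance_ms=None):
    """Return an ovs-vsctl argument list that creates an OVS bond."""
    if bond_mode not in OVS_BOND_MODES:
        raise ValueError("unknown OVS bond_mode: %s" % bond_mode)
    cmd = ["ovs-vsctl", "add-bond", bridge, bond] + list(nics)
    cmd += ["--", "set", "port", bond, "bond_mode=%s" % bond_mode]
    if rebalance_ms is not None:
        # Option 2: 0 disables rebalancing; the value is in milliseconds.
        cmd.append("other_config:bond-rebalance-interval=%d" % rebalance_ms)
    return cmd


if __name__ == "__main__":
    print(linux_bonding_options("802.3ad"))
    print(" ".join(ovs_bond_command("br-ex", "bond0", ["eth0", "eth1"],
                                    "balance-slb", rebalance_ms=0)))
    print(" ".join(ovs_bond_command("br-ex", "bond0", ["eth0", "eth1"],
                                    "active-backup")))
```

The argument list could be passed to subprocess.run() on a test host,
but it is worth checking the output against the ovs-vsctl and bonding
documentation for your versions first.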

When TripleO configures bonded interfaces, the os-net-config [1]
utility is used to create bonding configuration files. The bond types
currently supported are OVS bonds and Linux bonds, but I am going to be
adding NIC teaming support in the near future. I think that NIC teaming
will have some advantages over bonding, but I haven't heard any reports
of anyone using NIC teaming in production on a large scale yet. If you
end up going the teaming route, can you post your final configuration
back here for the benefit of the group?

[1] - https://github.com/openstack/os-net-config
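
As a rough illustration of what an os-net-config input for a Linux bond
might look like, the Python sketch below builds the structure and
prints it as JSON. The field names (network_config, ovs_bridge,
linux_bond, bonding_options, members) follow the sample files in the
repository as I understand them, so treat them as assumptions and check
them against the samples shipped with your version. The device names
and the helper function are hypothetical.

```python
import json


def bonded_bridge_config(bridge, bond, nics, bonding_options):
    """Build an os-net-config style dict: an OVS bridge over a Linux bond.

    Field names are assumed from the project's sample configs; verify
    them before use.
    """
    return {
        "network_config": [
            {
                "type": "ovs_bridge",
                "name": bridge,
                "members": [
                    {
                        "type": "linux_bond",
                        "name": bond,
                        "bonding_options": bonding_options,
                        # One interface entry per physical NIC in the bond.
                        "members": [
                            {"type": "interface", "name": nic}
                            for nic in nics
                        ],
                    }
                ],
            }
        ]
    }


if __name__ == "__main__":
    config = bonded_bridge_config(
        "br-ex", "bond0", ["eth0", "eth1"],
        "mode=active-backup miimon=100",
    )
    print(json.dumps(config, indent=2))
```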

-- 
Dan Sneddon         |  Principal OpenStack Engineer
dsneddon at redhat.com |  redhat.com/openstack
650.254.4025        |  dsneddon:irc   @dxs:twitter


