[openstack-dev] [octavia] enabling new topologies

Sergey Guenender sgserg at gmail.com
Sun Jun 5 15:07:44 UTC 2016


Hi Stephen, please find my reply next to your points below.

Thank you,
-Sergey.


On 01/06/2016 20:23, Stephen Balukoff wrote:
 > Hey Sergey--
 >
 > Apologies for the delay in my response. I'm still wrapping my head
 > around your option 2 suggestion and the implications it might have for
 > the code base moving forward. I think, though, that I'm against your
 > option 2 proposal and in favor of option 1 (which, yes, is more work
 > initially) for the following reasons:
 >
 > A. We have a precedent in the code tree with how the stand-alone and
 > active-standby topologies are currently being handled. Yes, this does
 > entail various conditionals and branches in tasks and flows-- which is
 > not really that ideal, as it means the controller worker needs to have
 > more specific information on how topologies work than I think any of us
 > would like, and this adds some rigidity to the implementation (meaning
 > 3rd party vendors may have more trouble interfacing at that level)...
 > but it's actually "not that bad" in many ways, especially given we don't
 > anticipate supporting a large or variable number of topologies.
 > (stand-alone, active-standby, active-active... and then what? We've been
 > doing this for a number of years and nobody has mentioned any radically
 > new topologies they would like in their load balancing. Things like
 > auto-scale are just a specific case of active-active).

Just as you say, the two existing topologies are currently handled by a 
single set of flows. Option 2 follows the same approach: instead of adding 
new flows for active-active, it suggests that minor adjustments to the 
existing flows can satisfy active-active as well.

 > B. If anything Option 2 builds more less-obvious rigidity into the
 > implementation than option 1. For example, it makes the assumption that
 > the distributor is necessarily an amphora or service VM, whereas we have
 > already heard that some will implement the distributor as a pure network
 > routing function that isn't going to be managed the same way other
 > amphorae are.

This is a good point. Looking at the code, I see comments mentioning the 
intent to share an amphora between several load balancers. Although 
probably not straightforward to implement, that might be a good idea one 
day; the fact remains, though, that amphorae have not been shared between 
load balancers for a few years now.

Personally, when developing something complex, I believe in taking baby 
steps. If the virtual, non-shared distributor (which is promised by the 
AA blueprint anyway) is the smallest step towards a working 
active-active, then I think it should be considered as the first step to 
take.

Unless, of course, it precludes implementing later, more complex 
topologies.

My belief is that it doesn't have to. The proposed change (splitting the 
amphorae into sub-clusters for the many existing for-loops to iterate 
over) doesn't force any particular direction on its own. Any future 
topology may leave its "front-facing amphorae" set equal to its 
"back-facing amphorae" set, which brings it back to the way the for-loops 
are handled today.
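
To make this concrete, here's a rough sketch of the kind of properties I 
have in mind (the class layout and names are illustrative only, not the 
actual octavia.common.data_models code):

    # Illustrative sketch only -- not the real data model.  For stand-alone
    # and active-standby all three properties return the full amphorae list,
    # so the existing for-loops behave exactly as they do today; an
    # active-active load balancer would narrow each set appropriately.
    class LoadBalancer(object):
        def __init__(self, amphorae=None):
            self.amphorae = amphorae or []

        @property
        def front_facing_amphorae(self):
            # Amphorae plugged into the VIP network.
            return list(self.amphorae)

        @property
        def back_facing_amphorae(self):
            # Amphorae also plugged into the pool members' networks.
            return list(self.amphorae)

        @property
        def vrrp_amphorae(self):
            # Amphorae running VRRP to keep VIP routing highly available.
            return list(self.amphorae)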

 > C. Option 2 seems like it's going to have a lot more permutations that
 > would need testing to ensure that code changes don't break existing /
 > potentially supported functionality. Option 1 keeps the distributor and
 > amphorae management code separate, which means tests should be more
 > straight-forward, and any breaking changes which slip through
 > potentially break less stuff. Make sense?

It certainly does.

My intent is that the simplest active-active implementation promised by 
the blueprint can be achieved with only minor changes to the existing 
code. If the required changes turn out not to be small, or if this 
simplistic approach impedes future work in some way, we can drop this 
option.


 > Stephen
 >
 >
 > On Sun, May 29, 2016 at 7:12 AM, Sergey Guenender
 > <GUENEN at il.ibm.com> wrote:
 >
 >     I'm working with the IBM team implementing the Active-Active N+1
 >     topology [1].
 >
 >     I've been commissioned with the task to help integrate the code
 >     supporting the new topology while a) making as few code changes and
 >     b) reusing as much code as possible.
 >
 >     To make sure the changes to existing code are future-proof, I'd like
 >     to implement them outside AA N+1, submit them on their own and let
 >     the AA N+1 base itself on top of it.
 >
 >     --TL;DR--
 >
 >     what follows is a description of the challenges I'm facing and the
 >     way I propose to solve them. Please skip down to the end of the
 >     email to see the actual questions.
 >
 >     --The details--
 >
 >     I've been studying the code for a few weeks now to see where the
 >     best places for minimal changes might be.
 >
 >     Currently I see two options:
 >
 >         1. introduce a new kind of entity (the distributor) and make
 >     sure it's being handled on any of the 6 levels of controller worker
 >     code (endpoint, controller worker, *_flows, *_tasks, *_driver)
 >
 >         2. leave most of the code layers intact by building on the fact
 >     that distributor will inherit most of the controller worker logic of
 >     amphora
 >
 >
 >     In Active-Active topology, very much like in Active/StandBy:
 >     * top level of distributors will have to run VRRP
 >     * the distributors will have a Neutron port made on the VIP network
 >     * the distributors' neutron ports on VIP network will need the same
 >     security groups
 >     * the amphorae facing the pool member networks still require
 >          * ports on the pool member networks
 >          * "peers" HAProxy configuration for real-time state exchange
 >          * VIP network connections with the right security groups
 >
 >     The fact that existing topologies lack the notion of distributor and
 >     inspecting the 30-or-so existing references to amphorae clusters,
 >     swayed me towards the second option.
 >
 >     The easiest way to make use of existing code seems to be by
 >     splitting load-balancer's amphorae into three overlapping sets:
 >     1. The front-facing - those connected to the VIP network
 >     2. The back-facing - subset of front-facing amphorae, also connected
 >     to the pool members' networks
 >     3. The VRRP-running - subset of front-facing amphorae, making sure
 >     the VIP routing remains highly available
 >
 >     At the code-changes level
 >     * the three sets can be simply added as properties of
 >     common.data_model.LoadBalancer
 >     * the existing amphorae cluster references would switch to using one
 >     of these properties, for example
 >          * the VRRP sub-flow would loop over only the VRRP amphorae
 >          * the network driver, when plugging the VIP, would loop over
 >     the front-facing amphorae
 >          * when connecting to the pool members' networks,
 >     network_tasks.CalculateDelta would only loop over the back-facing
 >     amphorae
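
To illustrate the kind of switch I mean here -- a purely hypothetical 
example, where plug_vip() and the surrounding helper are made up rather 
than taken from the actual network driver:

    # Hypothetical helper, not real Octavia code: before the split a loop
    # like this would iterate over loadbalancer.amphorae; afterwards it
    # only visits the amphorae that actually face the VIP network.
    def plug_vip_on_front_facing(loadbalancer, vip, plug_vip):
        for amp in loadbalancer.front_facing_amphorae:
            plug_vip(amp, vip)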
 >
 >     In terms of backwards compatibility, Active-StandBy topology would
 >     have the 3 sets equal and contain both of its amphorae.
 >
 >     An even more future-proof approach might be to implement the
 >     sets-getters as selector methods, supporting operation on subsets of
 >     each kind of amphorae. For instance when growing/shrinking
 >     back-facing amphorae cluster, only the added/removed ones will need
 >     to be processed.
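
For example (again, just a sketch with made-up names rather than real 
code), such a selector could look like this, with the caller passing only 
the IDs of the amphorae that were just added or removed:

    # Sketch of a selector-style getter: callers may restrict the operation
    # to a subset of the back-facing amphorae, e.g. only the ones freshly
    # booted while growing the cluster.  Names are illustrative only.
    class LoadBalancer(object):
        def __init__(self, amphorae=None):
            self.amphorae = amphorae or []

        def get_back_facing_amphorae(self, only_ids=None):
            amps = list(self.amphorae)   # existing topologies: the full set
            if only_ids is None:
                return amps
            return [a for a in amps if getattr(a, 'id', None) in only_ids]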
 >
 >     Finally (thank you for your patience, dear reader), my question is:
 >     if any of the above makes sense, and to facilitate the design/code
 >     review, what would be the best way to move forward?
 >
 >     Should I create a mini-blueprint describing the changes and
 >     implement it?
 >     Should I just open a bug for it and supply a fix?
 >
 >     Thanks,
 >     -Sergey.




