[openstack-dev] [octavia] enabling new topologies

Michael Johnson johnsomor at gmail.com
Thu Jun 2 20:06:53 UTC 2016


Hi Sergey,  Welcome to working on Octavia!

I'm not sure I fully understand your proposals, but I can give my
thoughts/opinion on the challenge for Active/Active.

In general I agree with Stephen.

The intention of using TaskFlow is to facilitate code reuse across
similar but different code flows.

For an Active/Active provisioning request I envision a new flow being
loaded, as opposed to the current standalone and Active/Standby flows.  I
would expect it to include many of the existing tasks (for example,
plug_network) that may be required for the requested action.  This new
flow will likely include a number of concurrent sub-flows built from these
existing tasks.
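
As a rough sketch of what I mean (the task and flow names below are
placeholders, not Octavia's actual classes), TaskFlow lets the new
Active/Active flow nest per-amphora work inside an unordered flow so it
can run concurrently:

    # Placeholder names only; this sketches flow composition, not real code.
    from taskflow import engines, task
    from taskflow.patterns import linear_flow, unordered_flow

    class PlugNetwork(task.Task):
        """Stand-in for an existing reusable task such as plug_network."""
        def execute(self, amphora_id):
            print('plugging networks for %s' % amphora_id)

    def make_create_active_active_flow(amphora_ids):
        flow = linear_flow.Flow('create-active-active')
        # Per-amphora sub-flows run concurrently inside an unordered flow.
        parallel = unordered_flow.Flow('plug-amphora-networks')
        for amp_id in amphora_ids:
            parallel.add(PlugNetwork(name='plug-net-%s' % amp_id,
                                     inject={'amphora_id': amp_id}))
        flow.add(parallel)
        # ... distributor provisioning, VIP plugging, etc. would follow.
        return flow

    engines.run(make_create_active_active_flow(['amp-1', 'amp-2', 'amp-3']),
                engine='parallel')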

I do expect that the "distributor" will need to be a new "element".
Because the various stakeholders are considering implementing this
function in different ways, we agreed that an API and driver would be
developed for interactions with the distributor.  This should also
take into account that there may be some deployments where
distributors are not shared.
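
To make the driver idea a bit more concrete, here is a minimal sketch of
what such an interface could look like; the class and method names are my
own assumptions for illustration, not anything agreed in the spec:

    # Assumed names only -- not the agreed distributor API.
    import abc

    import six

    @six.add_metaclass(abc.ABCMeta)
    class DistributorDriverBase(object):

        @abc.abstractmethod
        def create_distributor(self, load_balancer):
            """Provision, or attach to, a distributor for this load
            balancer.  May be a no-op where distributors are shared or
            externally managed."""

        @abc.abstractmethod
        def register_amphorae(self, distributor_id, amphorae):
            """Tell the distributor which amphorae to spread traffic
            across."""

        @abc.abstractmethod
        def delete_distributor(self, distributor_id):
            """Tear the distributor down, unless other load balancers
            still share it."""

With something like this in place, an appliance-style distributor (managed
much like an amphora) and a pure network-routing implementation can both
sit behind the same controller worker flows.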

I still need to review the latest version of the Act/Act spec to
understand where that was left after my first round of comments and
our mid-cycle discussions.

Michael


On Wed, Jun 1, 2016 at 10:23 AM, Stephen Balukoff <stephen at balukoff.com> wrote:
> Hey Sergey--
>
> Apologies for the delay in my response. I'm still wrapping my head around
> your option 2 suggestion and the implications it might have for the code
> base moving forward. I think, though, that I'm against your option 2
> proposal and in favor of option 1 (which, yes, is more work initially) for
> the following reasons:
>
> A. We have a precedent in the code tree with how the stand-alone and
> active-standby topologies are currently being handled. Yes, this does entail
> various conditionals and branches in tasks and flows-- which is not really
> that ideal, as it means the controller worker needs to have more specific
> information on how topologies work than I think any of us would like, and
> this adds some rigidity to the implementation (meaning 3rd party vendors may
> have more trouble interfacing at that level)...  but it's actually "not that
> bad" in many ways, especially given we don't anticipate supporting a large
> or variable number of topologies. (stand-alone, active-standby,
> active-active... and then what? We've been doing this for a number of years
> and nobody has mentioned any radically new topologies they would like in
> their load balancing. Things like auto-scale are just a specific case of
> active-active).
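>
> As a rough illustration of the branching I mean (all names below are
> invented, not the actual Octavia flow-factory code), today's pattern
> looks roughly like this, and active-active would simply become a third
> branch:
>
>     # Invented names; this only sketches the per-topology branching.
>     from taskflow.patterns import linear_flow
>
>     def _amphora_subflow(index):
>         # Stand-in for the real per-amphora creation sub-flow.
>         return linear_flow.Flow('create-amphora-%d' % index)
>
>     def get_create_lb_flow(topology):
>         flow = linear_flow.Flow('create-loadbalancer')
>         if topology == 'SINGLE':
>             flow.add(_amphora_subflow(0))
>         elif topology == 'ACTIVE_STANDBY':
>             flow.add(_amphora_subflow(0), _amphora_subflow(1))
>             flow.add(linear_flow.Flow('configure-vrrp'))
>         # An ACTIVE_ACTIVE branch would slot in here as the third case.
>         return flow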
>
> B. If anything, Option 2 builds more (and less obvious) rigidity into the
> implementation than option 1. For example, it assumes that the
> distributor is necessarily an amphora or service VM, whereas we have already
> heard that some will implement the distributor as a pure network routing
> function that isn't going to be managed the same way other amphorae are.
>
> C. Option 2 seems like it's going to have a lot more permutations that would
> need testing to ensure that code changes don't break existing / potentially
> supported functionality. Option 1 keeps the distributor and amphorae
> management code separate, which means tests should be more straightforward,
> and any breaking changes that slip through will potentially break fewer things.
> Make sense?
>
> Stephen
>
>
> On Sun, May 29, 2016 at 7:12 AM, Sergey Guenender <GUENEN at il.ibm.com> wrote:
>>
>> I'm working with the IBM team implementing the Active-Active N+1 topology
>> [1].
>>
>> I've been commissioned with the task of helping to integrate the code
>> supporting the new topology while a) making as few code changes as
>> possible and b) reusing as much code as possible.
>>
>> To make sure the changes to existing code are future-proof, I'd like to
>> implement them outside of AA N+1, submit them on their own, and let the
>> AA N+1 work build on top of them.
>>
>> --TL;DR--
>>
>> what follows is a description of the challenges I'm facing and the way I
>> propose to solve them. Please skip down to the end of the email to see the
>> actual questions.
>>
>> --The details--
>>
>> I've been studying the code for a few weeks now to see where the best
>> places for minimal changes might be.
>>
>> Currently I see two options:
>>
>>    1. introduce a new kind of entity (the distributor) and make sure it
>> is handled at each of the levels of controller worker code (endpoint,
>> controller worker, *_flows, *_tasks, *_driver)
>>
>>    2. leave most of the code layers intact by building on the fact that
>> the distributor will inherit most of the amphora's controller worker logic
>>
>>
>> In the Active-Active topology, very much like in Active/StandBy:
>> * the top level of distributors will have to run VRRP
>> * the distributors will have a Neutron port created on the VIP network
>> * the distributors' Neutron ports on the VIP network will need the same
>> security groups
>> * the amphorae facing the pool member networks still require
>>     * ports on the pool member networks
>>     * a "peers" HAProxy configuration for real-time state exchange (see
>> the sketch after this list)
>>     * VIP network connections with the right security groups
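>>
>> To make the "peers" point concrete, here is a rough sketch of rendering
>> such a stanza; the peer port (1025) and the naming scheme are assumptions
>> for illustration, not necessarily what Octavia uses:
>>
>>     # Assumed port and naming; only illustrates the 'peers' stanza shape.
>>     def render_peers_section(lb_id, amphorae, peer_port=1025):
>>         """Render an haproxy.cfg 'peers' section listing the amphorae so
>>         they can exchange stick-table state in real time."""
>>         lines = ['peers %s_peers' % lb_id]
>>         for amp in amphorae:
>>             lines.append('    peer %s %s:%d' %
>>                          (amp['name'], amp['ip_address'], peer_port))
>>         return '\n'.join(lines)
>>
>>     print(render_peers_section(
>>         'lb1', [{'name': 'amp-1', 'ip_address': '10.0.0.11'},
>>                 {'name': 'amp-2', 'ip_address': '10.0.0.12'}]))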
>>
>> The fact that existing topologies lack the notion of a distributor,
>> together with an inspection of the 30-or-so existing references to
>> amphora clusters, swayed me towards the second option.
>>
>> The easiest way to make use of existing code seems to be to split the
>> load balancer's amphorae into three overlapping sets:
>> 1. The front-facing - those connected to the VIP network
>> 2. The back-facing - a subset of the front-facing amphorae, also connected
>> to the pool members' networks
>> 3. The VRRP-running - a subset of the front-facing amphorae, keeping the
>> VIP routing highly available
>>
>> At the code-changes level
>> * the three sets can be simply added as properties of
>> common.data_model.LoadBalancer
>> * the existing amphorae cluster references would switch to using one of
>> these properties, for example
>>     * the VRRP sub-flow would loop over only the VRRP amphorae
>>     * the network driver, when plugging the VIP, would loop over the
>> front-facing amphorae
>>     * when connecting to the pool members' networks,
>> network_tasks.CalculateDelta would only loop over the back-facing amphorae
>>
>> In terms of backwards compatibility, the Active-StandBy topology would
>> have all three sets equal, each containing both of its amphorae.
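>>
>> As a minimal sketch of what that could look like on the data model (the
>> attribute names and flags below are only illustrative, not the real
>> common.data_models code), note how the Active-StandBy case degenerates to
>> all three properties returning the same two amphorae:
>>
>>     # Illustrative names and flags only; not the real data model.
>>     import collections
>>
>>     Amphora = collections.namedtuple(
>>         'Amphora', ['id', 'has_vip_port', 'runs_haproxy', 'runs_vrrp'])
>>
>>     class LoadBalancer(object):
>>         def __init__(self, amphorae):
>>             self.amphorae = amphorae
>>
>>         @property
>>         def front_facing_amphorae(self):
>>             # Amphorae plugged into the VIP network.
>>             return [a for a in self.amphorae if a.has_vip_port]
>>
>>         @property
>>         def back_facing_amphorae(self):
>>             # Subset also plugged into the pool members' networks.
>>             return [a for a in self.front_facing_amphorae if a.runs_haproxy]
>>
>>         @property
>>         def vrrp_amphorae(self):
>>             # Subset keeping the VIP routing highly available.
>>             return [a for a in self.front_facing_amphorae if a.runs_vrrp]
>>
>>     # In Active-StandBy both amphorae have every flag set, so all three
>>     # properties return the same pair.
>>     lb = LoadBalancer([Amphora('amp-1', True, True, True),
>>                        Amphora('amp-2', True, True, True)])
>>     print([a.id for a in lb.vrrp_amphorae])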
>>
>> An even more future-proof approach might be to implement the set getters
>> as selector methods, supporting operation on subsets of each kind of
>> amphora. For instance, when growing or shrinking the back-facing amphora
>> cluster, only the added/removed amphorae would need to be processed.
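>>
>> Continuing the sketch above with invented names, a selector-style getter
>> could look like this:
>>
>>     # Invented names; only sketches the selector-getter idea.
>>     def select_back_facing(load_balancer, only_ids=None):
>>         """Return back-facing amphorae, optionally narrowed to a subset,
>>         e.g. only the amphorae just added by a scale-out."""
>>         amps = [a for a in load_balancer.front_facing_amphorae
>>                 if a.runs_haproxy]
>>         if only_ids is None:
>>             return amps
>>         return [a for a in amps if a.id in only_ids]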
>>
>> Finally (thank you for your patience, dear reader), my question is: if any
>> of the above makes sense, and to facilitate the design/code review, what
>> would be the best way to move forward?
>>
>> Should I create a mini-blueprint describing the changes and implement it?
>> Should I just open a bug for it and supply a fix?
>>
>> Thanks,
>> -Sergey.
>>
>> [1] https://review.openstack.org/#/c/234639
>>
>>
>
>
>


