[openstack-dev] [Neutron][LBaaS] Fulfilling Operator Requirements: Driver / Management API

Stephen Balukoff sbalukoff at bluebox.net
Fri May 2 02:43:37 UTC 2014


Hi Adam,

Thank you very much for starting this discussion! In answer to your
questions, from my perspective:

1. I think it makes sense to start at least one new driver that focuses on
running software virtual appliances on Nova nodes (the NovaHA you referred
to above). The existing haproxy driver should not go away: it solves
problems for small- to medium-sized deployments, and does well for setting
up, for example, a 'development' or 'QA' load balancer that won't need to
scale but needs to duplicate much of the functionality of the production
load balancer(s).

On this note, we may actually want to create several different drivers
depending on the appliance model that operators are using. From the
discussion about HA that I started a couple of weeks ago, it sounds like HP
is using an HA model that concentrates on pulling additional instances from
a waiting pool. The Stingray solution you're using sounds like "RAID 5"
redundancy for load balancing, and what we've been using is more like
"RAID 1" redundancy.

It probably makes sense to collaborate on a new driver and model if we
agree on the topologies we want to support at our individual organizations.
Even if we can't agree on this, it still makes sense for us to collaborate
on determining that "basic set of operator features" that all drivers
should support, from an operator perspective.

2. I think a management API is necessary: operators and their support
personnel need to be able to troubleshoot problems down to the device
level, and I think it makes sense to do this through an OpenStack interface
if possible. To accommodate each vendor's differences here, though, this
may only be possible if we allow different drivers to expose "operator
controls" in their own way.

I do not think any of this should be exposed to the user API we have been
discussing.
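
As a strawman for what "exposing operator controls in their own way" could
look like, here is a minimal sketch. Everything in it is made up for
illustration; the point is only that the driver, not the user API, owns the
set of controls, and the management layer just needs a generic way to list
and invoke them:

    class ExampleVendorDriver(object):
        """Sketch of a driver exposing vendor-specific operator controls."""
        # Nothing here is reachable from the user (tenant) API; only the
        # management API / operator tooling would call into it.

        def operator_controls(self):
            # name -> callable; each vendor exposes whatever makes sense.
            return {
                'show-appliance-status': self._show_appliance_status,
                'failover': self._failover,
            }

        def _show_appliance_status(self, lb_id):
            # Stub: a real driver would query its device or appliance here.
            return {'loadbalancer_id': lb_id, 'appliance_state': 'unknown'}

        def _failover(self, lb_id):
            # Stub: vendor-specific rebuild/failover logic would go here.
            pass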

I think it's going to be important to come to some kind of agreement on the
user API and object model changes before we can really start talking about
how to do the management API.

I am completely on board with this! As I have said in a couple of other
places on this list, Blue Box actually wrote our own software-appliance-based
load balancing system based on HAProxy, stunnel, corosync/pacemaker, and a
series of glue scripts (mostly written in Perl, Ruby, and shell) that
provide a "back-end API" and whatnot. We've actually done this (almost)
from scratch twice now, and have plans and some work underway to do it a
third time-- this time to be compatible with OpenStack (and specifically
the Neutron LBaaS API, hopefully as a driver for the same). This will be
completely open source, and hopefully compliant with OpenStack standards
(equivalent licensing, everything written in Python, etc.). So far, I've
only had time to port over the back-end API and a couple of design docs,
but if you want to see what we have in mind, here's the documentation on
this so far:

https://github.com/blueboxgroup/octavia/

In particular, the theory-of-operation document will probably give you the
best overview of how it works:

https://github.com/blueboxgroup/octavia/blob/master/doc/theory-of-operation.md

And here is the virtual appliance API (as it was two months ago; some
things will definitely change based on the discussions of the last couple
of months):
https://github.com/blueboxgroup/octavia/blob/master/doc/virtual-appliance-api.md

Thanks,
Stephen



On Thu, May 1, 2014 at 2:33 PM, Adam Harwell <adam.harwell at rackspace.com> wrote:

>  I am sending this now to gauge interest and get feedback on what I see
> as an impending necessity — updating the existing "haproxy" driver,
> replacing it, or both. Though we're not there yet, it is probably best to
> at least start the discussion now, to hopefully limit some fragmentation
> that may be starting around this concept already.
>
>  To begin with, I should probably define some terms. Following is a list
> of the major things I'll be referencing and what I mean by them, since I
> would like to avoid ambiguity as much as possible.
>
>  ----------------------------------
> ---- Glossary
> ----------------------------------
> *HAProxy*: This references two things currently, and I feel this is a
> source of some misunderstanding. When I refer to  HAProxy (capitalized), I
> will be referring to the official software package (found here:
> http://haproxy.1wt.eu/ ), and when I refer to "haproxy" (lowercase, and
> in quotes) I will be referring to the neutron-lbaas driver (found here:
> https://github.com/openstack/neutron/tree/master/neutron/services/loadbalancer/drivers/haproxy ).
> The fact that the neutron-lbaas driver is named directly after the software
> package seems very unfortunate, and while it is not directly in the scope
> of what I'd like to discuss here, I would love to see it changed to more
> accurately reflect what it is --  one specific driver implementation that
> coincidentally uses HAProxy as a backend. More on this later.
>
>  *Operator Requirements*: The requirements that can be found on the wiki
> page here:
> https://wiki.openstack.org/wiki/Neutron/LBaaS/requirements#Operator_Requirements and
> focusing on (but not limited to) the following list:
> * Scalability
> * DDoS Mitigation
> * Diagnostics
> * Logging and Alerting
> * Recoverability
> * High Availability (this is in the User Requirements section, but will be
> largely up to the operator to handle, so I would include it when discussing
> Operator Requirements)
>
>  *Management API*: A restricted API containing resources that Cloud
> Operators could access, including most of the list of Operator Requirements
> (above).
>
>  *Load Balancer (LB)*: I use this term very generically — essentially a
> logical entity that represents one "use case". As used in the sentence: "I
> have a Load Balancer in front of my website." or "The Load Balancer I set
> up to offload SSL Decryption is lowering my CPU load nicely."
>
>  ----------------------------------
> ---- Overview
> ----------------------------------
>  What we've all been discussing for the past month or two (the API,
> Object Model, etc) is being directly driven by the User and Operator
> Requirements that have somewhat recently been enumerated (many thanks to
> everyone who has contributed to that discussion!). With that in mind, it is
> hopefully apparent that the current API proposals don't directly address
> many (or really, any) of the Operator requirements! Where in either of our
> API proposals are logging, high availability, scalability, DDoS mitigation,
> etc? I believe the answer is that none of these things can possibly be
> handled by the API, but are really implementation details at the driver
> level. Radware, NetScaler, Stingray, F5 and HAProxy of any flavour would
> all have very different ways of handling these things (these are just some
> of the possible backends I can think of). At the end of the day, what we
> really have are the requirements for a driver, which may or may not use
> HAProxy, that we hope will satisfy all of our concerns. That said, we may
> also want to have some form of "Management API" to expose these features in
> a common way.
>
>  In this case, we really need to discuss two things:
>
>    1. Whether to update the existing "haproxy" driver to accommodate
>    these Operator Requirements, or whether to start from scratch with a new
>    driver (possibly both).
>    2. How to expose these Operator features at the (Management?) API
>    level.
>
> ----------------------------------
> ---- 1) Driver
> ----------------------------------
> I believe the current "haproxy" driver serves a very specific purpose, and
> while it will need some incremental updates, it would be in the best
> interest of the community to also create and maintain a new driver (which
> it sounds like several groups have already begun work on — ack!) that could
> support a different approach. For instance, the current "haproxy" driver is
> implemented by initializing HAProxy processes on a set of shared hosts,
> whereas there has been some momentum behind creating individual Virtual
> Machines (via Nova) for each Load Balancer created, similar to Libra's
> approach. Alternatively, we could use LXC or a similar technology to more
> effectively isolate LBs and assuage concerns about tenant cross-talk (real
> or imaginary, this has been an issue for some customers). Either way, we'd
> probably need a brand new driver, to avoid breaking backwards compatibility
> with the existing driver (which does work perfectly fine in many cases). In
> fact, it's possible that when we begin discussing this as a broader
> community, we might decide to create more than one additional driver
> (depending on which approaches people want to use and what features are
> most important to them). The only concern I have about that outcome is the
> amount of code duplication, and whether it would be possible to share
> certain aspects of these drivers without too much copy/pasting.
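>
> To illustrate the code-reuse concern, one purely hypothetical way to share
> logic would be a common driver that delegates appliance lifecycle to a
> pluggable backend, so a shared-host HAProxy backend, a per-LB Nova VM
> backend, and an LXC backend could all reuse the same configuration
> rendering and delivery code (all names below are made up):
>
>     class ApplianceBackend(object):
>         """Owns the lifecycle of whatever hosts the HAProxy config."""
>         def spawn(self, lb_id):
>             raise NotImplementedError
>
>         def destroy(self, lb_id):
>             raise NotImplementedError
>
>     class SharedHostBackend(ApplianceBackend):
>         """Starts an HAProxy process on a shared agent host (omitted)."""
>
>     class NovaVMBackend(ApplianceBackend):
>         """Boots a dedicated Nova VM per load balancer (omitted)."""
>
>     class LXCBackend(ApplianceBackend):
>         """Runs each load balancer in its own container (omitted)."""
>
>     class CommonApplianceDriver(object):
>         """Shared driver logic; only the appliance backend differs."""
>         def __init__(self, backend):
>             self.backend = backend
>
>         def create_loadbalancer(self, lb):
>             appliance = self.backend.spawn(lb['id'])
>             config = self.render_haproxy_config(lb)  # shared rendering
>             self.push_config(appliance, config)      # shared delivery
>
>         def render_haproxy_config(self, lb):
>             return '# haproxy.cfg for %s (rendering omitted)' % lb['id']
>
>         def push_config(self, appliance, config):
>             pass  # stub: scp / REST call to the appliance, omitted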
>
>  An example of one possible new driver could be the following (just off
> the top of my head):
> * Use a pair of new Nova VMs for each LB (Scalability), configured to use
> a Shared IP (High Availability).
> * Log to Swift / Ceilometer (Logging / Alerting / Metering).
> * Provide calls that could be exposed via a Management API to show low
> level diagnostic details (Diagnostics).
> * Provide calls that could be exposed via a Management API to allow
> syncing/reloading existing LBs or moving them across clusters
> (Recoverability, DDoS Mitigation).
> This new driver would be named to reflect what features it provides, or at
> least given a unique name that can be referenced without confusion
> (something like "OpenHA" or "NovaHA" would work if that's not taken).
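>
> Very roughly, the create path of such a "NovaHA" driver might look like the
> sketch below. This is not real driver code; the client calls are simplified
> python-novaclient / python-neutronclient usage, and things like scheduling
> hints, error handling, and the actual keepalived/VRRP setup are omitted:
>
>     def build_ha_pair(nova, neutron, lb, image_id, flavor_id):
>         """Create a VIP port plus an active/standby pair of appliance VMs."""
>         # image_id / flavor_id point at an assumed pre-built appliance image;
>         # the 'lb' dict keys used here are illustrative only.
>         vip_port = neutron.create_port(
>             {'port': {'network_id': lb['vip_network_id']}})
>         appliances = []
>         for role in ('active', 'standby'):
>             server = nova.servers.create(
>                 name='lb-%s-%s' % (lb['id'], role),
>                 image=image_id,
>                 flavor=flavor_id)
>             appliances.append(server)
>         # Both appliances would then be configured (e.g. via keepalived /
>         # VRRP) to share the VIP address, and would ship logs off-box to
>         # Swift / Ceilometer from inside the appliance image.
>         return vip_port, appliances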
>
>  ----------------------------------
> ---- 2) Management API
> ----------------------------------
>  Going forward, it should then be required (can we enforce this?) that
> any mainline driver include support for calls to handle these named
> Operator Requirements, for example: obtaining logs (or log locations?),
> diagnostic information, and admin type actions including rebuilding or
> migrating LB instances. So far we haven't really talked about any of these
> features in depth, though I believe the general need for a Management API
> was alluded to on several occasions. Should we shelve this discussion until
> after we have the User API specification locked down? Should we begin
> defining a contract for this Management API at the summit, since it would
> be the main gateway to the Operator Requirements that we have all been
> stressing recently?
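>
> As a strawman for the "can we enforce this?" question: the management layer
> could simply refuse to load any driver that does not implement a small,
> agreed-upon set of operator calls. A minimal sketch, with entirely
> hypothetical names:
>
>     import abc
>
>     class OperatorAPIContract(object):
>         """Minimum set of calls a mainline driver would have to implement
>         before the management API agrees to load it (sketch only)."""
>         __metaclass__ = abc.ABCMeta
>
>         @abc.abstractmethod
>         def get_logs(self, lb_id):
>             """Return logs (or log locations) for a load balancer."""
>
>         @abc.abstractmethod
>         def get_diagnostics(self, lb_id):
>             """Return low-level diagnostic information."""
>
>         @abc.abstractmethod
>         def rebuild(self, lb_id):
>             """Rebuild or migrate the LB's backing instance(s)."""
>
>     def load_driver(driver_cls):
>         # Refuse to load drivers that do not satisfy the contract.
>         if not issubclass(driver_cls, OperatorAPIContract):
>             raise TypeError('%s does not implement the operator API '
>                             'contract' % driver_cls.__name__)
>         return driver_cls()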
>
>  ----------------------------------
> ---- Summary
> ----------------------------------
>  I would apologize for not having much concrete specification here, but I
> think it is better to validate my basic assumptions first, before jumping
> deeper down this rabbit hole. The type of comments I'm hoping to prompt are
> along the lines of:
> * "We should just focus on the existing haproxy driver."
> * "We should definitely collaborate to make a new driver as a community."
> * "I don't think a Management API is necessary."
> * "This is definitely what I was thinking we'd need to do."
>  Any specific implementation details I've mentioned are intended to be
> taken as one possible example, not as a well-thought-out proposal. I am, as
> one might say, "speaking my mind". My hope is that some of this will simmer
> on the general subconscious. I'd like to hear what the general consensus is
> on these topics, because these are some of the assumptions I've been
> operating under during the rest of our discussions, and if they're invalid,
> I may need to rebase my view on the API discussion as a whole.
>
>  Thanks y'all, I'm looking forward to getting some additional viewpoints!
> --Adam Harwell (rm_work)
>


-- 
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807