[openstack-dev] [Octavia] Question about where to render haproxy configurations
Brandon Logan
brandon.logan at RACKSPACE.COM
Sun Sep 7 07:21:54 UTC 2014
Hi German,
Comments in-line
On Sun, 2014-09-07 at 04:49 +0000, Eichberger, German wrote:
> Hi Steven,
>
>
>
> Thanks for taking the time to lay out the components clearly. I think
> we are pretty much on the same page :)
>
>
>
> Driver vs. Driver-less
>
> I strongly believe that REST is a cleaner interface/integration point
> – but if even Brandon believes that drivers are the better approach
> (having suffered through the LBaaS v1 driver world which is not an
> advertisement for this approach) I will concede on that front. Let’s
> hope nobody makes an asynchronous driver and/or writes straight to the
> DB :) That said, I still believe that adding the driver interface now
> will lead to some more complexity and I am not sure we will get the
> interface right in the first version: so let’s agree to develop with a
> driver in mind but don’t allow third party drivers before the
> interface has matured. I think that is something we already sort of
> agreed to, but I just want to make that explicit.
I think the LBaaS V1/V2 driver approach works well enough. The problems
that arose from it were because most entities were root-level objects
and thus had some independence of their own. For example, a pool
can exist without a listener, and a listener can exist without a load
balancer. The load balancer was the entity tied to the driver. For
Octavia, we've already agreed that everything will be a direct or
indirect child of a load balancer so this should not be an issue.
I agree with you that we will not get the interface right the first
time. I hope no one was planning on starting a driver other than
haproxy anytime before 1.0, because I vaguely remember 2.0 being the time
that multiple drivers can be used. By then the interface should be
in good shape.
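Just to make that concrete, here is roughly the shape of interface I am
picturing for the amphora driver. None of these names are settled and
this is only a sketch to anchor the discussion:

    import abc

    import six


    @six.add_metaclass(abc.ABCMeta)
    class AmphoraLoadBalancerDriver(object):
        """Placeholder interface the controller would load and call."""

        @abc.abstractmethod
        def update(self, listener, vip):
            """Push a new or changed listener configuration to the amphora."""

        @abc.abstractmethod
        def start(self, listener, vip):
            """Start (or restart) the listener on the amphora."""

        @abc.abstractmethod
        def stop(self, listener, vip):
            """Stop the listener on the amphora."""

        @abc.abstractmethod
        def delete(self, listener, vip):
            """Remove the listener from the amphora."""

        @abc.abstractmethod
        def get_info(self, amphora):
            """Return version and health info reported by the amphora."""

        @abc.abstractmethod
        def get_stats(self, listener):
            """Return traffic statistics for the listener."""

The controller would load one concrete implementation of this at startup
(or pick between several later, via flavors or the config file), which is
all the "driver interface" really amounts to from the controller's point
of view.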
>
>
> Multiple drivers/version for the same Controller
>
> This is a really contentious point for us at HP: If we allow say
> drivers or even different versions of the same driver, e.g. A, B, C to
> run in parallel, testing will involve testing all the possible
> (version) combinations to avoid potential side effects. That can get
> extensive really quick. So HP is proposing, given that we will have
> 100s of controllers any way, to limit the number of drivers per
> controller to 1 to aid testing. We can revisit that at a future time
> when our testing capabilities have improved but for now I believe we
> should choose that to speed things up. I personally don’t see the need
> for multiple drivers per controller – in an operator grade environment
> we likely don’t need to “save” on the number of controllers ;-) The
> only reason we might need two drivers on the same controller is if an
> Amphora for whatever reason needs to be talked to by two drivers.
> (e.g. you install nginx and haproxy and have a driver for each). This
> use case scares me so we should not allow it.
>
> We also see some operational simplifications from supporting only one
> driver per controller: If we have an update for driver A we don’t need
> to touch any controller running Driver B. Furthermore we can keep the
> old version running but make sure no new Amphora gets scheduled there
> to let it wind down with attrition and then stop that controller when
> it doesn’t have any more Amphora to serve.
I also agree that we should, for now, only allow 1 driver at a time and
revisit it after we've got a solid grasp on everything. I honestly
don't think we will have multiple drivers for a while anyway, so by the
time we have a solid grasp on it we will know the complexities it would
introduce, and can then either make the single-driver limit permanent or
implement multiple-driver support.
I do recognize your worry about the many permutations that could arise
from having a controller driver version and an amphora version. I might
be short-sighted or just blind to it, but would you be testing an nginx
controller driver against an haproxy amphora? That shouldn't work, and
thus I don't see why you would want to test that. So the only other
option is testing (as an example) an haproxy 1.5 controller driver with
amphorae that may have different versions of code, scripts, ancillary
applications, and/or haproxy. So it's possible there could be N
amphorae running haproxy 1.5, if you are planning on keeping older
versions around. You would need to test the haproxy 1.5 controller
driver against N amphorae versions. Obviously if we are allowing
multiple versions of haproxy controller drivers, then we'd have to test
N controller drivers against N amphorae versions. Correct me if I am
interpreting your versioning issue incorrectly.
I see this being a potential issue. However, right now at least, I
think the benefit of not having to update all amphorae in a deployment
if we need to make a simple config rendering change outweighs this
potential issue. I feel like we will be doing a lot more of those
changes rather than adding new haproxy version controller drivers. Even
when a new haproxy version is released, what are the odds that we would
want to use it if what we already have works for what we need? If what
we already have doesn't work, then we'd probably not want to stick with
an old and busted controller and an old and busted amphora version anyway.
That's my take on it currently. It is subject to change of course.
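For what it's worth, the kind of "simple config rendering change" I am
talking about looks like the sketch below: rendering lives in the
controller-side driver, so a template tweak plus a re-push fixes every
amphora without touching the amphora images. The template and field
names here are invented for the example.

    import jinja2

    # Invented, minimal template; a real one would cover far more.
    HAPROXY_TEMPLATE = jinja2.Template("""
    frontend {{ listener.id }}
        bind {{ vip.ip_address }}:{{ listener.protocol_port }}
        default_backend {{ pool.id }}

    backend {{ pool.id }}
        balance {{ pool.lb_algorithm }}
        {% for member in pool.members -%}
        server {{ member.id }} {{ member.ip_address }}:{{ member.protocol_port }}
        {% endfor %}
    """)


    def render_haproxy_config(listener, vip, pool):
        # Runs in the controller-side driver; only the finished haproxy.cfg
        # text ever goes over the wire to the amphora.
        return HAPROXY_TEMPLATE.render(listener=listener, vip=vip, pool=pool)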
>
>
> Lastly, I interpreted the word “VM driver” in the spec along the lines
> of what we have in libra: A driver interface on the Amphora agent that
> abstracts starting/stopping haproxy (in case we end up using something
> different) and abstracts writing the haproxy file. But that is for the agent on
> the Amphora. I am sorry I got confused that way when reading the 0.5
> spec and I am therefore happy we can have that discussion to make
> things more clear.
I'm sure more things will come up that we've all made assumptions about,
where we read the specs as saying what we thought they said when they
actually didn't.
>
>
> German
>
>
>
> From: Stephen Balukoff [mailto:sbalukoff at bluebox.net]
> Sent: Friday, September 05, 2014 6:26 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Octavia] Question about where to render
> haproxy configurations
>
>
>
>
> Hi German,
>
>
>
>
> Responses in-line:
>
>
>
>
> On Fri, Sep 5, 2014 at 2:31 PM, Eichberger, German
> <german.eichberger at hp.com> wrote:
>
> Hi Stephen,
>
>
>
> I think this is a good discussion to have and will make it more clear
> why we chose a specific design. I also believe by having this
> discussion we will make the design stronger. I am still a little bit
> confused what the driver/controller/amphora agent roles are. In my
> driver-less design we don’t have to worry about the driver which most
> likely in haproxy’s case will be split to some degree between
> controller and amphora device.
>
>
>
>
>
> Yep, I agree that a good technical debate like this can help both to
> get many people's points of view and can help determine the technical
> merit of one design over another. I appreciate your vigorous
> participation in this process. :)
>
>
>
>
>
> So, the purpose of the controller / driver / amphora and the
> responsibilities they have are somewhat laid out in the Octavia v0.5
> component design document, but it's also possible that there weren't
> enough specifics in that document to answer the concerns brought up in
> this thread. So, to that end in my mind, I see things like the
> following:
>
>
>
>
>
> The controller:
>
>
> * Is responsible for concerns of the Octavia system as a whole,
> including the intelligence around interfacing with the networking,
> virtualization, and other layers necessary to set up the amphorae on
> the network and get them configured.
>
>
> * Will rarely, if ever, talk directly to the end-systems or -services
> (like Neutron, Nova, etc.). Instead it goes through a "clean" driver
> interface for each of these.
>
>
> * The controller has direct access to the database where state is
> stored.
>
>
> * Must load at least one driver, may load several drivers and choose
> between them based on configuration logic (ex. flavors, config file,
> etc.)
>
>
>
>
>
> The driver:
>
>
> * Handles all communication to or from the amphorae
>
>
> * Is loaded by the controller (ie. its interface with the controller
> is a base class, associated methods, etc. It's objects and code, not a
> RESTful API.)
>
>
> * Speaks amphora-specific protocols on the back-end. In the case of
> the reference "haproxy" amphora, this will most likely be in the form
> of a RESTful API with an agent on the amp, as well as (probably)
> HMAC-signed UDP health, status and stats messages from the amp to the
> driver.
>
>
>
>
>
> The amphora:
>
>
> * Does the actual load balancing
>
>
> * Is managed by the controller through the driver.
>
>
> * Should be as "dumb" as possible.
>
>
> * Comes in different types, based on the software in the amphora
> image. (Though all amps of a given type should be managed by the same
> driver.) Types might include "haproxy," "nginx," "haproxy + nginx,"
> "3rd party vendor X," etc.
>
>
> * Should never have direct access to the Octavia database, and
> therefore attempt to be as stateless as possible, as far as
> configuration is concerned.
>
>
>
>
>
> To be honest, our current product does not have a "driver" layer per
> se, since we only interface with one type of back-end. However, we
> still render our haproxy configs in the controller. :)
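(Interjecting on the HMAC-signed UDP health messages mentioned above,
since the phrase sounds heavier than it is: on the amphora side it is
something about this small. The key, port, and message fields below are
all placeholders, not anything we have designed yet.)

    import hashlib
    import hmac
    import json
    import socket
    import time

    SHARED_KEY = b'per-amphora-secret'        # placeholder, provisioned at boot
    CONTROLLER_ADDR = ('203.0.113.10', 5555)  # placeholder driver endpoint


    def send_health_message(amphora_id, listener_stats):
        # Build the status/stats payload and sign it so the driver can
        # verify it really came from this amphora and was not tampered with.
        payload = json.dumps({'id': amphora_id,
                              'time': time.time(),
                              'listeners': listener_stats}).encode('utf-8')
        digest = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()

        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(payload + b' ' + digest.encode('utf-8'), CONTROLLER_ADDR)
        sock.close()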
>
>
>
>
>
>
>
> So let’s try to sum up what we want a controller to do:
>
> - Provision new amphora devices
>
> - Monitor/Manage health
>
> - Gather stats
>
> - Manage/Perform configuration changes
>
>
>
> The driver as described would be:
>
> - Render configuration changes in a specific format,
> e.g. haproxy
>
>
>
> Amphora Device:
>
> - Communicate with the driver/controller to make
> things happen
>
>
>
> So as Doug pointed out I can make a very thin driver which
> basically passes everything through to the Amphora Device or
>         at the other end of the spectrum I can make a very thick
> driver which manages all aspects from the amphora life cycle
> to whatever (aka kitchen sink). I know we are going for
>         utmost flexibility but I believe:
>
>
>
>
>
> So, I'm not sure it's fair to characterize the driver I'm suggesting
> as "very thick." If you get right down to it, I'm pretty sure the only
> major thing we disagree on here is where the haproxy configuration is
> rendered: Just before it's sent over the wire to the amphora, or just
> after its JSON-equivalent is received over the wire from the
> controller.
>
>
>
>
>
> - With building an haproxy centric controller we don’t
> really know which things should be controller/which thing
> should be driver. So my shortcut is not to build a driver at
>           all :)
>
>
> So, I've become more convinced that having a driver layer there is
> going to be important if we want to support 3rd party vendors creating
> their own amphorae at all (which I think we do). It's also going to be
> important if we want to be able to support other versions of
> open-source amphorae (or experimental versions prior to pushing out to
> a wider user-base, etc.)
>
>
>
>
>
> Also, I think making ourselves use a driver here helps keep
> interfaces clean. This helps us avoid spaghetti code and makes things
> more maintainable in the long run.
>
>
>         - More flexibility increases complexity and makes
> it confusing for people to develop components. Should this
> concern go into the controller, the driver, or the amphora VM?
> Two of them? Three of them? Limiting choices makes it simpler
> to achieve that.
>
>
> "Centralize intelligence / decentralize workload." There will often
> be multiple ways we can solve certain problems, but if we try to
> follow this mantra, and use clean interfaces between components, it
> starts to become more clear which code strategies we should be
> following. Yes, it's sometimes hard to know the right way to do
> things-- which is why we end up having these wonderful debates. ;) But
> I don't think the answer is "this is hard, let's just lump everything
> together."
>
>
>
>
>
> Also, rule of thumb (perhaps not stated in our constitution... yet):
> Try to architect things so the most frequently deployed elements see
> the fewest changes. (This is actually related to the "centralize
> intelligence / decentralize workload" mantra in a round-about way:
> Central intelligence elements will be both fewer in number and more
> frequently changed than "dumb" workload components.) This makes
> managing change for large deployments easier. (Again, it's both easier
> and less risky to update 100 controllers versus 10,000+ amphorae.)
>
>
>
>
>
>
>
>         HP's worry is that creating the potential to run multiple
>         drivers (and driver versions), on multiple versions of
>         controllers, on multiple versions of amphora devices, creates a
>         headache for testing. For example, does the version 4.1 haproxy
>         driver work with the version 4.2 controller on a 4.0 amphora
>         device? Which compatibility matrix do we need to build/test?
> Limiting one driver to one controller can help with making
> that manageable.
>
>
>
>
>
> Ok, so, I think this is possibly where part of our misunderstanding
> comes from. I realize above that I said a single driver could talk to
> multiple versions of back-end amphorae via a couple methods, but let's
> ignore that for a minute and assume that we only test / assume drivers
> will be speaking with the latest version of the amphorae to which they
> correspond.
>
>
>
>
>
> I should probably clarify something that I've been assuming but may
> not be obvious: I'm assuming that the "version" of the amphorae
> (drawn mostly from the version of the glue scripts, agent, and other
> code we write which lives on the amphora) is numbered separately and
> moves at a different rate than the version of the driver. Think of
> this like the version of the firmware and version of the driver used
> with your printer. Sometimes a major bugfix entails updating both the
> firmware and driver. However, it's also common for a bugfix / feature
> enhancement to involve only updating the printer driver version and
> not the printer firmware.
>
>
>
>
>
> What I'm getting at here is that if we're doing the configuration
> rendering in the driver and not on the amphora, there will be some
> bugfixes / feature enhancements which only entail updating the driver
> because there are literally no changes that need to be made to the
> amphora for the bugfix / feature.
>
>
>
>
>
> Does this actually happen? Yes! To give a concrete example drawn from
> our product history: On our existing load balancer product, which is
> powered by stunnel + haproxy a new OpenSSL vulnerability was
> discovered, the fix for which was to add a line to the stunnel
> configuration disabling a certain kind of SSL negotiation. Since we
> were rendering configurations centrally on our controllers, all we
> needed to do was update the configuration template on our controller
> and push out new configs for anyone using SSL termination. Took
> literally 10 minutes to implement once we understood the problem, and
> we didn't have to touch or otherwise update the software or scripts
> running on our appliances at all.
>
>
>
>
>
> It's even easier for L7 feature enhancements: You don't even have to
> push anything out to the amphora, just update the controller / driver
> to expose the new feature and users can then start using it at will.
>
>
>
>
>
> Are all feature enhancements / bugfixes this easy? No! How do you tell
> which changes are major and which are minor? Anything
> which touches the code running on the amphora is "major" (ie. like a
> firmware update). Anything which only touches the controller / driver
> is "minor" (ie. like a driver update).
>
>
>
>
>
> It seems strange to me that we'd force even minor changes to
> configurations to be "major" updates for the sake of sending
> JSON-which-will-immediately-be-turned-into-haproxy.cfg over the wire
> instead of just the haproxy.cfg. :/
>
>
>
>
>
> So with that in mind: Please understand that your model and mine do
> not have to differ in the slightest when it comes to how to manage
> 'major' updates, whether that be running a different driver /
> controller for the new amphora version (Ick!), or doing on-demand lazy
> upgrades of amphora as the driver discovers old,
> incompatible-versioned amphora it needs to update (probably smoothest
> way to handle this, possibly as a default action of the option 2 I
> mentioned above), or whether we force all amphora to be updated as
> soon as possible after a controller update (most risky and probably
> not the best way to handle this). We've yet to define exactly how this
> workflow should be handled, but it's actually somewhat secondary to
> the problem of where to render the configs. (Maybe we should have a
> conversation about this in another thread?)
>
>
>
>
>
> And in any case, I'm not seeing a need to ensure the driver works with
> anything but the latest amphora image version to which it corresponds
> (again, keeping in mind that amphora image and driver should be
> allowed to change at different rates and are therefore versioned
> separately). :/ This is especially the case if we define the default
> action to be taken upon a failure to push out a new config to be to
> check the version of the amphora and upgrade as necessary (ie. lazy
> upgrading)...
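(Interjecting again: the lazy-upgrade default described here is, as I
picture it, just a small guard in the controller-side push path. Every
name below is invented and this is only meant to show how little
machinery the workflow needs.)

    def push_config(controller, driver, amphora, listener, vip):
        # Lazy upgrade: only rebuild the amphora when it reports an agent
        # version older than what this driver knows how to talk to.
        info = driver.get_info(amphora)
        if info.get('agent_version', 0) < driver.required_agent_version:
            controller.recycle_amphora(amphora)  # hypothetical controller task
        driver.update(listener, vip)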
>
>
>
>
>
> Also, not that we can't revisit this of course: But the v0.5
> component design entailing a "VM Driver" already went through gerrit
> review and was approved (by yourself even!). This discussion was
> originally about where to render the haproxy configs, but it really
> seems like y'all are against the idea of having an amphora driver
> interface at all. :/
>
>
>
>
>
> Stephen
>
>
>
>
>
>
>
>
>
> --
> Stephen Balukoff
> Blue Box Group, LLC
> (800)613-4305 x807
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev