[openstack-dev] [TripleO] Should we have a TripleO API, or simply use Mistral?

Steven Hardy shardy at redhat.com
Tue Jan 26 14:08:09 UTC 2016


On Tue, Jan 26, 2016 at 07:40:05AM -0500, James Slagle wrote:
>    On Tue, Jan 26, 2016 at 4:46 AM, Steven Hardy <shardy at redhat.com> wrote:
> 
>      On Mon, Jan 25, 2016 at 05:45:30PM -0600, Ben Nemec wrote:
>      > On 01/25/2016 03:56 PM, Steven Hardy wrote:
>      > > On Fri, Jan 22, 2016 at 11:24:20AM -0600, Ben Nemec wrote:
>      > >> So I haven't weighed in on this yet, in part because I was on vacation
>      > >> when it was first proposed and missed a lot of the initial discussion,
>      > >> and also because I wanted to take some time to order my thoughts on it.
>      > >>  Also because my initial reaction...was not conducive to calm and
>      > >> rational discussion. ;-)
>      > >>
>      > >> The tldr is that I don't like it.  To explain why, I'm going to make a
>      > >> list (everyone loves lists, right? Top $NUMBER reasons we should stop
>      > >> expecting other people to write our API for us):
>      > >>
>      > >> 1) We've been down this road before.  Except last time it was with Heat.
>      > >>  I'm being somewhat tongue-in-cheek here, but expecting a general
>      > >> service to provide us a user-friendly API for our specific use case just
>      > >> doesn't make sense to me.
>      > >
>      > > Actually, we've been down this road before with Tuskar, and discovered that
>      > > designing and maintaining a bespoke API for TripleO is really hard.
>      >
>      > My takeaway from Tuskar was that designing an API that none of the
>      > developers on the project use is doomed to fail.  Tuskar also suffered
>      > from a lack of some features in Heat that the new API is explicitly
>      > depending on in an attempt to avoid many of the problems Tuskar had.
>      >
>      > Problem #1 is still developer apathy IMHO though.
> 
>      I think the main issue is developer capacity - we're a small community and
>      I for one am worried about the effort involved with building and
>      maintaining a bespoke API - thus this whole discussion is essentially about
>      finding a quicker and easier way to meet the needs of those needing an API.
> 
>      In terms of apathy, I think as a developer I don't need an abstraction
>      between me, my templates and heat.  Some advanced operators will feel
>      likewise, others won't.  What I would find useful sometimes is a general
>      purpose workflow engine, which is where I think the more pluggable mistral
>      based solution may have advantages in terms of developer and advanced
>      operator uptake.
>      > > I somewhat agree that heat as an API is insufficient, but that doesn't
>      > > necessarily imply you have to have a TripleO specific abstraction, just
>      > > that *an* abstraction is required.
>      > >
>      > >> 2) The TripleO API is not a workflow API.  I also largely missed this
>      > >> discussion, but the TripleO API is a _Deployment_ API.  In some cases
>      > >> there also happens to be a workflow going on behind the scenes, but
>      > >> honestly that's not something I want our users to have to care about.
>      > >
>      > > How would you differentiate between "deployment" in a generic sense in
>      > > contrast to a generic workflow?
>      > >
>      > > Every deployment I can think of involves a series of steps, involving some
>      > > choices and interactions with other services.  That *is* a workflow?
>      >
>      > Well, I mean if we want to take this to extremes then pretty much every
>      > API is a workflow API.  You make a REST call, a "workflow" happens in
>      > the service, and you get back a result.
>      >
>      > Let me turn this around: Would you implement Heat's API on Mistral?  All
>      > that happens when I call Heat is that a series of OpenStack calls are
>      > made from heat-engine, after all.  Or is that a gross oversimplification
>      > of what's happening?  I could argue that the same is true of this
>      > discussion. :-)
> 
>      As Hugh has mentioned, the main thing Heat does is actually manage
>      dependencies.  It processes the templates, builds a graph, then walks the
>      graph running a "workflow" to create/update/delete/etc each resource.
> 
>      I could imagine a future where we interface to some external workflow tool
>      to e.g. do each resource action (e.g. create a nova server, poll until it's
>      active), however that's actually a pretty high overhead approach, and it'd
>      probably be better to move towards better use of notifications instead
>      (e.g. less internal workflow).
>      > >> 3) It ties us 100% to a given implementation.  If Mistral proves to be a
>      > >> poor choice for some reason, or insufficient for a particular use case,
>      > >> we have no alternative.  If we have an API and decide to change our
>      > >> implementation, nobody has to know or care.  This is kind of the whole
>      > >> point of having an API - it shields users from all the nasty
>      > >> implementation details under the surface.
>      > >
>      > > This is a valid argument, but building (and maintaining, forever) a bespoke
>      > > API is a high cost to pay for this additional degree of abstraction, and
>      > > when you think of the target audience, I'm not certain it's entirely
>      > > justified (or, honestly, if our community can bear that overhead);
>      > >
>      > > For example, take other single-purpose "deployment" projects, such as
>      > > Sahara, Magnum, perhaps Trove.  These are designed primarily as user-facing
>      > > API's, where the services will ultimately be consumed by public and private
>      > > cloud customers.
>      > >
>      > > Contrast with TripleO, where our target audience is, for the most part,
>      > > sysadmins who deploy and maintain an openstack deployment over a long
>      > > period of time.  There are two main groups here:
>      > >
>      > > 1. PoC "getting started" folks who need a very simple on-ramp (generalizing
>      > > somewhat, the audience for the opinionated deployments driven via UI's)
>      > >
>      > > 2. Seasoned sysadmins who want pluggability, control and flexibility above
>      > > all else (and, simplicity and lack of extraneous abstractions)
>      > >
>      > > A bespoke API potentially has a fairly high value to (1), but a very low or
>      > > even negative value to (2).  Which is why this is turning out to be a tough
>      > > and polarized discussion, unfortunately.
>      >
>      > Well, to be honest I'm not sure we can satisfy the second type of user
>      > with what we have today anyway.  Our Heat-driven puppet is hardly
>      > lightweight or simple, and there are extraneous abstractions all over
>      > the place (see also every place that we have a Heat template param that
>      > exists solely to get plugged into a puppet hiera file :-).
>      >
>      > To me, this is mostly an artifact of the original intent of the Heat
>      > templates being _the_ abstraction that would then be translated into
>      > os-*-config, puppet, or [insert deployment tool of choice] by the
>      > templates, and I have to admit I'm not sure how to fix it for these users.
> 
>      I think we fix it by giving them a choice.  E.g. along the lines of the
>      "split stack" approach discussed at summit - allow operators to choose
>      either pre-defined roles with known interfaces (parameters), or deploy just
>      the infrastructure (servers, networking, maybe storage) then drive
>      configuration tooling with a much thinner interface.
>      > So I guess the question is, how does having an API hurt those power
>      > users?  They'll still be able/have to edit Heat templates to deploy
>      > additional services.  They'll still have all the usual openstack clients
>      > to customize their Ironic or Nova setups.  They're already using an API
>      > today, it's just one written entirely in the client.
> 
>      There's already a bunch of opaque complexity inside the client and TripleO
>      common, adding a very rigid API makes it more opaque, and harder to modify.
>      > On the other hand, an API that can help guide a user through the deploy
>      > process (You say you want network isolation enabled?  Well here are the
>      > parameters you need to configure...) could make a huge difference for
>      > the first type of user, as would _any_ API usable by the GUI (people
>      > just like pretty GUIs, whether it's actually better than the CLI or not :-).
>      >
>      > I guess it is somewhat subjective as you say, but to me the API doesn't
>      > significantly affect the power user experience, but it would massively
>      > improve the newbie experience.  That makes it a win in my book.
> 
>      I agree 100% that we need to massively improve the newbie experience - I
>      think everybody does.  I also think we all agree there must be a stable,
>      versioned API that a UI/CLI can interact with.
> 
>      The question in my mind is, can we address that requirement *and* provide
>      something of non-negative value for developers and advanced operators.
> 
>      Ryan already commented earlier in this thread (and I agree having seen
>      Dan's most recent PoC in action) that it doesn't make a lot of difference
>      from a consumer-of-api perspective which choice we make in terms of API
>      implementation, either approach can help provide the stable API surface
>      that is needed.
> 
>      The main difference is, only one choice provides any flexibility at all wrt
>      operator customization (unless we reinvent a similar action plugin mechanism
>      inside a TripleO API).
>      > >> 4) It raises the bar even further for both new deployers and developers.
>      > >>  You already need to have a pretty firm grasp of Puppet and Heat
>      > >> templates to understand how our stuff works, not to mention a decent
>      > >> understanding of quite a number of OpenStack services.
>      > >
>      > > I'm not really sure if a bespoke WSGI app vs an existing one (mistral)
>      > > really makes much difference at all wrt raising the bar.  I see it
>      > > primarily as an implementation detail tbh.
>      >
>      > I guess that depends.  Most people in OpenStack already know at least
>      > some Python, and if you've done any work in the other projects there's a
>      > fair chance you are familiar with the Python clients.  How many people
>      > know Mistral YAML?
> 
>      So, I think you're conflating the OpenStack developer community (who,
>      mostly, know python), with end-users and Operators, where IME the same is
>      often not true.
> 
>      Think of more traditional enterprise environments - how many sysadmins on
>      the unix team are hardcore python hackers?  Not that many IME (ignoring
>      more devops style environments here).
>      > Maybe I'm overestimating the Python knowledge in the community, and
>      > underestimating the Mistral knowledge, but I would bet we're talking
>      > order(s?) of magnitude in terms of the difference.  And I'm not saying
>      > learning Mistral is a huge task on its own, but it's one more thing in a
>      > project full of one more things.
> 
>      It's one more thing, which is already maintained and has an active
>      community, vs yet-another-bespoke-special-to-tripleo-thing.  IMHO we have
>      *way* too many tripleo specific things already.
> 
>      However, let's look at the "python knowledge" thing in a bit more detail.
> 
>      Let's say, as an operator I want to wire in an HTTP call to an internal asset
>      management system.  The requirement is to log an HTTP call with some
>      content every time an overcloud is deployed or updated.  (This sort of
>      requirement is *very* common in enterprise environments IME)
> 
>      In the mistral case[1], the modification would look something like:
> 
>      http_task:
>        action: std.http url='assets.foo.com' <some arguments>
> 
>      You'd simply add two lines to your TripleO deployment workflow yaml[2]:
> 
>    This is where the argument for Mistral really breaks down for me. One
>    of the advantages of Mistral shouldn't be that it makes it easier for
>    operators to modify TripleO delivered workflows. If that becomes
>    necessary, we haven't implemented the solution in a flexible enough way.
>    Maybe you're just illustrating an example here of someone who is
>    completely set on forking TripleO. But in that case, the example
>    isn't really all that relevant since we shouldn't be making a technical
>    choice based on that use case.
>    Your points below about which would be easier, modifying python code or
>    yaml files, really apply either way. The argument seems to be "let's use
>    Mistral because it's backed by yaml which is easier for operators to
>    modify".

How is wanting to make a single HTTP request to some non-openstack system
the same as "completely set on forking TripleO"?  Sorry, but I can't
reconcile my simple (and IME realistic, from working with actual customers)
example with your response at all.

Let's use another example - you have a proprietary revision control system,
and you want to pull your golden templates from there, instead of from
swift or the local filesystem.  Same problem!
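
To make that concrete, here's a rough sketch of what swapping the template
source might look like as a Mistral task - the task name, URL, published
variable and follow-on task are invented for illustration, and the std.http
result fields and YAQL expression syntax may differ between Mistral releases:

  get_templates:
    # Hypothetical task: fetch the golden templates from an internal
    # revision control system instead of swift or the local filesystem
    action: std.http url='https://scm.example/overcloud-templates' method='GET'
    publish:
      overcloud_templates: <% task(get_templates).result.content %>
    on-success:
      - deploy_overcloud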

The point is, it's *far* easier to make and maintain a simple change to a
relatively constrained but general purpose workflow interface than it is to
fork and maintain a bunch of python code indefinitely (unless we reinvent a
plugin interface exactly like Mistral already has).
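
For comparison, here is a sketch of how the asset-management notification from
my earlier example might slot into an existing deployment workflow without
touching any python - the task names and URL are made up, and std.noop is just
standing in for whatever the real deploy step would be:

  version: '2.0'

  deploy_overcloud:
    tasks:
      deploy_stack:
        # Stand-in for the existing TripleO-delivered deploy step
        action: std.noop
        on-success:
          - notify_asset_system

      notify_asset_system:
        # The operator-added customisation: log each deployment with an
        # internal asset management system
        action: std.http url='http://assets.foo.com/deployments' method='POST'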

Even if we were to decide the workflows were strictly internal, the same
argument holds for developers - let's say the requirement I outline above
appears on your backlog tomorrow - how many days of python development will
it take, vs the two-line change?

It's only an example, but I'm trying to illustrate there is potentially
value in not reinventing every.single.wheel every time :)

Cheers,

Steve


