[openstack-dev] [TripleO] Should we have a TripleO API, or simply use Mistral?

Dan Prince dprince at redhat.com
Mon Jan 25 22:36:44 UTC 2016


On Mon, 2016-01-25 at 15:31 -0600, Ben Nemec wrote:
> On 01/22/2016 06:19 PM, Dan Prince wrote:
> > On Fri, 2016-01-22 at 11:24 -0600, Ben Nemec wrote:
> > > So I haven't weighed in on this yet, in part because I was on vacation
> > > when it was first proposed and missed a lot of the initial discussion,
> > > and also because I wanted to take some time to order my thoughts on it.
> > > Also because my initial reaction...was not conducive to calm and
> > > rational discussion. ;-)
> > > 
> > > The tldr is that I don't like it.  To explain why, I'm going to make a
> > > list (everyone loves lists, right? Top $NUMBER reasons we should stop
> > > expecting other people to write our API for us):
> > > 
> > > 1) We've been down this road before.  Except last time it was with Heat.
> > > I'm being somewhat tongue-in-cheek here, but expecting a general
> > > service to provide us a user-friendly API for our specific use case just
> > > doesn't make sense to me.
> > 
> > We've been down this road with Heat, yes. But we are currently using
> > Heat for some things that we arguably shouldn't be (a workflows tool might
> > help offload some stuff out of Heat). Also we haven't implemented
> > custom Heat resources for TripleO either. There are mixed opinions on
> > this but plugging in your code to a generic API is quite nice sometimes.
> > 
> > That is the beauty of Mistral I think. Unlike Heat it actually
> > encourages you to customize it with custom Python actions. Anything
> > we
> > want in tripleo-common can become our own Mistral action (these get
> > registered with stevedore entry points so we'd own the code) and
> > the
> > YAML workflows just tie them together via tasks.
> > 
> > We don't have to go off and build our own proxy deployment workflow
> > API. The structure to do just about anything we need already exists
> > so
> > why not go and use it?
> > 
> > > 
> > > 2) The TripleO API is not a workflow API.  I also largely missed
> > > this
> > > discussion, but the TripleO API is a _Deployment_ API.  In some
> > > cases
> > > there also happens to be a workflow going on behind the scenes,
> > > but
> > > honestly that's not something I want our users to have to care
> > > about.
> > 
> > Agree that users don't have to care about this.
> > 
> > Users can get as involved as they want here. Most users I think will
> > use python-tripleoclient to drive the deployment or the new UI. They
> > don't have to interact with Mistral directly unless they really want
> > to. So whether we choose to build our own API or use a generic one I
> > think this point is moot.
> 
> Okay, I think this is a very fundamental point, and I believe it gets
> right to the heart of my objection to the proposed change.
> 
> When I hear you say that users will use tripleoclient to talk to
> Mistral, it raises a big flag.  Then I look at something like
> https://github.com/dprince/python-tripleoclient/commit/77ffd2fa7b1642b9f05713ca30b8a27ec4b322b7
> and the flag gets bigger.
> 
> The thing is that there's a whole bunch of business logic currently
> sitting in the client that shouldn't/can't be there.  There are
> historical reasons for it, but the important thing is that the
> current
> client architecture is terribly flawed.  Business logic should never
> live in the client like it does today.

Totally agree here. In fact, I have removed business logic from
python-tripleoclient in this patch and moved it into a Mistral action,
which can then be used via a stable API from anywhere.
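
As a rough illustration of what "business logic as an action" could look like (this is not the code from the patch; the Action base class below is a local stand-in, and the class name and the "<Role>Count" parameter naming are assumptions made up for the sketch), something along these lines keeps the logic in plain, unit-testable Python:

```python
class Action:
    """Local stand-in for a workflow engine's action base class (assumption)."""

    def run(self):
        raise NotImplementedError


class ScaleParametersAction(Action):
    """Hypothetical action: turn role counts into stack parameters."""

    def __init__(self, role_counts):
        self.role_counts = dict(role_counts)

    def run(self):
        # Logic like this would otherwise sit in the client; as an action it
        # is available to every caller of the workflow API.
        params = {}
        for role, count in sorted(self.role_counts.items()):
            if count < 0:
                raise ValueError("negative count for role %s" % role)
            params["%sCount" % role] = count
        return params


if __name__ == "__main__":
    print(ScaleParametersAction({"Controller": 3, "Compute": 2}).run())
```

In a real deployment the class would derive from the engine's own base class and be registered through an entry point; that wiring is left out here on purpose.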

> 
> Looking at that change, I see a bunch of business logic around taking
> our configuration and passing it to Mistral.  In order for us to do
> something like that and have a sustainable GUI, that code _has_ to
> live
> behind an API that the GUI and CLI alike can call.  If we ask the GUI
> to
> re-implement that code, then we're doomed to divergence between the
> CLI
> and GUI code and we'll most likely end up back where we are with a
> GUI
> that can't deploy half of our features because they were implemented
> solely with the CLI in mind and made assumptions the GUI can't meet.

The latest feedback I've gotten from working with the UI developers on
this was that we should have a workflow to create the environment. That
workflow would get called through the Mistral API by python-tripleoclient
and by any sort of UI you could imagine, and would essentially give us a
stable environment interface.

This would also allow us to version the types of Mistral environments
we create, for use with workflows that support the various versions
(should we choose to take it to this level).

Rather than focus on the environments mechanism, I meant this
prototype to be a demonstration of how we could call a workflow, how
the code would cleanly move out of python-tripleoclient and into
tripleo-common where it becomes a Mistral action, etc. I needed the
environment too... apologies for not taking the example further (I'm
working as quickly as I can).

Rest assured, the code to create the environment could easily be
implemented as a workflow API call, with validations and so on, and
it can be called by a UI or CLI in an equally useful fashion.
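
To make the "create the environment behind a validated call" idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the required keys, the version number, and the shape of the returned dictionary are invented and do not describe the prototype or any existing schema.

```python
ENVIRONMENT_VERSION = 1  # hypothetical schema version

REQUIRED_KEYS = ("container", "parameters")  # assumed input fields


def validate_environment_input(data):
    """Return a list of problems; an empty list means the input is fine."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in data:
            problems.append("missing key: %s" % key)
    if "parameters" in data and not isinstance(data["parameters"], dict):
        problems.append("parameters must be a mapping")
    return problems


def create_environment(name, data):
    """Validate the input, then build a versioned environment description."""
    problems = validate_environment_input(data)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "name": name,
        "version": ENVIRONMENT_VERSION,
        "variables": {
            "container": data["container"],
            "parameters": dict(data["parameters"]),
        },
    }
```

Because the validation sits behind the call rather than in a client, a CLI and a UI would hit the same checks and get the same errors.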

> 
> As I said, this is a really fundamental part of the argument for
> creating a REST API for TripleO.  A huge reason Tuskar UI didn't work
> was that it had to reimplement all of the logic in
> tripleoclient.  Two
> parallel implementations in different languages is not a sustainable
> model of development, and on top of that developers will always focus
> on
> the CLI, which can do a lot of things the UI can't.  That was the
> straw
> that broke Tuskar UI's back in the end - new features like network
> isolation and Ceph were designed for the CLI, and had requirements
> the
> UI simply couldn't meet in a sane fashion.

And that is the fundamental part of this for me as well. If you look
closely at my example you'll notice that I'm using an API for
everything (ignore the environment part for now please because as I
explained above the latest feedback is we'd rather use a workflow to
create that...). In my example python-tripleoclient calls the workflow
using the same API that we would also consume via a UI. Contrast this
with what we are actually implementing in tripleo-common today, where
we are initially calling the tripleo-common Python library directly.
I think the end goal is that we wouldn't do this, but we are for now...
so I would argue that in this regard my Mistral demo is actually a step
ahead of, not behind, where we want to be here.
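
The point about the CLI and a UI sharing one API path can be sketched with a toy model. The service class below is an in-memory stand-in, not Mistral, and the workflow name, inputs, and front-end functions are all hypothetical; the only thing it is meant to show is that both front ends reduce to the same call.

```python
class FakeWorkflowService:
    """In-memory stand-in for a workflow API (hypothetical, not Mistral)."""

    def __init__(self):
        self._workflows = {}
        self.calls = []

    def register(self, name, func):
        self._workflows[name] = func

    def execute(self, name, inputs):
        # Record the call so we can compare what each front end sent.
        self.calls.append((name, dict(inputs)))
        return self._workflows[name](**inputs)


def deploy_workflow(plan, timeout):
    """Placeholder for the real deployment logic living behind the API."""
    return {"plan": plan, "timeout": timeout, "state": "SUCCESS"}


def cli_deploy(service, argv):
    """CLI front end: positional arguments in, workflow call out."""
    plan, timeout = argv[0], int(argv[1])
    return service.execute("deploy", {"plan": plan, "timeout": timeout})


def ui_deploy(service, form):
    """UI front end: form fields in, the very same workflow call out."""
    inputs = {"plan": form["plan"], "timeout": int(form["timeout"])}
    return service.execute("deploy", inputs)


if __name__ == "__main__":
    service = FakeWorkflowService()
    service.register("deploy", deploy_workflow)
    print(cli_deploy(service, ["overcloud", "240"]))
    print(ui_deploy(service, {"plan": "overcloud", "timeout": "240"}))
```

Neither front end carries any deployment logic of its own, so there is nothing for a second implementation to diverge from.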

> 
> It's not like we undertook the task of writing an API lightly.  In
> fact,
> I initially argued against it myself, but after talking to the GUI
> folks
> it was explained that just sticking all of our code in a Python
> library
> doesn't actually solve their problems.  They need something they can
> talk to (read: a REST API) that can handle the business logic.  This
> is
> the problem the TripleO API was designed to solve, not simply the
> task
> of running some pre-defined OpenStack API calls.  Which is why one of
> my
> first points was "is not a workflow API".

Ben, I am working with UI developers. I'm listening to their needs and
developing API driven workflows to do the required steps for deploying
via a UI. I'm trying to prototype and demonstrate how quick and
easy it would be to wire those workflows up in such a manner that we
can use them from python-tripleoclient and/or any UI at the same time,
all via a generic stable workflow API. I would argue that it has been a
success.

The hard question being asked of TripleO now (in particular TripleO
cores) is: given all this, do we still want to go off and build our own
API? And if we do, what value, if any, do we get from it vs. a solution
like Mistral?

Dan


> 
> I realize I've now typed enough that everyone probably tuned out a
> few
> paragraphs ago, but I hope somewhere in that wall of text I've
> explained
> what I see as a disconnect between this proposal and what the TripleO
> API actually is.  There's a whole bunch more discussion that needs to
> happen beyond this, but I think until we're on the same page
> regarding
> the intent of the API we're not going to make meaningful progress
> here.
> 
> > 
> > > 
> > > 3) It ties us 100% to a given implementation.  If Mistral proves
> > > to
> > > be a
> > > poor choice for some reason, or insufficient for a particular use
> > > case,
> > > we have no alternative.  If we have an API and decide to change
> > > our
> > > implementation, nobody has to know or care.  This is kind of the
> > > whole
> > > point of having an API - it shields users from all the nasty
> > > implementation details under the surface.
> > 
> > 
> > Mistral's API is a generic workflow API. It is very much the same
> > layer
> > that I think we would get if we were to integrate with something
> > like
> > Ansible Tower... except that Mistral is part of OpenStack. It
> > integrates very nicely with OpenStack services and is very
> > customizable
> > with custom actions. The fact that Mistral sits much closer to
> > OpenStack and is essentially a light shim on top of it is to our
> > advantage (being TripleO). To think that we can build up a proxy
> > API in
> > such a manner that we might be able to swap in an entirely new
> > backend
> > (without even having a fully implemented backend yet to begin with)
> > is
> > for me a bit of a stretch. We've got a lot of "TripleO API"
> > maturing
> > before we'll get to this point. Which is why I lean towards using a
> > generic workflow API to accomplish the same task.
> > 
> > I actually think rather than shielding users we should be more
> > transparent about the actual workflows that are driving deployment.
> > Smaller more focused workflows that we string together to drive the
> > deployment.
> > 
> > > 
> > > 4) It raises the bar even further for both new deployers and
> > > developers.
> > >  You already need to have a pretty firm grasp of Puppet and Heat
> > > templates to understand how our stuff works, not to mention a
> > > decent
> > > understanding of quite a number of OpenStack services.
> > > 
> > > This presents a big chicken and egg problem for people new to
> > > OpenStack.
> > >  It's great that we're based on OpenStack and that allows people
> > > to
> > > peek
> > > under the hood and do some tinkering, but it can't be required
> > > for
> > > everyone.  A lot of our deployers are going to have little to no
> > > OpenStack experience, and TripleO is already a daunting task for
> > > those
> > > people (hell, it's daunting for people who _are_ experienced).
> > > 
> > 
> > And on the flipside you will get more of a community around using
> > an
> > OpenStack project than you ever would going off and building your
> > own
> > "Deployment/Workflow API". 
> > 
> > I would actually argue this is less of a deployers thing and more
> > of a
> > development tool choice. IMO most deployers will use python-
> > tripleoclient or some UI and not mistralclient directly. The code
> > I've
> > posted this week shows a prototype of just this, Mistral is swapped
> > in
> > such that you would never know it was involved because python-
> > tripleoclient works like it always did. Deployers use our CLI and
> > UI
> > tools like they always have, and developers gain a community of
> > Mistral
> > developers (and documentation) which they can interact with on
> > common
> > problems. Sounds like a win/win to me.
> > 
> > 
> > > 5) What does reimplementing all of our tested, well-understood
> > > Python
> > > into a new YAML format gain us?  This is maybe the biggest thing
> > > I'm
> > > missing from this whole discussion.  We lose a bunch of things
> > > (ease
> > > of
> > > transition from other Python projects, excellent existing testing
> > > framework, etc.), but what are we actually gaining other than the
> > > ability to say that we use N + 1 OpenStack services?  Because
> > > we're
> > > way
> > > past the point where "It's OpenStack deploying OpenStack" is
> > > sufficient
> > > reason for people to pay attention to us.  We need less "Ooh,
> > > neat"
> > > and
> > > more "Ooh, that's easy to use and works well."  It's still not
> > > clear
> > > to
> > > me that Mistral helps in any way with the latter.
> > 
> > Nobody suggested we reimplement everything. Much of the plan to
> > move
> > code into tripleo-common would stay. Instead of building our own
> > API
> > we'd just skip all that and focus on the code that is actually
> > about
> > our deployments in the form of custom Mistral actions and YAML
> > workflows.
> > 
> > The YAML workflows just tie together actions which are actually
> > all
> > written in Python. YAML works quite well for this and is a whole
> > lot
> > less verbose than writing everything we have in Python. There is a
> > reason Heat, Ansible, and Mistral use YAML for these things... and
> > I
> > think it works well. Understood you have an opinion on this, but I
> > don't share the view that everything works better when written in
> > Python. Take Puppet for example, we interface with that via Hiera.
> > 
> > People will pay attention because we'll be able to add features
> > faster.
> > By not having to build our own API and plumbing we can focus on
> > actual
> > problems rather than boilerplate Python API code.
> > 
> > > 
> > > 6) On the testing note, how do we test these workflows?  Do we
> > > know
> > > what
> > > happens when step X fails?  How do we test that they handle it
> > > properly
> > > in an automated and repeatable way?  In Python these are largely
> > > easy
> > > questions to answer: unit tests.  How do you unit test YAML? 
> > 
> > The actions are all unit testable Python.
> > 
> > The workflows themselves would all get tested as part of our CI.
> > With
> > Mistral workflows and the integration I'm proposing with both the
> > CLI
> > and UI we'd have the same API driven workflows tested in both
> > cases. We
> > don't short circuit the API and call into a library like we are
> > doing
> > today for tripleo-common.
> > 
> > 
> > >  This is a
> > > big reason I'm not even crazy about having Mistral on the back
> > > end of
> > > a
> > > TripleO API.  We'd be going from code that we can test and prove
> > > works
> > > in a variety of scenarios, to YAML that is tested and proven to
> > > work
> > > in
> > > exactly the three scenarios we run in CI.  This is basically the
> > > same
> > > situation we had with tripleo-incubator, and it was bad there
> > > too.
> > > 
> > > I dunno.  Maybe I'm too late to this party to have any impact on
> > > the
> > > discussion, but I very much do not like the direction we're going
> > > and
> > > I
> > > would be remiss if I didn't at least point out my concerns with
> > > it.
> > 
> > You aren't late to the party. But I would encourage you to look
> > closely
> > at the Mistral demos and examples that have been posted to
> > openstack-
> > dev before commenting further. Try them out, try Ansible (tower),
> > try
> > Mistral, and then come back and have a hard look at what we are
> > trying
> > to do by building our own TripleO API.
> > 
> > To me the crux of the problem isn't that we should expect other
> > projects to build our APIs for us. Rather it is using the right
> > tools
> > for the right jobs. TripleO has gotten off on the wrong path a few
> > times. We tried to roll our own config management tooling and that
> > didn't
> > work out so well. I hate to see us go down the path of trying to
> > write
> > our own deployment/workflow API when in fact we've already got
> > exactly what we need in OpenStack. And a community already
> > exists around it as well...
> > 
> > Dan
> > 
> > > 
> > > -Ben
> > > 
> > > On 01/13/2016 03:41 AM, Tzu-Mainn Chen wrote:
> > > > Hey all,
> > > > 
> > > > I realize now from the title of the other TripleO/Mistral
> > > > thread
> > > > [1] that
> > > > the discussion there may have gotten confused.  I think using
> > > > Mistral for
> > > > TripleO processes that are obviously workflows - stack
> > > > deployment,
> > > > node
> > > > registration - makes perfect sense.  That thread is exploring
> > > > practicalities
> > > > for doing that, and I think that's great work.
> > > > 
> > > > What I inappropriately started to address in that thread was a
> > > > somewhat
> > > > orthogonal point that Dan asked in his original email, namely:
> > > > 
> > > > "what it might look like if we were to use Mistral as a
> > > > replacement
> > > > for the
> > > > TripleO API entirely"
> > > > 
> > > > I'd like to create this thread to talk about that; more of a
> > > > 'should we'
> > > > than 'can we'.  And to do that, I want to indulge in a thought
> > > > exercise
> > > > stemming from an IRC discussion with Dan and others.  All,
> > > > please
> > > > correct me
> > > > if I've misstated anything.
> > > > 
> > > > The IRC discussion revolved around one use case: deploying a
> > > > Heat
> > > > stack
> > > > directly from a Swift container.  With an updated patch, the
> > > > Heat
> > > > CLI can
> > > > support this functionality natively.  Then we don't need a
> > > > TripleO
> > > > API; we
> > > > can use Mistral to access that functionality, and we're done,
> > > > with
> > > > no need
> > > > for additional code within TripleO.  And, as I understand it,
> > > > that's the
> > > > true motivation for using Mistral instead of a TripleO API:
> > > > avoiding custom
> > > > code within TripleO.
> > > > 
> > > > That's definitely a worthy goal... except from my perspective,
> > > > the
> > > > story
> > > > doesn't quite end there.  A GUI needs additional functionality,
> > > > which boils
> > > > down to: understanding the Heat deployment templates in order
> > > > to
> > > > provide
> > > > options for a user; and persisting those options within a Heat
> > > > environment
> > > > file.
> > > > 
> > > > Right away I think we hit a problem.  Where does the code for
> > > > 'understanding
> > > > options' go?  Much of that understanding comes from the
> > > > capabilities map
> > > > in tripleo-heat-templates [2]; it would make sense to me that
> > > > responsibility
> > > > for that would fall to a TripleO library.
> > > > 
> > > > Still, perhaps we can limit the amount of TripleO code.  So to
> > > > give
> > > > API
> > > > access to 'getDeploymentOptions', we can create a Mistral
> > > > workflow.
> > > > 
> > > >   Retrieve Heat templates from Swift -> Parse capabilities map
> > > > 
> > > > Which is fine-ish, except from an architectural perspective
> > > > 'getDeploymentOptions' violates the abstraction layer between
> > > > storage and
> > > > business logic, a problem that is compounded because
> > > > 'getDeploymentOptions'
> > > > is not the only functionality that accesses the Heat templates
> > > > and
> > > > needs
> > > > exposure through an API.  And, as has been discussed on a
> > > > separate
> > > > TripleO
> > > > thread, we're not even sure Swift is sufficient for our needs;
> > > > one
> > > > possible
> > > > consideration right now is allowing deployment from templates
> > > > stored in
> > > > multiple places, such as the file system or git.
> > > > 
> > > > Are we going to have duplicate 'getDeploymentOptions' workflows
> > > > for
> > > > each
> > > > storage mechanism?  If we consolidate the storage code within a
> > > > TripleO
> > > > library, do we really need a *workflow* to call a single
> > > > function?  Is a
> > > > thin TripleO API that contains no additional business logic
> > > > really
> > > > so bad
> > > > at that point?
> > > > 
> > > > My gut reaction is to say that proposing Mistral in place of a
> > > > TripleO API
> > > > is to look at the engineering concerns from the wrong
> > > > direction.  The
> > > > Mistral alternative comes from a desire to limit custom TripleO
> > > > code at all
> > > > costs.  I think that is an extremely dangerous attitude that
> > > > leads
> > > > to
> > > > compromises and workarounds that will quickly lead to a shaky
> > > > code
> > > > base
> > > > full of design flaws that make it difficult to implement or
> > > > extend
> > > > any
> > > > functionality cleanly.
> > > > 
> > > > I think the correct attitude is to simply look at the problem
> > > > we're
> > > > trying to solve and find the correct architecture.  For these
> > > > get/set
> > > > methods that the API needs, it's pretty simple: storage -> some
> > > > logic ->
> > > > a REST API.  Adding a workflow engine on top of that is
> > > > unneeded,
> > > > and I
> > > > believe that means it's an incorrect solution.
> > > > 
> > > > 
> > > > Thanks,
> > > > Tzu-Mainn Chen
> > > > 
> > > > 
> > > > 
> > > > [1] http://lists.openstack.org/pipermail/openstack-dev/2016-January/083757.html
> > > > [2] https://github.com/openstack/tripleo-heat-templates/blob/master/capabilities_map.yaml
> > > > 
> > > > __________________________________________________________________________
> > > > OpenStack Development Mailing List (not for usage questions)
> > > > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > > > 
> > > 
> > > 
> 


