[openstack-dev] [oaktree] Follow up to Multi-cloud Management in OpenStack Summit session
Joshua Harlow
harlowja at fastmail.com
Wed Nov 29 01:14:26 UTC 2017
Small side-question,
Why would this just be limited to openstack clouds?
Would it be?
Monty Taylor wrote:
> Hey everybody!
>
> https://etherpad.openstack.org/p/sydney-forum-multi-cloud-management
>
> I've CC'd everyone who listed interest directly, just in case you're not
> already on the openstack-dev list. If you aren't, and you are in fact
> interested in this topic, please subscribe and make sure to watch for
> [oaktree] subject headings.
>
> We had a great session in Sydney about the needs of managing resources
> across multiple clouds. During the session I pointed out the work that
> had been started in the Oaktree project [0][1] and offered that if the
> people who were interested in the topic thought we'd make progress best
> by basing the work on oaktree, that we should bootstrap a new core team
> and kick off some weekly meetings. This is, therefore, the kickoff email
> to get that off the ground.
>
> Everything below is my own thinking and a description of where we are
> right now. It should all be considered up for debate, except for two
> things:
>
> - gRPC API
> - backend implementation based on shade
>
> As those are the two defining characteristics of the project. For those
> who weren't in the room, justifications for those two characteristics are:
>
> gRPC API
> --------
>
> There are several reasons for choosing gRPC.
>
> * Make it clear this is not a competing REST API.
>
> OpenStack has a REST API already. This is more like a 'federation' API
> that knows how to talk to one or more clouds (similar to the kubernetes
> federation API)
>
> * Streaming and async built in
>
> One of the most costly things in using the OpenStack API is polling.
> gRPC is based on HTTP/2 and thus supports streaming and other exciting
> things. This means an oaktree running in or on a cloud can do its
> polling loops over the local network and the client can just either wait
> on a streaming call until the resource is ready, or can fire an async
> call and deal with it later on a notification channel.
>
> * Network efficiency
>
> Protobuf over HTTP/2 is a super-streamlined binary protocol, which
> should actually be really nice for our friends in Telco land who are
> using OpenStack for Edge-related tasks in 1000s of sites. All those
> roundtrips add up at scale.
>
> * Multi-language out of the box
>
> gRPC allows us to directly generate consistent consumption libs for a
> bunch of languages - or people can grab the proto files and integrate
> those into their own build if they prefer.
>
> * The cool kids are doing it
>
> To be fair, Jay Pipes and I tried to push OpenStack to use Protobuf
> instead of JSON for service-to-service communication back in 2010 - so
> it's not ACTUALLY a new idea... but with Google pushing it and support
> from the CNCF, gRPC is actually catching on broadly. If we're writing a
> new thing, let's lean forward into it.
>
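A rough sketch of the streaming point above, in plain Python rather than real gRPC. The generator does the polling close to the cloud and yields only the state changes, which is roughly the shape a server-streaming handler would take. `wait_for_server` and the canned status source are made-up names for illustration, not oaktree or shade APIs.

```python
import time


def wait_for_server(get_status, interval=0.0, timeout=5.0):
    """Poll get_status() locally and yield each change in status.

    Stops after yielding ACTIVE or ERROR, and raises TimeoutError if
    neither shows up before the timeout.
    """
    deadline = time.monotonic() + timeout
    last = None
    while time.monotonic() < deadline:
        status = get_status()
        if status != last:
            # Only changes go over the wire, not every poll result.
            yield status
            last = status
        if status in ('ACTIVE', 'ERROR'):
            return
        time.sleep(interval)
    raise TimeoutError('server did not become ready in time')


# Stand-in for a real cloud call: returns canned statuses in order.
canned = iter(['BUILD', 'BUILD', 'ACTIVE'])
updates = list(wait_for_server(lambda: next(canned)))
print(updates)
```

The client sees two messages here even though three polls happened. The saving grows with the number of polls that return an unchanged status.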
> Backend implementation in shade
> -------------------------------
>
> If the service is defined by gRPC protos, why not implement the service
> itself in Go or C++?
>
> * Business logic to deal with cloud differences
>
> Adding a federation API isn't going to magically make all of those
> clouds work the same. We've got that fairly well sorted out in shade and
> would need to reimplement basically all of shade in another language.
>
> * shade is battle tested at scale
>
> shade is what Infra's nodepool uses. In terms of high-scale API
> consumption, we've learned a TON of lessons. Much of the design inside
> of shade is the result of real-world scaling issues. It's Open Source,
> so we could obviously copy all of that elsewhere - but why? It exists
> and it works, and oaktree itself should be a scale-out shared-nothing
> kind of service anyway.
>
> The hard bits here aren't making API calls to 3 different clouds, the
> hard bits are doing that against 3 *different* clouds and presenting the
> results sanely and consistently to the original user.
>
> Proposed Structure
> ==================
>
> PTL
> ---
>
> As the originator of the project, I'll take on the initial PTL role.
> When the next PTL elections roll around, we should do a real election.
>
> Initial Core Team
> -----------------
>
> oaktree is still small enough that I don't think we need to be super
> protective - so I think if you're interested in working on it and you
> think you'll have the bandwidth to pay attention, let me know and I'll
> add you to the team.
>
> General rules of thumb I try to follow on top of normal OpenStack
> reviewing guidelines:
>
> * Review should mostly be about suitability of design/approach. Style
> issues should be handled by pep8/hacking (with one exception, see
> below). Functional issues should be handled with tests. Let the machines
> be machines and humans be humans.
>
> * Use followup patches to fix minor things rather than causing an
> existing patch to get re-spun and need to be re-reviewed.
>
> The one style exception ... I'm a big believer in not using visual
> indentation - but I can't seem to get pep8 or hacking to complain about
> its use. This isn't just about style - visual indentation causes more
> lines to be touched during a refactor than are necessary, making the
> impact of a change harder to see.
>
> good:
>
>     x = some_function(
>         with_some, arguments)
>
> bad:
>
>     x = some_function(with_some,
>                       arguments)
>
> If anyone can figure out how to write a hacking rule that enforces that
> I'll buy you a herd of chickens.
>
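One possible starting point for such a check, as a heuristic sketch only. It uses the standard `tokenize` module to flag any bracket that has content after it on the opening line and whose contents then continue onto a later line. `find_visual_indents` is a made-up name, and this is not wired into hacking or flake8. Turning it into a real rule is left open.

```python
import io
import tokenize

OPEN = ('(', '[', '{')
CLOSE = (')', ']', '}')


def find_visual_indents(source):
    """Return line numbers of brackets whose contents start on the
    bracket's own line but continue onto later lines."""
    flagged = []
    stack = []  # one [line, has_inline_content] entry per open bracket
    prev_open = False  # True when the previous token opened a bracket
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        is_break = tok.type in (tokenize.NL, tokenize.COMMENT)
        if prev_open and not is_break:
            # Something follows the bracket on the same line.
            stack[-1][1] = True
        prev_open = False
        if tok.type == tokenize.OP and tok.string in OPEN:
            stack.append([tok.start[0], False])
            prev_open = True
        elif tok.type == tokenize.OP and tok.string in CLOSE:
            if stack:
                stack.pop()
        elif tok.type == tokenize.NL and stack and stack[-1][1]:
            # A line break inside a bracket that already had content
            # on its opening line.
            if stack[-1][0] not in flagged:
                flagged.append(stack[-1][0])
    return flagged


good = "x = some_function(\n    with_some, arguments)\n"
bad = "x = some_function(with_some,\n                  arguments)\n"
print(find_visual_indents(good), find_visual_indents(bad))
```

Only the innermost open bracket is checked at each line break, so a nested hanging indent such as `foo(bar(` followed by a newline is not flagged.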
> Weekly Meeting
> --------------
>
> Let's give it a week or so to see who is interested in being in the
> initial core team so that we can figure out what timezones folks are in
> and pick a time that works for the maximum number of people.
>
> IRC Channel
> -----------
>
> oaktree development is closely related to shade development which is now
> in #openstack-sdks, so let's stay there until we get kicked out.
>
> Bugs/Blueprints
> ---------------
>
> oaktree uses storyboard [2][3]
>
> https://storyboard.openstack.org/#!/project/855
>
> oaktree tech overview
> =====================
>
> oaktree is a service that presents a gRPC API and that uses shade as its
> OpenStack connectivity layer. It's organized into two repos at the
> moment, oaktreemodel and oaktree. The intent is that anyone should be
> able to use the gRPC/protobuf definitions to create a client
> implementation. It is explicitly not the intent that there be more than
> one server implementation, since that would require reimplementing all
> of the hairy business logic that's in shade already.
>
> oaktreemodel contains the protobuf definitions, as well as the generated
> golang code. It is intended to provide a library that is easily pip
> installable by anyone who wants to build a client without them needing
> to add protoc steps. It contains build tooling to produce python, golang
> and C++. The C++ and python files are generated and included in the
> source sdist artifacts. Since golang uses git for consumption, the
> generated golang files are committed to the oaktreemodel repo.
>
> oaktreemodel has a more complex build toolchain, but is organized so
> that only oaktreemodel devs need to deal with it. People consuming
> oaktreemodel should not need to know anything about how it's built - pip
> install oaktreemodel or go get
> https://git.openstack.org/openstack/oaktreemodel should Just Work with
> no additional effort on the part of the programmer.
>
> oaktree contains the server implementation and depends on oaktreemodel.
> It's mostly a thin shim layer mapping gRPC stubs to shade calls. Much of
> the logic that needs to exist for oaktree to work wants to live in
> shade, but I'm sure we'll find places where that's not true.
>
> Ultra Short-Term TODO/in-progress
> =================================
>
> Fix Gate jobs for Zuul v3 (mordred)
> -----------------------------------
>
> We have devstack functional gate jobs - but they haven't been updated
> since the Zuul v3 migration. Duong Ha-Quang submitted a patch [4] to
> migrate the legacy jobs to in-tree. We need to get that fixed up, then
> migrate the job to use the new fancy devstack base job.
>
> I'll get this all fixed ASAP so that it's easy for folks to start
> hacking on patches.
>
> I'm working on this one.
>
> A patch for oaktreemodel is in flight [5]. We still need a patch to
> oaktree to follow up, which I have half-finished. I'll get it up once
> the oaktreemodel patch is green.
>
> Short-Term TODOs
> ================
>
> Expose more things
> ------------------
>
> shade has *way* more capabilities than oaktree, which is mostly a matter
> of writing some proto definitions for resources that match the 'strict'
> version of shade's data model. In some cases it might mean that we need
> to define a data model contract in shade too... but by and large picking
> things and adding them is a great way to get familiar with all the
> pieces and how things flow together.
>
> We should also consider whether or not we can do any meta-programming to
> map shade calls into oaktree calls automatically. For now I think we
> should be fine with just having copy-pasta boilerplate until we
> understand enough about the patterns to abstract them - but we SHOULD be
> able to do some work to reduce the boilerplate.
>
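A minimal sketch of what the meta-programming idea above might look like, assuming handlers can be generated from a naming convention. `FakeCloud` is a stand-in that returns canned data. Its method names mirror shade's `list_*` naming, but `make_list_handler` and the `List<Resource>` handler names are hypothetical, and a real handler would build protobuf messages instead of dicts.

```python
class FakeCloud:
    """Stand-in for a shade cloud object; returns canned data."""

    def list_flavors(self):
        return [{'id': 'f1', 'name': 'small'}]

    def list_images(self):
        return [{'id': 'i1', 'name': 'ubuntu'}]


def make_list_handler(cloud, resource):
    """Build one handler that forwards to cloud.list_<resource>()."""
    backend_call = getattr(cloud, 'list_' + resource)

    def handler(request=None):
        # A real handler would convert each item into a protobuf
        # message; plain dicts keep the sketch dependency-free.
        return [dict(item) for item in backend_call()]

    handler.__name__ = 'List' + resource.capitalize()
    return handler


cloud = FakeCloud()
handlers = {
    name: make_list_handler(cloud, name)
    for name in ('flavors', 'images')
}
print(handlers['flavors']())
```

Each new resource then costs one entry in the tuple rather than a copy of the handler body, as long as the resource really does follow the common pattern.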
> Write better tests
> ------------------
>
> There are gate jobs at the moment and a tiny smoke-test script. We
> should add some functional tests for python, go and C++ in the
> oaktreemodel repo.
>
> I'm not sure a TON of unittests in oaktreemodel will be super useful -
> however, some simple tests that verify we haven't borked something in
> the protos that cause code to be generated improperly would be great. We
> can do those just making sure we can create the proto objects and
> whatnot without needing an actual server running.
>
> Unittests in oaktree itself are likely to have very little value. We can
> always add more requests-mock unittests to shade/python-openstacksdk. I
> think we should focus more on functional tests and on making sure those
> tests can run against not just devstack.
>
> Shift calling interface from shade to python-openstacksdk
> ---------------------------------------------------------
>
> oaktree doesn't need historical compat, so we can go ahead and start
> using python-openstacksdk. Our tests will be cross-testing with master
> branch commits rather than releases right now anyway.
>
> Add Java and Ruby build plumbing to oaktreemodel
> ------------------------------------------------
>
> Protobuf/gRPC also supports Java and Ruby, so we should plumb them
> through too.
>
> Parallel Multicloud APIs
> ------------------------
>
> The existing APIs allow for multi-cloud consumption from the same
> connection via a Location object used as a parameter to calls.
> Additionally, shade adds a Location property to every object returned,
> so all shade objects carry the information needed to verify uniqueness.
>
> However, when considering actions like:
>
> "I want a list of all of my servers on all of my clouds"
>
> the answer is currently an end-user for-loop. We should add calls to
> shade for each of the list/search/get API calls that fetch from all of
> the available cloud regions in parallel and then combine the results
> into a single result list.
>
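The parallel fetch described above could be sketched like this, using only the standard library and canned data in place of real clouds. `list_everywhere` is a made-up name, and the `location` dict is a simplified stand-in for the Location information shade attaches to each object.

```python
from concurrent.futures import ThreadPoolExecutor


def list_everywhere(clouds, list_call):
    """Run list_call(cloud) for every cloud in parallel and merge results.

    clouds maps a cloud name to whatever object list_call knows how to
    use. Each returned item is tagged with the cloud it came from.
    """
    def fetch(name):
        return [
            dict(item, location={'cloud': name})
            for item in list_call(clouds[name])
        ]

    names = sorted(clouds)
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        # map() keeps results in the same order as names.
        per_cloud = list(pool.map(fetch, names))
    return [item for items in per_cloud for item in items]


# Canned data standing in for two real clouds.
fake_clouds = {
    'cloud-a': [{'id': '1', 'name': 'web'}],
    'cloud-b': [{'id': '1', 'name': 'db'}, {'id': '2', 'name': 'cache'}],
}
servers = list_everywhere(fake_clouds, lambda cloud: cloud)
print([(s['location']['cloud'], s['id']) for s in servers])
```

Both fake clouds have a server with id `1`, which is why the location tag matters: the id alone is not unique across clouds.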
> We should also think about a design for multi-cloud creates and which
> calls they make sense for. Things like image and flavor immediately come
> to mind, as having consistent image and flavors across cloud regions is
> important.
>
> Both of those are desired features at the shade layer, so designing and
> implementing them will work great there ... but working on adding them
> to shade and exposing them in oaktree at the same time will help inform
> what shape of API at the shade layer serves the oaktree layer the best.
>
> Add REST escape hatch
> ---------------------
>
> There are PLENTY of things that will never get added to oaktree
> directly, especially things that are deployment/vendor-backend specific.
> One of the things discussed in Sydney was adding an API call to oaktree
> that would return a Protobuf that contains the root URL for a given
> service along with either a token, a list of HTTP Headers to be used for
> auth or both. So something like:
>
>     conn = oaktreemodel.Connect()
>     rest_info = conn.get_rest_info(
>         location=Location(cloud='example.com', service_type='compute'))
>     servers = requests.get(
>         rest_info.url + '/servers',
>         headers=rest_info.headers).json()
>
> or, maybe that's the gRPC call and there is a call in each language's
> client lib that returns a properly constructed rest client...
>
>     conn = oaktreemodel.Connect()
>     compute = conn.get_adapter(
>         location=Location(cloud='example.com', service_type='compute'))
>     servers = compute.get('/servers').json()
>
> *waves hands* - needs to be thought about, designed and implemented.
>
> Medium Term TODOs
> =================
>
> Authentication
> --------------
>
> oaktree is currently not authenticated. It works great on a laptop or in
> a location that's locked down through some other means, which should be
> fine for the first steps of the telco/edge use case, as well as for the
> developer use case getting started with it - but it's obviously not
> suitable for a multi-user service. The thinking thus far has been to NOT
> use keystone for auth, since that introduces the need for having a gRPC
> auth plugin for clients, as well as doing some sort of REST/gRPC dance.
>
> BUT - whether that's the right choice and what the right choice actually
> is is an open question on purpose - getting input from the operators on
> what mechanism works best is important. Maybe making a keystone gRPC
> auth driver and using keystone is the right choice. Maybe it isn't.
> Let's talk about it.
>
> Authorization
> -------------
>
> Since it's currently only a single-user service, it operates off of a
> pre-existing local clouds.yaml to define which clouds it has access to.
> Long-term one can imagine that one would want to authorize an oaktree to
> talk to a particular cloud-region in some manner. This needs to be
> designed.
>
> Multi-user Caching
> ------------------
>
> oaktree currently uses the caching support in shade for its caching.
> Although it is based on dogpile.cache which means it has support for
> shared backends like redis or memcached, it hasn't really been vetted
> for multi-user sharing a single cache. It'll be fine for the next 6-9
> months, but once we go multi-user I'd be concerned about it - so we
> should consider the caching layer design.
>
> shade oaktreemodel backend
> --------------------------
>
> In an ultimate fit of snake eating its own tail, we should add support
> to shade for making client connections to an oaktree if one exists. This
> should obviously be pretty direct passthrough. That would mean that an
> oaktree talking to another oaktree would be able to do so via the gRPC
> layer without any intermediate protobuf-to-dict translation steps.
>
> That leads us to potentially just using the oaktreemodel protobuf
> objects as the basis for the in-memory resource objects inside of
> sdk/shade - but that's inception-y enough that we should just skip it
> for now. If protobuf->json translations are what's killing us, that's a
> great problem to have.
>
> Timetable
> =========
>
> I think we should aim for having something that's usable/discussable for
> the single/trusted-user use case for real work (install an oaktree
> yourself pointed at a clouds.yaml file and talk to it locally without
> auth) by the Dublin PTG. It doesn't have to do everything, but we should
> at least have a sense of whether this will solve the needs of the people
> who were interested in this topic so that we'll know whether figuring
> out the auth story is worthwhile or if this is all a terrible idea.
>
> I think it's TOTALLY reasonable that by Vancouver we should have a thing
> that's legit usable for folks who have the pain point today (given the
> auth constraint)
>
> If that works out, discuss auth in Vancouver and aim to have it figured
> out and implemented by Berlin so that we can actually start pushing
> clouds to include oaktree in their deployments.
>
> Conclusion
> ==========
>
> Ok. That's the braindump from me. Let me know if you wanna dive in,
> we'll get a core team fleshed out and an IRC meeting set up and folks
> can start cranking.
>
> Thanks!
> Monty
>
> [0] http://git.openstack.org/cgit/openstack/oaktree
> [1] http://git.openstack.org/cgit/openstack/oaktreemodel
> [2] https://storyboard.openstack.org/#!/project/855
> [3] https://storyboard.openstack.org/#!/project/856
> [4] https://review.openstack.org/#/c/512561/
> [5] https://review.openstack.org/#/c/492531/
>