[OpenStack-Infra] On the subject of HTTP interfaces and Zuul
Paul Belanger
pabelanger at redhat.com
Fri Jun 9 19:54:31 UTC 2017
On Fri, Jun 09, 2017 at 12:35:59PM -0700, Clark Boylan wrote:
> On Fri, Jun 9, 2017, at 09:22 AM, Monty Taylor wrote:
> > Hey all!
> >
> > Tristan has recently pushed up some patches related to providing a Web
> > Dashboard for Zuul. We have a web app for nodepool. We already have the
> > Github webhook receiver which is inbound http. There have been folks who
> > have expressed interest in adding active-REST abilities for performing
> > actions. AND we have the new websocket-based log streaming.
> >
> > We're currently using Paste for HTTP serving (which is effectively
> > dead), autobahn for websockets and WebOB for request/response processing.
> >
> > This means that before we get too far down the road, it's probably time
> > to pick how we're going to do those things in general. There are 2
> > questions on the table:
> >
> > * HTTP serving
> > * REST framework
> >
> > They may or may not be related, and one of the options on the table
> > implies an answer for both. I'm going to start with the answer I think
> > we should pick:
> >
> > *** tl;dr ***
> >
> > We should use aiohttp with no extra REST framework.
> >
> > Meaning:
> >
> > - aiohttp serving REST and websocket streaming in a scale-out tier
> > - talking RPC to the scheduler over gear or zk
> > - possible in-process aiohttp endpoints for k8s style health endpoints
> >
> > Since we're talking about a web scale-out tier anyway, I think we should
> > just have a single web tier for zuul and nodepool. This continues the
> > thinking that nodepool is a component of Zuul.
>
> I'm not sure that this is a great idea. We've already seen that people
> have wanted to use nodepool without a Zuul and even without performing
> CI. IIRC paul wanted to use it to keep a set of Asterisk servers floating
> around, for example. We've also seen that people want to use
> subcomponents of nodepool to build and manage a set of images for clouds
> without making instances.
>
Ya, Asterisk use case aside, I think image build as a service is a prime example
of something nodepool could be great at on its own, especially now that
nodepool-builder is scaling out very well with zookeeper.
> In the past we have been careful to keep logical tools separate which
> has made it easy for us to add new tools and remove old ones.
> Operationally this may be perceived as making things more difficult for a
> newcomer, but it makes life much much better 3-6 months down the road.
>
> >
> > In order to write zuul jobs, end-users must know what node labels are
> > available. A zuul client that says "please get me a list of available
> > node labels" could make sense to a user. As we get more non-OpenStack
> > users, those people may not have any concept that there is a separate
> > thing called "nodepool".
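> >
> > Just to make that concrete, such a client call could be as simple as
> > the sketch below (the /api/labels endpoint and its fields are made up
> > purely for illustration):
> >
> > # rough sketch - endpoint name and fields are hypothetical
> > import json
> > import urllib.request
> >
> > url = 'https://zuul.example.com/api/labels'
> > with urllib.request.urlopen(url) as resp:
> >     labels = json.loads(resp.read().decode('utf-8'))
> >
> > for label in labels:
> >     print(label['name'])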
> >
> > *** The MUCH more verbose version ***
> >
> > I'm now going to outline all of the thoughts and options I've had or
> > have heard other people say. It's an intentionally complete list - there
> > are ideas in here you might find silly/bad. But since we're picking a
> > direction, I think it's important we consider the options in front of us.
> >
> > This will cover 3 http serving options:
> >
> > - WSGI
> > - aiohttp
> > - gRPC
> >
> > and 3 REST framework options:
> >
> > - pecan
> > - flask-restplus
> > - apistar
> >
> > ** HTTP Serving **
> >
> > WSGI
> >
> > The WSGI approach is one we're all familiar with and it works with
> > pretty much every existing Python REST framework. For us I believe if we
> > go this route we'd want to serve it with something like uwsgi and
> > Apache. That adds the need for an Apache layer and/or a uwsgi management
> > process. However, it means we can make use of normal tools we all likely
> > know at least to some degree.
>
> FWIW I don't think Apache would be required. uWSGI is a fairly capable
> http server aiui. You can also pip install uwsgi so the simple case
> remains fairly simple I think.
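>
> As a rough sketch (the module path and port here are made up), a
> trivial WSGI callable served straight from uWSGI looks something like:
>
> # zuul/web/wsgi.py - hypothetical module path
> import json
>
> def application(environ, start_response):
>     # answer every request with a bit of JSON, just to show the shape
>     body = json.dumps({'zuul_version': 'unknown'}).encode('utf-8')
>     start_response('200 OK', [('Content-Type', 'application/json'),
>                               ('Content-Length', str(len(body)))])
>     return [body]
>
> # then, with no Apache involved:
> #   pip install uwsgi
> #   uwsgi --http :8001 --wsgi-file zuul/web/wsgi.py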
>
> >
> > A downside is that we'll need to continue to handle our Websockets work
> > independently (which is what we're doing now).
> >
> > Because it's in a separate process, the API tier will need to make
> > requests of the scheduler over a bus, which could be either gearman or
> > zk.
> >
>
> Note that OpenStack has decided that this is a better solution than
> using web servers in the python process. That doesn't necessarily mean
> it is the best choice for Zuul, but it seems like there is a lot we can
> learn from the choice to switch to WSGI in OpenStack.
>
> > aiohttp
> >
> > Zuul v3 is Python3, which means we can use aiohttp. aiohttp isn't
> > particularly compatible with the REST frameworks, but it has built-in
> > route support and helpers for receiving and returning JSON. We don't
> > need ORM mapping support, so the only thing we'd really be MISSING from
> > REST frameworks is auto-generated documentation.
> >
> > aiohttp also supports websockets directly, so we could port the autobahn
> > work to use aiohttp.
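> >
> > For a flavour of what that looks like (the route names here are
> > invented, not a proposal):
> >
> > from aiohttp import web
> >
> > async def status(request):
> >     # JSON helpers are built in
> >     return web.json_response({'pipelines': []})
> >
> > async def console_stream(request):
> >     # websockets are built in too
> >     ws = web.WebSocketResponse()
> >     await ws.prepare(request)
> >     async for msg in ws:
> >         # e.g. client sends a build UUID, we stream log lines back
> >         await ws.send_str('streaming for %s' % msg.data)
> >     return ws
> >
> > app = web.Application()
> > app.router.add_get('/status.json', status)
> > app.router.add_get('/console-stream', console_stream)
> > web.run_app(app, port=9000)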
> >
> > aiohttp can be run in-process in a thread. However, websocket
> > log-streaming is already a separate process for scaling purposes, so if
> > we decide that a single implementation backend is valuable, it probably
> > makes sense to just stick the web tier in the websocket scale-out
> > process anyway.
> >
> > However, we could probably write a facade layer with a gear backend and
> > an in-memory backend, so that simple installs could just run the
> > in-process version while scale-out remains possible for larger installs
> > (like us).
> >
> > Since aiohttp can be in-process, it also allows us to easily add some
> > '/health' endpoints to all of our services directly, even if they aren't
> > intended to be publicly consumable. That's important for running
> > comfortably inside of things like kubernetes, which likes to check in on
> > the health status of services to decide about rescheduling them. This
> > way we could add a simple thread to the scheduler and the executors and
> > the mergers and the nodepool launchers and builders that adds a
> > '/health' endpoint.
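> >
> > A sketch of what that in-process thread might look like (nothing
> > zuul-specific in here, just the general pattern):
> >
> > import asyncio
> > import threading
> > from aiohttp import web
> >
> > def start_health_thread(port=8080):
> >     # run a tiny aiohttp app on its own event loop in a daemon
> >     # thread, next to whatever the service is already doing
> >     def _run():
> >         loop = asyncio.new_event_loop()
> >         asyncio.set_event_loop(loop)
> >
> >         async def health(request):
> >             return web.json_response({'status': 'ok'})
> >
> >         app = web.Application()
> >         app.router.add_get('/health', health)
> >         loop.run_until_complete(
> >             loop.create_server(app.make_handler(), '0.0.0.0', port))
> >         loop.run_forever()
> >
> >     t = threading.Thread(target=_run, daemon=True)
> >     t.start()
> >     return t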
> >
>
> See above. OpenStack has decided this is the wrong route to take
> (granted with eventlet and python2.7 not asyncio and python3.5). There
> are scaling and debugging challenges faced when you try to run an
> in-process web server.
>
> > gRPC / gRPC-REST gateway
> >
> > This is a curve-ball. We could define our APIs using gRPC. That gets us
> > a story for an API that is easily consumable by all sorts of clients,
> > and that supports exciting things like bi-directional streaming
> > channels. gRPC isn't (yet) consumable directly in browsers, nor does
> > Github send gRPC webhooks. BUT - there is a REST Gateway for gRPC:
> >
> > https://github.com/grpc-ecosystem/grpc-gateway
> >
> > that generates HTTP/1.1+JSON interfaces from the gRPC descriptions and
> > translates between protobuf and json automatically. The "REST" interface
> > it produces does not support url-based parameters; everything is done in
> > payload bodies, so it's:
> >
> > GET /nodes
> > {
> >     'id': 1234
> > }
> >
> > rather than
> >
> > GET /nodes/1234
> >
> > but that's still totally fine - and totally works for both status.json
> > and GH webhooks.
> >
> > The catch is - grpc-gateway is a grpc compiler plugin that generates
> > golang code. So we'd either have to write our own plugin that does the
> > same thing but for generating python code, or we'd have to write our
> > gRPC/REST layer in go. I betcha folks would appreciate it if we implemented
> > the plugin for python, but that's a long tent-pole for this purpose, so I
> > don't honestly think we should consider it. Therefore, we should
> > consider that using gRPC + gRPC-REST implies writing the web-tier in go.
> > That obviously implies an additional process that needs to talk over an
> > RPC bus.
> >
> > There are clear complexity costs involved with adding a second language
> > component, especially WRT deployment (pip install zuul would not be
> > sufficient). OTOH - it would open the door to using protobuf-based
> > objects for internal communication, and would open the door for rich
> > client apps without REST polling, and also potentially nice Android apps
> > (gRPC works great for mobile apps). Still, I think the deployment
> > complexity makes it a hard sell.
> >
> > THAT SAID - there are only 2 things that explicitly need REST over HTTP
> > 1.1 - that's the github webhooks and status.json. We could write
> > everything in gRPC except those two. Browser support for gRPC is coming
> > soon (they've moved from "someone is working on it" to "contact us about
> > early access") so status.json could move to being pure gRPC as well ...
> > and the webhook endpoint is pretty simple, so just having it be an
> > in-process aiohttp handler isn't a terrible cost. So if we thought
> > "screw it, let's just gRPC and not have an HTTP/1.1 REST interface at
> > all" - we can stay all in python and gRPC isn't a huge cost at that
> > point.
> >
> > gRPC doesn't handle websockets - but we could still run the gRPC serving
> > and the websocket serving out of the same scale-out web tier.
> >
>
> Another data point for gRPC is that the etcd3 work in OpenStack found
> that the existing python lib(s) for grpc don't play nice with eventlet
> or asyncio or anything that isn't Thread()
> (https://github.com/grpc/grpc/issues/6046 is the bug tracking that I
> think). This would potentially make the use of asyncio elsewhere
> (websockets) more complicated.
>
> > ** Summary
> >
> > Based on the three above, it seems like we need to think about a
> > separate web tier regardless of choice. The one option that doesn't
> > strictly require a separate tier is also the one that lets us align on
> > websockets, so co-location there would be simple.
> >
> > aiohttp seems like the cleanest forward path. It'll require reworking
> > the autobahn code (sorry Shrews) - but is nicely aligned with our
> > Python3 state. It's new - but it's not as totally new as gRPC is. And
> > since we'll already have some websockets stuff, we could also write
> > streaming websockets APIs for the things where we'd want that from gRPC.
> >
> > * REST Framework
> >
> > If we decide to go the WSGI route, then we need to talk REST frameworks
> > (and it's possible we decide to go WSGI because we want to use a REST
> > framework)
> >
>
> I'm not sure I understand why the WSGI and REST frameworks are being
> conflated. You can do one or the other or both and whichever you choose
> shouldn't affect the other too much aiui. There is even a flask-aiohttp
> lib.
>
> > The assumption in this case is that the websocket layer is a separate
> > entity.
> >
> > There are three 'reasonable' options available:
> >
> > - pecan
> > - flask-restplus
> > - apistar
> >
> > pecan
> >
> > pecan is used in a lot of OpenStack services and is also used by
> > Storyboard, so it's well known. Tristan's patches so far use Pecan, so
> > we've got example code.
> >
> > On the other hand, Pecan seems to be mostly only used in OpenStack land
> > and hasn't gotten much adoption elsewhere.
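> >
> > For reference, a pecan controller for something like status.json is
> > about this much code (a sketch, not Tristan's actual patch):
> >
> > import pecan
> > from pecan import expose
> >
> > class RootController(object):
> >     @expose('json')
> >     def status(self):
> >         # would really come from the scheduler over the RPC bus
> >         return {'pipelines': []}
> >
> > app = pecan.make_app(RootController())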
> >
> > flask-restplus
> >
> > https://flask-restplus.readthedocs.io/en/stable/
> >
> > flask is extremely popular for folks doing REST in Python.
> > flask-restplus is a flask extension that also produces Swagger Docs for
> > the REST api, and provides for serving an interactive swagger-ui based
> > browseable interface to the API. You can also define models using
> > JSONSchema. Those are not needed for simple cases like status.json, but
> > for a fuller REST API they might be nice.
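> >
> > For example (a sketch only - the model fields are invented), the
> > swagger docs and the JSON both come from the same declaration:
> >
> > from flask import Flask
> > from flask_restplus import Api, Resource, fields
> >
> > app = Flask(__name__)
> > api = Api(app, title='Zuul API', doc='/docs')
> >
> > label = api.model('Label', {
> >     'name': fields.String,
> >     'min_ready': fields.Integer,
> > })
> >
> > @api.route('/labels')
> > class Labels(Resource):
> >     @api.marshal_list_with(label)
> >     def get(self):
> >         # would really come from nodepool over the RPC bus
> >         return [{'name': 'ubuntu-xenial', 'min_ready': 1}]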
> >
> > Of course, in all cases we could simply document our API using swagger
> > and get the same thing - but that does involve maintaining model/api
> > descriptions and documentation separately.
> >
> > apistar
> >
> > https://github.com/tomchristie/apistar
> >
> > apistar is BRAND NEW and was announced at this year's PyCon. It's from
> > the author of Django REST Framework and is aimed at writing REST APIs
> > separate from Django.
> >
> > It's python3 from scratch - although it's SO python3 focused that it
> > requires python 3.6. This is because it makes use of type annotations:
>
> Type hinting is in python 3.5 and apistar's trove identifier things
> mention 3.5 support (not sure if that's actually the case though). But
> if so, 3.5 is far easier to target since it is available in more distros
> than just Arch and Tumbleweed (which is roughly where 3.6 stands).
>
> >
> > def show_request(request: http.Request):
> >     return {
> >         'method': request.method,
> >         'url': request.url,
> >         'headers': dict(request.headers)
> >     }
> >
> > def create_project() -> Response:
> >     data = {'name': 'new project', 'id': 123}
> >     headers = {'Location': 'http://example.com/project/123/'}
> >     return Response(data, status=201, headers=headers)
> >
> > and f'' strings:
> >
> > def echo_username(username):
> >     return {'message': f'Welcome, {username}!'}
> >
> > Python folks seem to be excited about apistar so far - but I think
> > python 3.6 is a bridge too far - it honestly introduces more deployment
> > issues than doing a golang gRPC layer would.
> >
> > ** Summary
> >
> > I don't think the REST frameworks offer enough benefit to justify their
> > use, or to justify adopting WSGI as our path forward.
>
> Yesterday SpamapS mentioned wanting to be able to grow the Zuul
> community. Just based on looking at the choices OpenStack is making
> (moving TO wsgi) and the general popularity of Flask in the python
> community, I think that you may want to consider both wsgi and flask,
> simply because they are tools that are known to scale reasonably well
> and that many people are familiar with.
>
> >
> > ** Thoughts on RPC Bus **
> >
> > gearman is a simple way to add RPC calls between an API tier and the
> > scheduler. However, we got rid of gear from nodepool already, and we
> > intend to get rid of gearman in v4 anyway.
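> >
> > The client side of that is pretty small - something like this sketch
> > (the 'zuul:status_get' function name is purely illustrative):
> >
> > import json
> > import time
> >
> > import gear
> >
> > client = gear.Client()
> > client.addServer('scheduler.example.com')
> > client.waitForServer()
> >
> > job = gear.Job(b'zuul:status_get', json.dumps({}).encode('utf-8'))
> > client.submitJob(job, timeout=30)
> > while not job.complete:
> >     time.sleep(0.1)
> > status = json.loads(job.data[0].decode('utf-8'))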
> >
> > If we use zk, we'll have to do a little bit more thinking about how to
> > do the RPC calls which will make this take more work. BUT - it means we
> > can define one API that covers both Zuul and Nodepool and will be
> > forward compatible with a v4 no-gearman world.
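> >
> > One way to do request/response over zk - a very hand-wavy sketch, and
> > the node paths are invented:
> >
> > import json
> > import uuid
> >
> > from kazoo.client import KazooClient
> >
> > zk = KazooClient(hosts='zk.example.com:2181')
> > zk.start()
> >
> > # the web tier drops a request node; the scheduler would watch the
> > # requests dir, do the work, and write a matching response node
> > req_id = uuid.uuid4().hex
> > zk.create('/zuul/rpc/requests/%s' % req_id,
> >           json.dumps({'op': 'status_get'}).encode('utf-8'),
> >           makepath=True)
> >
> > @zk.DataWatch('/zuul/rpc/responses/%s' % req_id)
> > def on_response(data, stat):
> >     if data:
> >         print(json.loads(data.decode('utf-8')))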
> >
> > We *could* use gearman in zuul and run an API in-process in nodepool.
> > Then we could take a page out of early Nova and do a proxy-layer in zuul
> > that makes requests of nodepool's API.
> >
> > We could just assume that there's gonna be an Apache fronting this stuff
> > and suggest deployment with routing to zuul and nodepool apis with
> > mod_proxy rules.
> >
> > Finally, as clarkb pointed out in response to the ingestors spec, we
> > could introduce MQTT and use it. I'm wary of doing that for this because
> > it introduces a hard-required new tech stack at a late stage.
>
> Mostly I was just pointing out that I think the vast majority of the
> infrastructure work to have something like a zuul ingestor is done. You
> just have to read from an mqtt connection instead of a gerrit ssh
> connection. Granted this does require running more services (mqtt server
> and the event stream handler) and doesn't handle entities like Github.
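>
> i.e. the consuming side is roughly this much code (a sketch - the broker
> host and topic are illustrative):
>
> import json
>
> import paho.mqtt.client as mqtt
>
> def on_message(client, userdata, msg):
>     # hand the event to the same code path that consumes the
>     # gerrit ssh event stream today
>     event = json.loads(msg.payload.decode('utf-8'))
>     print(msg.topic, event.get('type'))
>
> client = mqtt.Client()
> client.on_message = on_message
> client.connect('firehose.example.com', 1883)
> client.subscribe('gerrit/#')
> client.loop_forever()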
>
> That said, MQTT, unlike Gearman, seems to be seeing quite a bit of
> development activity due to the popularity of IoT. Gearman has worked
> reasonably well for us though so I don't think we need to just replace
> it to get in on the IoT bandwagon.
>
> >
> > Since we're starting fresh, I like the idea of a single API service that
> > RPCs to zuul and nodepool, so I like the idea of using ZK for the RPC
> > layer. BUT - using gear and adding gear worker threads back to
> > nodepool wouldn't be super-terrible, maybe.
>
> Nodepool hasn't had a Gearman-less release yet, so you don't have to
> worry about backward compat at least.
>
> >
> > ** Final Summary **
> >
> > As I tl;dr'd earlier, I think aiohttp co-located with the scale-out
> > websocket tier talking to the scheduler over zk is the best bet for us.
> > I think it's both simple enough to adopt and gets us a rich set of
> > features. It also lets us implement in-process simple health endpoints
> > on each service with the same tech stack.
>
> I'm wary of this simply because it looks a lot like repeating
> OpenStack's (now failed) decision to stick web servers in a bunch of
> python processes and then do cooperative multithreading with them along
> with all your application logic. It just gets complicated. I also think
> this underestimates the value of using tools people are familiar with
> (wsgi and flask), particularly if making it easy to jump in and building
> community is a goal.
>
> Clark
>
>
> _______________________________________________
> OpenStack-Infra mailing list
> OpenStack-Infra at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra