[OpenStack-Infra] Zuul v3: Giant pile of thoughts around zuul-web things

Monty Taylor mordred at inaugust.com
Wed Jul 26 17:50:48 UTC 2017


Based on having written the puppet to support the apache proxying to 
both github/status pages and the console-streaming, I believe we should 
accelerate moving the functions from the old webapp to the new zuul-web.

While the apache proxy layer keeps the moves from breaking our 
end-users, doing the moves will break anyone deploying zuul CD from git 
unless we tightly coordinate with them. As that's the way we 
prefer to run normally in Infra, it's reasonable to expect other people 
to desire running that way as well. While we can, in Infra, be careful 
when we land changes and then update the puppet url mapping before we 
restart the services, once there is another deployer it's hard to 
coordinate. So the web functions in the scheduler should be moved to 
zuul-web at the very least before we cut v3. I'd prefer to just go ahead 
and move them before Denver, because I'd rather not do the puppet dance 
after we go live.

I've written a first pass at doing this that doesn't move anything but 
that replaces webob with aiohttp. I did this when jlk was having issues 
with unicode in the github webhook out of curiosity and it wasn't that 
bad. I'll push that up in just a sec. Before we get to patches though, I 
keep saying I'm going to write up thoughts on how this wants to look in 
the end - so here they are:

The shape of what things want to look like
------------------------------------------

zuul-web wants to serve all http and websocket content. It's a stateless 
app, so it can be simply scaled-out and load-balanced as desired.

At the moment the http content is split between zuul-web and webapp and 
the urls need to be rewritten/proxied to be contiguous.

static assets want to be in a '/static' path and served by zuul-web - 
but all prefixed with '/static' so that it's easy for installations to 
serve static content directly with apache. The path "static" can 
certainly be debated - but it's working for now. The main point is that 
we want a portion of the URL space to be easily servable by a static web 
server for large installs, but also easily servable directly by zuul-web 
for small ones.

It needs to be possible to run zuul-web on a sub-url, such as 
https://ci.example.com/zuul - We should endeavor to properly detect this 
from the appropriate http headers, which we are not doing right now, but 
we should continue to provide a config setting so that an admin can set 
it if for some reason we can't detect it properly from headers.
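Something like this, perhaps (the helper and the specific headers it trusts 
are illustrative assumptions - which headers to honor is exactly the open 
detection question, and X-Forwarded-Prefix in particular is a proxy 
convention, not something zuul reads today):

```python
# Sketch: derive the externally-visible root URL for zuul-web from
# reverse-proxy headers, with a config override winning when detection
# can't be trusted. Header names here are common proxy conventions
# (assumptions), not an existing zuul behavior.

def get_root_url(headers, configured_root=None):
    if configured_root:
        # Admin-supplied override for when header detection fails.
        return configured_root.rstrip('/')
    proto = headers.get('X-Forwarded-Proto', 'http')
    host = headers.get('X-Forwarded-Host', headers.get('Host', 'localhost'))
    prefix = headers.get('X-Forwarded-Prefix', '').rstrip('/')
    return '%s://%s%s' % (proto, host, prefix)
```

So a request proxied from https://ci.example.com/zuul would carry 
X-Forwarded-Proto: https, X-Forwarded-Host: ci.example.com and 
X-Forwarded-Prefix: /zuul, and zuul-web could reconstruct its own base URL 
from those.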

REST endpoints fall into two categories - global and per-tenant. Most 
urls should be per-tenant, but some, like the github webhook listener, 
need to be global.

The url scheme served by zuul-web should be:

   /connection/{connection}
   /tenant/{tenant}

so -

   /connection/github/payload
   /tenant/openstack/status

** Question - should we shift them all down a tick to be under /api or 
similar?

The tenant namespace has the following endpoints:

   /tenant/{tenant}/console-stream
   /tenant/{tenant}/status
   /tenant/{tenant}/status/change/{change}
   /tenant/{tenant}/keys/{source}/{project}.pub

It needs to be possible to expose the websocket stream on a different 
url, such as zuul-web being on https://ci.example.com/zuul and websocket 
being on https://internal.example.com/zuul/console-stream.

We need to allow an admin to configure an override for the websocket 
location (this is already done).

Server-side rendering of html should be avoided, preferring instead 
serving html, css, images and javascript files from /static and having 
those fetch information from api endpoints on zuul-web. (more on this later)

For now the html/javascript web-content is in the zuul repo. Depending 
on how we wind up feeling about javascript tooling we may want to 
discuss splitting it into its own repo - but it's not actually necessary 
to do so. (again, more on javascript tooling later)

Actually moving things from scheduler to zuul-web
-------------------------------------------------

- make both a register-per-tenant and a register-global function for 
registering handlers

Most places want to register per-tenant and only per-tenant, but the 
github webhook wants to be global, so we at least need the ability. It's 
also POSSIBLE an admin might want to make an admin status console. We 
should not implement that until we have an actual need for it - but 
having the ability at registration time is important.
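To make the shape concrete, the registration plumbing could look roughly 
like this (class and method names are made up for illustration - none of 
this exists in zuul today):

```python
# Sketch: two registration entry points for web handlers. Per-tenant
# handlers get the /tenant/{tenant} prefix added for them; global
# handlers (like the github webhook) register their path as-is.

class RouteRegistry:
    def __init__(self):
        self.routes = []

    def register_global(self, path, handler):
        # e.g. the github webhook: /connection/github/payload
        self.routes.append((path, handler))

    def register_per_tenant(self, path, handler):
        # e.g. status: /status -> /tenant/{tenant}/status
        self.routes.append(('/tenant/{tenant}' + path, handler))
```

Drivers loaded by zuul-web would then call register_global or 
register_per_tenant as appropriate, and the actual http framework routes 
get built from self.routes in one place.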

- moving status to zuul-web

We will need a gearman function on the scheduler for getting the status 
info. We can just ship the entire status info from the scheduler to 
zuul-web and do caching/filtering there. Should be fairly trivial.

- move github webhook to zuul-web

Will need a gearman function in the github driver - essentially 
splitting the current webhook handler into the thing that receives the 
webhook and the thing that processes the webhook. The github plugin code 
is already arranged that way, so putting a gearman call in the middle 
should be DEAD simple. It just needs to plop the headers and json it 
gets from the webhook into the gearman payload. However, this will need:
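The payload half of that split is really just a json round-trip - roughly 
this (function names are illustrative, not existing zuul code):

```python
import json

# Sketch: the zuul-web half plops the webhook's headers and body into a
# gearman payload; the scheduler/driver half unpacks it and feeds the
# existing event-processing code. json keeps everything as explicit
# utf8 bytes on the wire.

def build_gearman_payload(headers, body):
    return json.dumps({
        'headers': dict(headers),
        'body': body,
    }).encode('utf8')

def unpack_gearman_payload(payload):
    data = json.loads(payload.decode('utf8'))
    return data['headers'], data['body']
```

The receiving worker still has the original headers available, so the 
existing signature-validation and event-dispatch code shouldn't need to 
change.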

- add plugin plumbing to zuul-web

The github plugin will need to register routes with the zuul-web process 
and not the scheduler. zuul-web should load drivers and connections (as 
does the scheduler and merger/executor) and its drivers can register the 
needed methods. Shouldn't be terrible.

The github plugin will also need to register a gearman worker function, 
so we probably want to add a registration function to the connection 
plugin interface to allow plugins to add worker functions.

- move keys endpoint to zuul-web

Same as the other two - gearman function to fetch the key data from the 
scheduler.

So in all - this is:

* convert scheduler/github http things to aiohttp in place
* add zuul-web plugin plumbing
* make three gearman functions
* move the web handlers to zuul-web and have them call the gearman functions

Javascript toolchain
--------------------

tl;dr - we should use yarn and webpack. I've got patches written.

We're growing web things. In addition to the status page, we now have 
the console streaming page. We have a desire to add something that knows 
how to render the job-output.json in some nice ways. And Tristan has 
written a nice Dashboard for historical data based on the SQL Reporter.

Basically - there is a Web UI, and it feels like it's time to go ahead 
and start managing it like an application. (we could even start 
unit-testing!)

Considerations for selecting javascript/css management tools:

* must be easy to understand for zuul devs / not assume deep 
understanding of the node ecosystem
* must not rely on 'vendoring' libraries into our source trees
* should be friendly for folks working from a distro POV
* should support minification/packing

From looking into things, all of the current/recommended tooling 
assumes a recent node install - just as pretty much all python things 
assume recent pip.

In Infra land we have a python/javascript app already, storyboard.

storyboard currently uses a combination of bower, npm and grunt. bower 
is for downloading non-javascript depends, npm for javascript, and grunt 
for building which includes minification and updating files to know how 
to find the downloaded dependencies.

The bower folks now recommend not using bower anymore in favor of 
yarn. yarn also handles the npm side, so it's no longer bower+npm - it's 
just yarn.

We don't have a ton of complex tasks we need to do - which is where 
things like gulp or grunt come in - we mostly need to take javascript, 
css and html, combine/minify/transpile and publish. The tool 'webpack' 
is recommended for this - and has the benefit of an actually 
legitimately excellent set of intro docs. (I read them in not too much 
time and actually wrote webpack config WITHOUT just complete copy-pasta)

yarn provides distro package repos for ubuntu/debian, 
centos/fedora/rhel, arch, solus and alpine making it pretty easy to get 
started with for those of us used to that flow. HOWEVER - many modules 
seem to want node >= 5 and xenial only has 4 - so the apt repo from 
nodesource also needs to be installed.

yarn then installs things locally into the source dir based on what's in 
package.json (and can manage the entries in that file as well), and 
writes a yarn.lock that can be committed to git. package.json and 
yarn.lock are similar to rust's Cargo.toml and Cargo.lock files. 
package.json records the logical depends - yarn.lock records exact 
versions including transitive depends so that builds are reproducible 
without needing to save copies of that code.

webpack itself is just another depend installed via yarn like any of the 
rest of the javascript depends.

webpack handles taking the code yarn downloads and "building" it. It 
provides support for "import" statements that are supported directly in 
newer javascript but not older, and it transpiles the code at build time 
so that the code works in older browsers. This means that one can write 
code that just does:

   import _ from 'lodash';

in the javascript. It has similar constructs for css and even images.

One of the other nice things is that it includes support for a local dev 
server with file watches and hot updating of changed code - this means 
you can load the status page in the dev server, change the css in a 
source code file and have it immediately show up in the live page 
without even having to refresh the page.

Some things require a page refresh - the jquery plugin for status is an 
example - and when you change those things the page gets refreshed for you.

These tools are only needed for build/dev tooling, not for running 
anything. Just like storyboard, the story for production is to produce a 
tarball of html/javascript artifacts and then upload those to the webserver.

However, this tooling is still potentially an issue at the distro 
packaging level. I'm going to go out on a limb (what with me working for 
a distro and all) and say that the tooling makes things nice enough that 
we should do this 'right' from a web community POV and figure out the 
packaging fallout. I've sent an email internally at RH to learn more 
about what the story is for dealing with Javascript ecosystem things, 
I'll let folks know what I learn.

Both tools are solid and not too hard to understand. I wrote patches 
supporting both status and streaming - and it honestly made it easier 
for me to hack on the javascript, test it live and debug issues when 
they arose. Given that there are also unit-testing frameworks - I'm 
personally completely sold.


