[OpenStack-Infra] Zuul v3: Giant pile of thoughts around zuul-web things

Tristan Cacqueray tdecacqu at redhat.com
Thu Jul 27 07:22:49 UTC 2017

On July 26, 2017 5:50 pm, Monty Taylor wrote:
> Based on having written the puppet to support the apache proxying to 
> both github/status pages and the console-streaming, I believe we should 
> accelerate moving the functions from the old webapp to the new zuul-web.
> While the apache proxy layer keeps the moves from breaking our 
> end-users, doing the moves will break anyone doing deployments of zuul 
> CD from git unless we tightly coordinate with them. As that's the way we 
> prefer to run normally in Infra, it's reasonable to expect other people 
> to desire running that way as well. While we can, in Infra, be careful 
> when we land changes and then update the puppet url mapping before we 
> restart the services, once there is another deployer it's hard to 
> coordinate. So the web functions in the scheduler should be moved to 
> zuul-web at the very least before we cut v3. I'd prefer to just go ahead 
> and move them before Denver, because I'd rather not do the puppet dance 
> after we go live.
> I've written a first pass at doing this that doesn't move anything but 
> that replaces webob with aiohttp. I did this when jlk was having issues 
> with unicode in the github webhook out of curiosity and it wasn't that 
> bad. I'll push that up in just a sec. Before we get to patches though, I 
> keep saying I'm going to write up thoughts on how this wants to look in 
> the end - so here they are:
> The shape of what things want to look like
> ------------------------------------------
> zuul-web wants to serve all http and websocket content. It's a stateless 
> app, so it can be simply scaled-out and load-balanced as desired.
> At the moment the http content is split between zuul-web and webapp and 
> the urls need to be rewritten/proxied to be contiguous.
> static assets want to be in a '/static' path and served by zuul-web - 
> but all prefixed with '/static' so that it's easy for installations to 
> serve static content directly with apache. The path "static" can 
> certainly be debated - but it's working for now. The main point is that 
> we want a portion of the URL space to be easily servable by a static web 
> server for large installs, but also easily servable directly by zuul-web 
> for small ones.
That sounds good. From a packaging perspective, it's better if the
html/js can be embedded and served by the zuul-web service.

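For the small-install case, here is a rough sketch of how zuul-web could
map anything under '/static' onto a directory itself. The function name,
the STATIC_PREFIX constant and the directory layout are made up for
illustration; this is not existing zuul code:

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote

STATIC_PREFIX = "/static/"  # assumed URL prefix, as discussed above


def resolve_static(url_path: str, static_root: Path) -> Optional[Path]:
    """Map a request path under /static/ to a file below static_root.

    Returns None when the path is outside the prefix, escapes the root
    directory, or does not name an existing regular file.
    """
    if not url_path.startswith(STATIC_PREFIX):
        return None
    relative = unquote(url_path[len(STATIC_PREFIX):])
    root = static_root.resolve()
    candidate = (root / relative).resolve()
    # Reject traversal attempts such as /static/../zuul.conf
    if root not in candidate.parents:
        return None
    return candidate if candidate.is_file() else None
```

A large install would skip this entirely and point apache at the same
directory for the '/static' portion of the URL space.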
> It needs to be possible to run zuul-web on a sub-url, such as 
> https://ci.example.com/zuul - We should endeavor to properly detect this 
> from the appropriate http headers which we are not doing right now, but 
> we should continue to provide a config setting so that admin can set it 
> if for some reason we can't detect it properly from headers.
For what it's worth, the ProxyPass directive already works with a sub-url;
here are my current settings:
ProxyPass /zuul3/static/ http://localhost:{{ web_port }}/static/ nocanon
ProxyPass /zuul3/console-stream ws://localhost:{{ web_port }}/console-stream nocanon
ProxyPass /zuul3/ http://localhost:{{ scheduler_port }}/ nocanon

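On the detection side, a minimal sketch of "config setting first, then
headers" could look like the following. The X-Forwarded-Prefix header is
an assumption: it is a common convention rather than a standard, and the
proxy has to be configured to send it. The function name is hypothetical:

```python
from typing import Mapping, Optional


def detect_url_prefix(headers: Mapping[str, str],
                      configured: Optional[str] = None) -> str:
    """Return the sub-url zuul-web is published under, e.g. '/zuul3'.

    An explicit config setting wins; otherwise fall back to a header set
    by the proxy; otherwise assume the app is served from the root.
    """
    # Assumes a case-sensitive mapping; real request objects usually
    # provide case-insensitive header lookup.
    raw = configured or headers.get("X-Forwarded-Prefix", "")
    prefix = "/" + raw.strip("/")
    return "" if prefix == "/" else prefix
```

With the settings above, a request proxied from /zuul3/ would yield
'/zuul3' as long as the proxy adds the header or the admin sets the
config value.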
> REST endpoints fall in to two categories - global and per-tenant. Most 
> urls should be per-tenant, but some, like the github webhook listener, 
> need to be global.
> The url scheme served by zuul-web should be:
>    /connection/{connection}
>    /tenant/{tenant}
> so -
>    /connection/github/payload
>    /tenant/openstack/status
> ** Question - should we shift them all down a tick to be under /api or 
> similar?
How about a /v3 prefix?

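To illustrate how a prefix would shift the whole scheme down a tick, here
is a small stdlib-only sketch that matches the two example urls with and
without a prefix. The helper names are hypothetical and this is not how
zuul-web routes requests today:

```python
import re
from typing import Dict, Optional, Tuple

# Route templates taken from the scheme quoted above.
ROUTES = [
    "/connection/{connection}/payload",
    "/tenant/{tenant}/status",
]


def compile_route(template: str, prefix: str = "") -> re.Pattern:
    """Turn '/tenant/{tenant}/status' into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", prefix + template)
    # Even indexes are literal text, odd indexes are placeholder names.
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(pattern)


def match(path: str, prefix: str = "") -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (template, params) for the first route matching path."""
    for template in ROUTES:
        found = compile_route(template, prefix).fullmatch(path)
        if found:
            return template, found.groupdict()
    return None
```

Calling match("/v3/tenant/openstack/status", prefix="/v3") resolves to the
status route with tenant "openstack", while the unprefixed path no longer
matches once a prefix is set.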
> The tenant namespace has the following endpoints:
>    /tenant/{tenant}/console-stream
>    /tenant/{tenant}/status
>    /tenant/{tenant}/status/change/{change}
>    /tenant/{tenant}/keys/{source}/{project}.pub
> It needs to be possible to expose the websocket stream on a different 
> url, such as zuul-web being on https://ci.example.com/zuul and websocket 
> being on https://internal.example.com/zuul/console-stream.
> We need to be able to allow an admin to configure an override for the 
> websocket location (this is already done)
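As a sketch of the override logic (hypothetical function name, and
assuming the default is simply the page url with a websocket scheme and
'/console-stream' appended):

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def console_stream_url(page_url: str, override: Optional[str] = None) -> str:
    """Pick the websocket URL the console-streaming page should open."""
    if override:
        # Admin-configured location always wins.
        return override
    parts = urlsplit(page_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = parts.path.rstrip("/") + "/console-stream"
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

Without an override, https://ci.example.com/zuul maps to a wss:// url on
the same host; with one, the admin's value is returned untouched.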
> Server-side rendering of html should be avoided, preferring instead 
> serving html, css, images and javascript files from /static and having 
> those fetch information from api endpoints on zuul-web. (more on this later)
> For now the html/javascript web-content is in the zuul repo. Depending 
> on how we wind up feeling about javascript tooling we may want to 
> discuss splitting it into its own repo - but it's not actually necessary 
> to do so. (again, more on javascript tooling later)
If zuul-web is going to serve the web-content, then it sounds better to
keep it in the zuul repo.
> Actually moving things from scheduler to zuul-web
> -------------------------------------------------
> - make both a register-per-tenant and a register-global function for 
> registering handlers
> Most places want to register per-tenant and only per-tenant, but the 
> github webhook wants to be global, so we at least need the ability. It's 
> also POSSIBLE an admin might want to make an admin status console. We 
> should not implement that until we have an actual need for it - but 
> having the ability at registration time is important.
> - moving status to zuul-web
> We will need a gearman function on the scheduler for getting the status 
> info. We can just ship the entire status info from the scheduler to 
> zuul-web and do caching/filtering there. Should be fairly trivial.
> - move github webhook to zuul-web
> Will need a gearman function in the github driver - essentially 
> splitting the current webhook handler into the thing that receives the 
> webhook and the thing that processes the webhook. The github plugin code 
> is already arranged that way, so putting a gearman call in the middle 
> should be DEAD simple. It just needs to plop the headers and json it 
> gets from the webhook into the gearman payload. However, this will need:
> - add plugin plumbing to zuul-web
> The github plugin will need to register routes with the zuul-web process 
> and not the scheduler. zuul-web should load drivers and connections (as 
> does the scheduler and merger/executor) and its drivers can register the 
> needed methods. Shouldn't be terrible.
> The github plugin will also need to register a gearman worker function, 
> so we probably want to add a registration function to the connection 
> plugin interface to allow plugins to add worker functions.
> - move keys endpoint to zuul-web
> Same as the other two - gearman function to fetch the key data from the 
> scheduler.
> So in all - this is:
> * convert scheduler/github http things to aiohttp in place
> * add zuul-web plugin plumbing
> * make three gearman functions
> * move the web handlers to zuul-web and have them call the gearman functions
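To make the receive/process split for the github webhook concrete, here
is a minimal sketch. A plain list stands in for the gearman job queue,
and the function names and payload layout are assumptions, not the
github driver's actual code:

```python
import json
from typing import List, Mapping

# Stand-in for a gearman job queue; a real deployment would submit the
# payload as a gearman job instead of appending to a list.
JobQueue = List[str]


def receive_webhook(headers: Mapping[str, str], body: bytes,
                    queue: JobQueue) -> None:
    """zuul-web side: wrap headers and JSON body, then hand them off."""
    payload = {
        "headers": dict(headers),
        "body": json.loads(body.decode("utf-8")),
    }
    queue.append(json.dumps(payload))


def process_webhook(job: str) -> str:
    """Scheduler side: unpack the job and decide what the event is."""
    payload = json.loads(job)
    event = payload["headers"].get("X-GitHub-Event", "unknown")
    action = payload["body"].get("action", "")
    return f"{event}:{action}"
```

The receiving half stays stateless, so it fits in zuul-web, while the
processing half only needs whatever arrives in the job payload.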
The sql reporter could also use REST endpoints to expose the
build/buildset information, that would add:


> Javascript toolchain
> --------------------
> tl;dr - we should use yarn and webpack. I've got patches written.
> We're growing web things. In addition to the status page, we now have 
> the console streaming page. We have a desire to add something that knows 
> how to render the job-output.json in some nice ways. And Tristan has 
> written a nice Dashboard for historical data based on the SQL Reporter.
> Basically - there is a Web UI, and it feels like it's time to go ahead 
> and start managing it like an application. (we could even start 
> unit-testing!)
And selenium testing too!

> Considerations for selecting javascript/css management tools:
> * must be easy to understand for zuul devs / not assume deep 
> understanding of the node ecosystem
> * must not rely on 'vendoring' libraries into our source trees
> * should be friendly for folks working from a distro POV
> * should support minification/packing
> From looking into things, all of the current/recommended tooling 
> assumes a recent node install - just as pretty much all python things 
> assume recent pip.
> In Infra land we have a python/javascript app already, storyboard.
> storyboard currently uses a combination of bower, npm and grunt. bower 
> is for downloading non-javascript depends, npm for javascript, and grunt 
> for building which includes minification and updating files to know how 
> to find the downloaded dependencies.
> The bower folks now recommend not using bower anymore in favor of 
> yarn. yarn also handles the npm side, so it's no longer bower+npm - it's 
> just yarn.
> We don't have a ton of complex tasks we need to do - which is where 
> things like gulp or grunt come in - we mostly need to take javascript, 
> css and html, combine/minify/transpile and publish. The tool 'webpack' 
> is recommended for this - and has the benefit of an actually 
> legitimately excellent set of intro docs. (I read them in not too much 
> time and actually wrote webpack config WITHOUT just complete copy-pasta)
> yarn provides distro package repos for ubuntu/debian, 
> centos/fedora/rhel, arch, solus and alpine making it pretty easy to get 
> started with for those of us used to that flow. HOWEVER - many modules 
> seem to want node >= 5 and xenial only has 4 - so the apt repo from 
> nodesource also needs to be installed.
> yarn then installs things locally into the source dir based on what's in 
> package.json (and can manage the entries in that file as well), and 
> writes a yarn.lock that can be committed to git. package.json and 
> yarn.lock are similar to Rust's Cargo.toml and Cargo.lock files. 
> package.json records the logical depends - yarn.lock records exact 
> versions including transitive depends so that builds are reproducible 
> without needing to save copies of that code.
> webpack itself is just another depend installed via yarn like any of the 
> rest of the javascript depends.
> webpack handles taking the code yarn downloads and "building" it. It 
> provides support for "import" statements that are supported directly in 
> newer javascript but not older, and it transpiles the code at build time 
> so that the code works in older browsers. This means that one can write 
> code that just does:
>    import _ from 'lodash';
> In the javascript. It has similar constructs for css and the images even.
> One of the other nice things is that it includes support for a local dev 
> server with file watches and hot updating of changed code - this means 
> you can load the status page in the dev server, change the css in a 
> source code file and have it immediately show up in the live page 
> without even having to refresh the page.
> Some things require a page refresh - the jquery plugin for status is an 
> example - and when you change those things the page gets refreshed for you.
> These tools are only needed for build/dev tooling, not for running 
> anything. Just like storyboard, the story for production is to produce a 
> tarball of html/javascript artifacts and then upload those to the webserver.
> However, this tooling is still potentially an issue at the distro 
> packaging level. I'm going to go out on a limb (what with me working for 
> a distro and all) and say that the tooling makes things nice enough that 
> we should do this 'right' from a web community POV and figure out the 
> packaging fallout. I've sent an email internally at RH to learn more 
> about what the story is for dealing with Javascript ecosystem things, 
> I'll let folks know what I learn.
From my limited packaging experience, the trick is to avoid bundling and
instead re-use already packaged libraries.

For example, the current architecture is relatively simple to package,
the static content can be minified using rcssmin/jsmin and
we can modify the index.html to use python-XStatic-Bootstrap-SCSS and
python-XStatic-jQuery provided by RDO.

On the other hand, bower/grunt/npm systems are more difficult to
integrate because, to make the final dist files, those systems usually
fetch many dependencies dynamically, which is harder to integrate in a
clean package that can be rebuilt without Internet access.

Ideally, it should be possible to assemble the zuul web-content
independently from the external dependencies. Similarly to the tox
requirements.txt, external bits could be defined/locked for the dev
environment, as long as it's easy for packagers to replace those
dependencies with distro versions.


> Both are solid and not too hard to understand. I wrote patches 
> supporting both status and streaming - and it honestly made it easier 
> for me to hack on the javascript, test it live and debug issues when 
> they arose. Given that there are also unittesting frameworks- I'm 
> personally completely sold.
