[all] [kolla] Kolla Builder - next generation

Mark Goddard mark at stackhpc.com
Tue Apr 21 08:59:08 UTC 2020

On Sat, 18 Apr 2020 at 15:10, Radosław Piliszek
<radoslaw.piliszek at gmail.com> wrote:
> Hello fellow OpenStackers and Kollars (or Koalas) in particular,
> today is the day when I finally sit down to write up my thoughts on
> Kolla, and I mean *Kolla*, the container image building project for
> OpenStack and friends, nothing more (nor less).

Thanks for writing this up, Radosław. It's good to engage in some
open-ended proposals and discussion from time to time.

> ***
> Some background on Kolla to get everyone on the same page (or close enough):
> Kolla builds images for production use. Kolla is upstream for TripleO
> containers and Kolla-Ansible deployment.
> These are container images, think OCI and Docker in particular since
> Kolla actually relies on Dockerfile format to specify build recipes
> and Docker runs building (TripleO runs buildah on the same recipes).
> Kolla supports three distributions as bases: CentOS (TripleO does
> override this for RHEL as well), Debian and Ubuntu.
> In the interim periods Kolla supports two distribution releases to
> ease/smooth the transition process for operators (like currently
> CentOS 7 and 8 in Train, while Ussuri is 8 only).
> What is more, each distro has two flavours (or 'types' as they are
> called for now): binary and source.
> This 'type' applies only to OpenStack software. Binary means that
> Kolla uses downstream packages (rpm/deb) while source means to use pip
> and install from official tarballs (or repos if it is master) and PyPI
> for deps, utilising venv to keep it separate from distro stuff.
> Finally, there is particular target architecture: x86_64, aarch64 and ppc64le.
> All of above affect the support matrix which is based on 3 or 4
> dimensions (distro, flavour, arch and sometimes distro version). [1]
> Kolla offers a high level of customisability via various overrides
> levels. See docs for details [2]
> Kolla engine offers hierarchical approach to image building, under the
> assumption that more than one image is often deployed on the same
> machine so layer sharing is beneficial.

It also makes the build more efficient if you build multiple images at once.

> Kolla helps collect sources for OpenStack projects to build desired
> versions of them.
> All recipes are templated using Jinja2 syntax.
> Images contain both run-time and, mostly in the case of 'source'
> flavour, build-time tools and libraries.
> *** That's it for the background. :-)
> So where is the problem you might ask? Oh, there are plenty.
> The general one is that Kolla has lots of logic in templates.
> Thankfully, Kolla has macros for most stuff but still Jinja2
> limitations make it hard to document exceptions (no inline comments in
> arrays anyone?).
> We lose visibility into real dependency graphs and may be easily
> reinstalling same stuff, any optimisations are limited and would
> require parsing of both Jinja2 and Dockerfile syntax.
> In general the current approach is ugly and becomes unwieldy (e.g.
> getting warnings about empty continuation lines).
> There is layering hell: dependency on long &&ed commands which are
> templated out to avoid useless layers.
> Support matrix is great for an overview.
> There is a hidden layer to it though, different combinations may
> support different features.
> This is not documented so far and not so easy to follow from sources.
> This stems from the fact that different extra components have
> different availability in distros, but this is not easily apparent
> from sources.
> Also, there is no way to turn feature on/off. You might do an
> override, but then again, which to keep, which not to? Which is which?
> Oh!
> Finally, the images tend to be heavy due to inclusion of build-time deps.

My multi-stage build PoC [1] suggests about 150MB could be saved for
source images. With shared image layers we only get that saving once,
in openstack-base, so this would have most effect in an environment
where images are built individually at different times, and may not
share layers.
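
To make the layer-sharing point concrete, here is a minimal arithmetic
sketch. The 150MB figure is the PoC estimate above; the function name and
the image count are made up for illustration.

```python
# Illustrative only: 150 MB is the rough PoC estimate quoted above,
# and the image count used below is an arbitrary example.
BUILD_DEPS_MB = 150


def saving_mb(num_source_images, layers_shared):
    """Disk saved on one host by dropping build-time deps from source images."""
    if num_source_images <= 0:
        return 0
    if layers_shared:
        # The build deps sit in the openstack-base layer, which is stored once.
        return BUILD_DEPS_MB
    # Independently built images each carry their own copy of that layer.
    return BUILD_DEPS_MB * num_source_images


print(saving_mb(20, layers_shared=True))   # 150
print(saving_mb(20, layers_shared=False))  # 3000
```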

> *** That's it for issues (hopefully!)
> My idea is to reuse Kolla engine where it shines: sources collection,
> system of plugins, hierarchical building; but replace the part that
> smells - Jinja2 templating.
> Giving up on Jinja2 might encompass giving up on Dockerfile syntax,
> but that is optional and depends on what makes it more amenable to
> avoid further pitfalls.
> With giving up on Jinja2, the idea is to generate building recipes
> fully programmatically from Python.

There would be some nice advantages to this, not least that it would
be Python rather than Jinja. It would also be easy to unit test.
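
A minimal sketch of what "generated from Python and unit testable" might
look like. The function and its parameters are hypothetical, not existing
Kolla code, and a real implementation would need to cover far more than
this.

```python
def render_dockerfile(parent, packages, commands, install_cmd):
    """Render a Dockerfile with a single RUN layer (hypothetical helper)."""
    lines = ["FROM " + parent]
    steps = []
    if packages:
        # Sort so the output is stable and easy to diff or assert on.
        steps.append(install_cmd + " " + " ".join(sorted(packages)))
    steps.extend(commands)
    if steps:
        # Chain all steps with && so they share one image layer.
        lines.append("RUN " + " \\\n    && ".join(steps))
    return "\n".join(lines) + "\n"


# A unit test is then an ordinary assertion on the rendered text.
rendered = render_dockerfile(
    parent="openstack-base",
    packages=["b-pkg", "a-pkg"],
    commands=["echo done"],
    install_cmd="dnf -y install",
)
assert rendered.startswith("FROM openstack-base\n")
assert "a-pkg b-pkg" in rendered
```

Because the recipe is plain data until render time, package ordering,
layer chaining and per-distro install commands can each be checked without
building an image.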

However, Jinja does give us some nice properties. Dockerfiles are
relatively WYSIWYG, and this keeps the barrier to contribution low.
Often the most that needs to be done to help a contributor get started
with kolla is to say 'here is the Dockerfile.j2 for the image you want
to change'.

As Sean rightly pointed out, we have wedded ourselves to Jinja by
exposing it as a customisation point, through blocks and overrides.
We'd need to replicate that.

A potential substitute for unit testing could be to generate every
supported combination of Dockerfiles, and keep them in the repository.
These would need to be updated with each change, but would allow us to
see the final content of the images for each distro. Even better than
this would be to use TestInfra [2] to validate various expectations
about image content (e.g. virtualenv uses py3.6, package foo is

> It would be possible to introduce "Features" - sets of packages to
> install based on the (distro, arch) tuple.
> This would result in more flexibility - turning them on/off (some
> could be optional, some not).
> There could be more than one optimization strategy regarding when
> packages get installed: you want only standalone blah-blah? Then Kolla
> won't be installing XYZ and ABC just because ugma-ugma and
> tickle-tickle require them and you "could save some space" (TM).
> In the same vein, features could declare which components are
> build-time and which are run-time and this would make it
> straightforward to separate the sides.

This could be neat, although it would be difficult to test and keep
working, and would only increase our support matrix. You could do it
with Jinja, although it might get a bit unwieldy.
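
One possible shape for a "feature", sketched in Python under my reading of
the proposal. Every feature name, package name and availability entry
below is invented for illustration and does not describe real Kolla
images.

```python
# Hypothetical feature table keyed by (distro, arch); all names are made up.
FEATURES = {
    "foo-backend": {
        ("centos", "x86_64"): {"run": ["libfoo"], "build": ["libfoo-devel"]},
        ("debian", "x86_64"): {"run": ["libfoo1"], "build": ["libfoo-dev"]},
        ("debian", "aarch64"): {"run": ["libfoo1"], "build": ["libfoo-dev"]},
    },
    "bar-driver": {
        ("centos", "x86_64"): {"run": ["bar-tools"], "build": []},
    },
}


def resolve(enabled, distro, arch):
    """Return (build_pkgs, run_pkgs); raise if a feature is unavailable."""
    target = (distro, arch)
    build, run = set(), set()
    for name in enabled:
        spec = FEATURES[name].get(target)
        if spec is None:
            raise ValueError(
                "feature %r is not available on %s/%s" % (name, distro, arch)
            )
        build.update(spec["build"])
        run.update(spec["run"])
    return sorted(build), sorted(run)


print(resolve(["foo-backend"], "debian", "aarch64"))
```

Even in this toy form, the support matrix concern is visible: every
(feature, distro, arch) cell is something that would need testing.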

> The above effort could well be coordinated with different projects to
> reuse bindep contents. So far Kolla does not use bindep because it
> often installs too much and not enough at the same time.
> Do note it would still be bindep-less for external services.

I like the idea of using bindep for source images, in combination with
multi-stage builds. If we could reduce the length of our package
dependency lists by relying on bindep, that would be a nice win.
Ideally, between distro package dependencies and bindep, we would need
to specify few additional packages.
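
A rough sketch of how bindep entries could feed a multi-stage build,
splitting build-time from run-time packages. It handles only simple
`name [selectors]` lines and treats the `compile` profile as build-time;
real bindep syntax is richer (negation, grouping, version constraints),
and the sample file contents are made up.

```python
import re

# Simplified line format: "name" optionally followed by "[sel sel ...]".
LINE = re.compile(r"^(?P<name>\S+)(?:\s+\[(?P<selectors>[^\]]*)\])?\s*$")


def split_bindep(text, platform):
    """Return (build_pkgs, run_pkgs) for one platform, e.g. 'rpm' or 'dpkg'."""
    build, run = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            continue  # syntax this sketch does not handle
        selectors = (match.group("selectors") or "").split()
        platforms = [s for s in selectors if s.startswith("platform:")]
        profiles = [s for s in selectors if not s.startswith("platform:")]
        if platforms and "platform:" + platform not in platforms:
            continue
        if "compile" in profiles:
            build.append(match.group("name"))
        elif not profiles:
            run.append(match.group("name"))
    return build, run


# Made-up sample; not taken from any real project's bindep.txt.
SAMPLE = """\
gcc [compile]
libffi-devel [platform:rpm compile]
libffi-dev [platform:dpkg compile]
libffi [platform:rpm]
"""

print(split_bindep(SAMPLE, "rpm"))
```

The build list would go into the builder stage and the run list into the
final stage, leaving Kolla to specify only what bindep misses.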

> There would still need to exist a general mechanism for providing
> custom command executions required by some images.
> For contributors and cores this new approach would bring more sanity
> as to the scope of proposed changes.
> Also, it would be possible to get quick insight into feature support
> and autogenerate docs for that as well.
> Similarly, current concept of unbuildable images would no longer be
> required because unbuildability would be dictated by lack of support
> for a required feature.

I'm not sure what you mean by 'feature' here, but I'd say the most
common reason an image is unbuildable currently is that the main
package required by the image is unavailable, rather than some
ancillary package required for a particular feature. Modifying
unbuildable images to include unavailable features could be
interesting, but would be an extension of current behaviour.

> ***
> Looking forward to your opinions/thoughts.

I think there are some good ideas in here, but for me the cost/benefit
of the core proposal, replacing Jinja with Python, doesn't add up. The
cost would be high: a significant rewrite of every image, additional
tooling to build, and operator headaches in switching to the new
customisation model. The benefits seem to be mostly for contributors
rather than users.

I'd like us to explore some of the pain points raised here, and also
see if we can determine any others through the kolla klub.

* Can we reduce the image size? Are we installing unnecessary
packages? (spoiler: yes) Can we use multi-stage builds?
* Can we improve testing, to make it clearer what the effects of
changing a particular image would be?
* What would a 'feature' look like with our current tooling? I'd like
to see a concrete example.
* What can we learn from the proposed goal [3] to add container images
for each project? Could kolla be used in a more distributed manner
more amenable to a CI/CD pattern where each project publishes its own
images? Could we add a 'python' base distro that is based on the
python:3-slim image? I realise the authors of the goal won't like
this, but it does add some missing flexibility to their proposal.

[1] https://review.opendev.org/#/c/631647/
[2] https://testinfra.readthedocs.io/en/latest/
[3] https://review.opendev.org/#/c/720107/

> ***
> [1] https://docs.openstack.org/kolla/train/support_matrix.html
> [2] https://docs.openstack.org/kolla/train/admin/image-building.html
> -yoctozepto
