[TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes
bdobreli at redhat.com
Tue Nov 27 15:24:44 UTC 2018
Changing the topic to follow the subject.
[tl;dr] it's time to rearchitect container images to stop including
config-time-only (puppet et al) bits, which are not needed at runtime
and pose security issues, like CVEs, to maintain daily.
1) For the Distributed Compute Node edge case, there are potentially
tens of thousands of single-compute-node remote edge sites connected
over a WAN to a single control plane, with high latency, like 100ms or
so, and limited bandwidth. Reducing the base layer size becomes a
decent goal there. See the security background below.
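To put rough numbers on that, here is a back-of-envelope sketch. All
figures are illustrative assumptions, not measurements: the 276 MB is
the layer size quoted later in the thread; the site count and per-site
bandwidth are invented for the sake of the arithmetic:

```shell
# Back-of-envelope cost of re-fetching a base layer at the edge.
# All numbers are assumptions for illustration only.
base_layer_mb=276        # size of the layer to re-fetch (quoted below)
sites=10000              # "tens of thousands" of edge sites (assumed)
bandwidth_mbit_s=10      # assumed per-site WAN uplink

total_gb=$(( base_layer_mb * sites / 1024 ))
per_site_s=$(( base_layer_mb * 8 / bandwidth_mbit_s ))
echo "~${total_gb} GB pushed in total, ~$(( per_site_s / 60 )) min per site"
```

Even under these made-up numbers, a single base-layer refresh is
gigabytes of WAN traffic in aggregate, which is the point of point 1.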
2) For a generic security (Day 2, maintenance) case: when
puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to be
updated, all layers on top rebuilt, all of those layers re-fetched by
cloud hosts, and all containers restarted... And all of that because of
fixes that have nothing to do with OpenStack. The same goes for the
remote edge sites; remember the "tens of thousands", high latency and
limited bandwidth?..
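The layer-invalidation mechanics behind point 2 can be sketched with a
toy model: content-addressed layer ids simplified to a sha256 over the
parent id plus the layer's own content (the package names here are
illustrative, not the real image contents):

```shell
# Toy illustration of layer invalidation: each layer's id depends on
# its parent, so a change in the base changes every descendant id and
# forces a re-fetch, even if the descendant's own content is unchanged.
base_v1=$(echo "base: puppet ruby systemd" | sha256sum | cut -c1-12)
base_v2=$(echo "base: puppet ruby systemd + CVE fix" | sha256sum | cut -c1-12)

layer_v1=$(echo "${base_v1} + nova-compute" | sha256sum | cut -c1-12)
layer_v2=$(echo "${base_v2} + nova-compute" | sha256sum | cut -c1-12)

echo "nova layer before: ${layer_v1}, after: ${layer_v2}"
[ "$layer_v1" != "$layer_v2" ] && echo "nova layer must be re-fetched"
```

This is a simplified model of how image layers are addressed, but it
captures why an unrelated CVE fix in the base ripples through every
image built on top of it.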
3) TripleO CI updates (including puppet*) packages in containers, not
in a common base layer of those. So each CI job has to update puppet*
and its dependencies, ruby/systemd as well. Reducing the number of
packages to update for each container makes sense for CI too.
The WIP patches, up for early review, use a config "pod" approach and
do not require maintaining two sets of config vs runtime images. Future
work: a) cronie requires systemd; we'd want to move that off the base
layer as well. b) rework docker-puppet.py to use podman pods instead of
--volumes-from a sidecar container (that can't be backported to Queens
then, though having support there for the Edge DCN case would still be
nice, at least downstream only perhaps).
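As a rough illustration of the two variants (image and container names
here are hypothetical placeholders, not the actual TripleO ones), the
sidecar approach and its podman-pods rework could look like:

```shell
# Sidecar variant: a stopped "config" container holds the puppet bits;
# service containers mount them read-only at config time only.
# All names are illustrative.
docker create --name puppet-config \
  -v /usr/share/openstack-puppet/modules \
  tripleo-puppet-config-image /bin/true

docker run --rm --volumes-from puppet-config:ro \
  openstack-nova-compute-image puppet apply /etc/config.pp

# Podman-pods variant: group the config sidecar and the service
# container into one pod, sharing volumes and namespaces.
podman pod create --name nova-config-pod
podman run --pod nova-config-pod --rm tripleo-puppet-config-image /bin/true
```

Either way, the puppet/ruby stack lives in one config image instead of
being baked into every service image's base layer.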
Some questions raised on IRC:
Q: does having a service be able to configure itself really need to
involve a separate pod?
A: Highly likely yes; removing not-runtime things is a good idea, and
pods are an established PaaS paradigm already. That will require some
changes in the architecture though (see the topic with WIP patches).
Q: that's (fetching a config container) actually more data that about to
A: It's not, if you think of Day 2, when you have to re-fetch the base
layer and top layers whenever some OpenStack-unrelated CVEs get fixed
there for ruby/puppet/systemd. Avoiding the need to restart service
containers because of those minor updates pushed is also a nice thing.
Q: the best solution here would be using packages on the host,
generating the config files on the host, and then having an all-in-one
container for all the services which lets them run in an isolated manner.
A: I think for Edge cases that's a no-go, as we might want to consider
tiny low-footprint OS distros like the formerly-known Container Linux
or Atomic. Also, an all-in-one container looks like an anti-pattern
from the world of VMs.
> Here is a related bug and implementation for that. PTAL folks!
>  https://bugs.launchpad.net/tripleo/+bug/1804822
>  https://review.openstack.org/#/q/topic:base-container-reduction
>> Let's also think of removing puppet-tripleo from the base container.
>> It really brings the world in (and yum updates in CI!) for each job and each
>> So if we did so, we should then either install puppet-tripleo and co on
>> the host and bind-mount it for the docker-puppet deployment task steps
>> (bad idea IMO), OR use the magical --volumes-from <a-side-car-container>
>> option to mount volumes from some "puppet-config" sidecar container
>> inside each of the containers being launched by docker-puppet tooling.
> On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at redhat.com>
>> We add this to all images:
>> /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2 python
>> socat sudo which openstack-tripleo-common-container-base rsync cronie
>> crudini openstack-selinux ansible python-shade puppet-tripleo python2-
>> kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
>> Is the additional 276 MB reasonable here?
>> openstack-selinux <- This package runs relabeling; does that kind of
>> touching the filesystem impact the size due to docker layers?
>> Also: python2-kubernetes is a fairly large package (18007990 bytes); do
>> we use that in every image? I don't see any tripleo-related repos
>> importing from it when searching on Hound? The original commit message
>> adding it states it is for future convenience.
>> On my undercloud we have 101 images; if we are downloading an extra 18 MB
>> per image, that's almost 1.8 GB for a package we don't use. (I hope it's
>> not like this? With docker layers, we only download that 276 MB
>> transaction once? Or?)
>>  https://review.openstack.org/527927
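A quick sanity check of the arithmetic in that worry, using the quoted
numbers (101 images, 18 MB package):

```shell
# Worst case if the package layer were NOT shared between images
# (numbers taken from the quoted mail; purely arithmetic).
images=101
pkg_mb=18
echo "worst case: $(( images * pkg_mb )) MB re-downloaded"
# With docker's content-addressed layers, a layer produced by one
# shared yum transaction is stored and fetched once, so the real cost
# should be the ~276 MB transaction, not ~1.8 GB. Running
# `docker history <image>` on sibling images can confirm they report
# identical layer ids for the shared layer.
```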
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando