[openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince dprince at redhat.com
Wed Nov 28 13:58:45 UTC 2018


On Wed, 2018-11-28 at 12:45 +0100, Bogdan Dobrelya wrote:
> To follow up and explain the patches for code review:
> 
> The "header" patch https://review.openstack.org/620310 -> (requires) 
> https://review.rdoproject.org/r/#/c/17534/, and also 
> https://review.openstack.org/620061 -> (which in turn requires)
> https://review.openstack.org/619744 -> (Kolla change, the 1st to go)
> https://review.openstack.org/619736

This email was cross-posted to multiple lists and I think we may have
lost some of the context in the process as the subject was changed.

Most of the suggestions and patches are about making our base
container(s) smaller in size. And the means by which the patches do
that is to share binaries/applications across containers with custom
mounts/volumes. I've -2'd most of them. What concerns me however is
that some of the TripleO cores seemed open to this idea yesterday on
IRC. Perhaps I've misread things but what you appear to be doing here
is quite drastic I think we need to consider any of this carefully
before proceeding with any of it.


> 
> Please also read the commit messages, I tried to explain all "Whys"
> very 
> carefully. Just to sum up it here as well:
> 
> The current self-containing (config and runtime bits) architecture
> of 
> containers badly affects:
> 
> * the size of the base layer and all containers images as an
>    additional 300MB (adds an extra 30% of size).

You are accomplishing this by removing Puppet from the base container,
but you are also creating another container in the process. This would
still be required on all nodes as Puppet is our config tool. So you
would still be downloading some of this data anyways. Understood your
reasons for doing this are that it avoids rebuilding all containers
when there is a change to any of these packages in the base container.
What you are missing however is how often is it the case that Puppet is
updated that something else in the base container isn't?

I would wager that it is more rare than you'd think. Perhaps looking at
the history of an OpenStack distribution would be a valid way to assess
this more critically. Without this data to backup the numbers I'm
afraid what you are doing here falls into "pre-optimization" territory
for me and I don't think the means used in the patches warrent the
benefits you mention here.


> * Edge cases, where we have containers images to be distributed, at
>    least once to hit local registries, over high-latency and limited
>    bandwith, highly unreliable WAN connections.
> * numbers of packages to update in CI for all containers for all
>    services (CI jobs do not rebuild containers so each container gets
>    updated for those 300MB of extra size).

It would seem to me there are other ways to solve the CI containers
update problems. Rebuilding the base layer more often would solve this
right? If we always build our service containers off of a base layer
that is recent there should be no updates to the system/puppet packages
there in our CI pipelines.

> * security and the surface of attacks, by introducing systemd et al
> as
>    additional subjects for CVE fixes to maintain for all containers.

We aren't actually using systemd within our containers. I think those
packages are getting pulled in by an RPM dependency elsewhere. So
rather than using 'rpm -ev --nodeps' to remove it we could create a
sub-package for containers in those cases and install it instead. In
short rather than hack this to remove them why not pursue a proper
packaging fix?

In general I am a fan of getting things out of the base container we
don't need... so yeah lets do this. But lets do it properly.

> * services uptime, by additional restarts of services related to
>    security maintanence of irrelevant to openstack components sitting
>    as a dead weight in containers images for ever.

Like I said above how often is it that these packages actually change
where something else in the base container doesn't? Perhaps we should
get more data here before blindly implementing a solution we aren't
sure really helps out in the real world.

> 
> On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
> > Changing the topic to follow the subject.
> > 
> > [tl;dr] it's time to rearchitect container images to stop
> > incluiding 
> > config-time only (puppet et al) bits, which are not needed runtime
> > and 
> > pose security issues, like CVEs, to maintain daily.
> > 
> > Background: 1) For the Distributed Compute Node edge case, there
> > is 
> > potentially tens of thousands of a single-compute-node remote edge
> > sites 
> > connected over WAN to a single control plane, which is having high 
> > latency, like a 100ms or so, and limited bandwith.
> > 2) For a generic security case,
> > 3) TripleO CI updates all
> > 
> > Challenge:
> > 
> > > Here is a related bug [1] and implementation [1] for that. PTAL
> > > folks!
> > > 
> > > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > > [1] 
> > > https://review.openstack.org/#/q/topic:base-container-reduction
> > > 
> > > > Let's also think of removing puppet-tripleo from the base
> > > > container.
> > > > It really brings the world-in (and yum updates in CI!) each job
> > > > and 
> > > > each container!
> > > > So if we did so, we should then either install puppet-tripleo
> > > > and co 
> > > > on the host and bind-mount it for the docker-puppet deployment
> > > > task 
> > > > steps (bad idea IMO), OR use the magical --volumes-from 
> > > > <a-side-car-container> option to mount volumes from some 
> > > > "puppet-config" sidecar container inside each of the containers
> > > > being 
> > > > launched by docker-puppet tooling.
> > > 
> > > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > > redhat.com> 
> > > wrote:
> > > > We add this to all images:
> > > > 
> > > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35 
> > > > 
> > > > 
> > > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > > python
> > > > socat sudo which openstack-tripleo-common-container-base rsync
> > > > cronie
> > > > crudini openstack-selinux ansible python-shade puppet-tripleo
> > > > python2-
> > > > kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
> > > > Is the additional 276 MB reasonable here?
> > > > openstack-selinux <- This package run relabling, does that kind
> > > > of
> > > > touching the filesystem impact the size due to docker layers?
> > > > 
> > > > Also: python2-kubernetes is a fairly large package (18007990)
> > > > do we use
> > > > that in every image? I don't see any tripleo related repos
> > > > importing
> > > > from that when searching on Hound? The original commit
> > > > message[1]
> > > > adding it states it is for future convenience.
> > > > 
> > > > On my undercloud we have 101 images, if we are downloading
> > > > every 18 MB
> > > > per image thats almost 1.8 GB for a package we don't use? (I
> > > > hope it's
> > > > not like this? With docker layers, we only download that 276 MB
> > > > transaction once? Or?)
> > > > 
> > > > 
> > > > [1] https://review.openstack.org/527927
> > > 
> > > 
> > > -- 
> > > Best regards,
> > > Bogdan Dobrelya,
> > > Irc #bogdando
> 
> 




More information about the OpenStack-dev mailing list