[TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince dprince at redhat.com
Wed Nov 28 14:25:13 UTC 2018


On Wed, 2018-11-28 at 15:12 +0100, Bogdan Dobrelya wrote:
> On 11/28/18 2:58 PM, Dan Prince wrote:
> > On Wed, 2018-11-28 at 12:45 +0100, Bogdan Dobrelya wrote:
> > > To follow up and explain the patches for code review:
> > > 
> > > The "header" patch https://review.openstack.org/620310 ->
> > > (requires)
> > > https://review.rdoproject.org/r/#/c/17534/, and also
> > > https://review.openstack.org/620061 -> (which in turn requires)
> > > https://review.openstack.org/619744 -> (Kolla change, the 1st to
> > > go)
> > > https://review.openstack.org/619736
> > 
> > This email was cross-posted to multiple lists, and I think we may
> > have lost some of the context in the process as the subject was
> > changed.
> > 
> > Most of the suggestions and patches are about making our base
> > container(s) smaller in size, and the means by which the patches do
> > that is to share binaries/applications across containers with
> > custom mounts/volumes. I've -2'd most of them. What concerns me,
> > however, is that some of the TripleO cores seemed open to this idea
> > yesterday on IRC. Perhaps I've misread things, but what you appear
> > to be doing here is quite drastic. I think we need to consider all
> > of this carefully before proceeding with any of it.
> > 
> > 
> > > Please also read the commit messages; I tried to explain all the
> > > "whys" very carefully. Just to sum it up here as well:
> > > The current self-contained (config plus runtime bits)
> > > architecture of the containers badly affects:
> > > 
> > > * the size of the base layer and of all container images, adding
> > >     an extra 300MB (roughly 30% of the size).
> > 
> > You are accomplishing this by removing Puppet from the base
> > container, but you are also creating another container in the
> > process. Puppet would still be required on all nodes since it is
> > our config tool, so you would still be downloading some of this
> > data anyway. I understand your reason for doing this is to avoid
> > rebuilding all containers when there is a change to any of these
> > packages in the base container. What you are missing, however, is
> > how often Puppet is updated when something else in the base
> > container isn't.
> 
> For CI jobs that update all containers, it is quite common to have
> changes in the openstack/tripleo puppet modules to pull in. IIUC,
> that automatically picks up any updates for all of its dependencies,
> and for the dependencies of dependencies, all multiplied by the
> hundred or so containers that need updating. That is a *pain* we are
> used to these days, with CI jobs quite often timing out... Of
> course, the main cause is delayed promotions though.

Regarding CI, I made a separate suggestion on that below: rebuilding
the base layer more often could be a good solution here. I don't
think the puppet-tripleo package is that large, however, so we could
just live with it.

> 
> For real deployments, I have no data on the cadence of minor updates
> for puppet and the tripleo & openstack modules it pulls in, so let's
> ask operators (as we now happen to be on the merged
> openstack-discuss list)? For its dependencies though, like systemd
> and ruby, I'm pretty sure CVEs are fixed there quite often. So I
> expect that delivering security fixes for those "in the field" might
> bring some unwanted hassle for the long-term maintenance of LTS
> releases. As Tengu noted on IRC:
> "well, between systemd, puppet and ruby, there are many security
> concerns, almost every month... and also, what's the point of
> keeping them in runtime containers when they are useless?"

Reiterating previous points:

- I'd be fine removing systemd. But let's do it properly and not via
  'rpm -ev --nodeps'.
- Puppet and Ruby *are* required for configuration. We can certainly
  put them in a separate container outside of the runtime service
  containers, but doing so would actually cost you much more
  space/bandwidth for each service container. As both of these have
  to get downloaded to each node anyway in order to generate config
  files with our current mechanisms, I'm not sure this buys you
  anything.
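
If we want hard numbers, a quick way to gauge what the config-time
packages actually add to an image is to query the installed RPM sizes
directly. A rough sketch, run inside the base container (the package
names are illustrative and may differ):

    # Sum the installed sizes of the config-time packages under
    # discussion:
    rpm -q --qf '%{NAME} %{SIZE}\n' puppet ruby rubygems | \
        awk '{ total += $2; print } END { print "total bytes:", total }'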

We are going in circles here I think....

Dan

> 
> > I would wager that it is more rare than you'd think. Perhaps
> > looking at the history of an OpenStack distribution would be a
> > valid way to assess this more critically. Without data to back up
> > the numbers, I'm afraid what you are doing here falls into
> > "premature optimization" territory for me, and I don't think the
> > means used in the patches warrant the benefits you mention here.
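
For example, one low-effort way to gather that data would be to
compare the changelog cadence of the config-time packages against the
rest of the base image. A sketch, run inside the base container (the
package list is just an example):

    # Show the most recent changelog entries per package to gauge how
    # often each one actually changes:
    for pkg in puppet ruby systemd openssl; do
        echo "== $pkg =="
        rpm -q --changelog "$pkg" | grep '^\*' | head -3
    done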
> > 
> > 
> > > * Edge cases, where we have container images to be distributed,
> > >     at least once to reach local registries, over high-latency,
> > >     limited-bandwidth, highly unreliable WAN connections.
> > > * the number of packages to update in CI for all containers for
> > >     all services (CI jobs do not rebuild containers, so each
> > >     container gets updated with those 300MB of extra size).
> > 
> > It would seem to me there are other ways to solve the CI container
> > update problem. Rebuilding the base layer more often would solve
> > this, right? If we always build our service containers off of a
> > recent base layer, there should be no updates to the system/puppet
> > packages there in our CI pipelines.
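
As a rough sketch of that workflow (the image layout and names here
are hypothetical, not our actual build tooling):

    # Rebuild the base image from scratch so it picks up all current
    # package updates, then rebuild the service images on top of it;
    # with a fresh base there are no pending updates in the service
    # layers.
    docker build --no-cache -t tripleo-base:latest base/
    for svc in nova-compute neutron-agent keystone; do
        docker build -t "tripleo-$svc:latest" "$svc/"
    done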
> > 
> > > * security and the attack surface, by introducing systemd et al
> > >     as additional subjects for CVE fixes to maintain for all
> > >     containers.
> > 
> > We aren't actually using systemd within our containers. I think
> > those packages are getting pulled in by an RPM dependency
> > elsewhere. So rather than using 'rpm -ev --nodeps' to remove it,
> > we could create a sub-package for containers in those cases and
> > install that instead. In short, rather than hacking things out,
> > why not pursue a proper packaging fix?
> > 
> > In general I am a fan of getting things out of the base container
> > that we don't need... so yeah, let's do this. But let's do it
> > properly.
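
A first step toward that proper fix would be finding out exactly
which installed packages drag systemd into the image, for example:

    # Run inside the base container: list the installed packages that
    # require systemd, i.e. where a container-oriented sub-package or
    # a weakened dependency would be needed.
    rpm -q --whatrequires systemd
    # or, via dnf:
    dnf repoquery --installed --whatrequires systemd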
> > 
> > > * services' uptime, due to additional restarts of services
> > >     caused by security maintenance of components irrelevant to
> > >     openstack, sitting as dead weight in container images
> > >     forever.
> > 
> > Like I said above, how often is it that these packages actually
> > change when something else in the base container doesn't? Perhaps
> > we should get more data here before blindly implementing a
> > solution we aren't sure really helps in the real world.
> > 
> > > On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
> > > > Changing the topic to follow the subject.
> > > > 
> > > > [tl;dr] it's time to rearchitect container images to stop
> > > > including config-time-only bits (puppet et al), which are not
> > > > needed at runtime and pose security issues, like CVEs, to
> > > > maintain daily.
> > > > 
> > > > Background: 1) For the Distributed Compute Node edge case,
> > > > there are potentially tens of thousands of single-compute-node
> > > > remote edge sites connected over WAN to a single control
> > > > plane, with high latency (100ms or so) and limited bandwidth.
> > > > 2) For a generic security case,
> > > > 3) TripleO CI updates all
> > > > 
> > > > Challenge:
> > > > 
> > > > > Here is a related bug [0] and implementation [1] for that.
> > > > > PTAL folks!
> > > > > 
> > > > > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > > > > [1]
> > > > > https://review.openstack.org/#/q/topic:base-container-reduction
> > > > > 
> > > > > > Let's also think of removing puppet-tripleo from the base
> > > > > > container. It really brings the world in (and yum updates
> > > > > > in CI!) for each job and each container!
> > > > > > So if we did so, we should then either install
> > > > > > puppet-tripleo and co on the host and bind-mount it for
> > > > > > the docker-puppet deployment task steps (bad idea IMO), OR
> > > > > > use the magical --volumes-from <a-side-car-container>
> > > > > > option to mount volumes from some "puppet-config" sidecar
> > > > > > container inside each of the containers being launched by
> > > > > > the docker-puppet tooling.
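
For clarity, a minimal sketch of that sidecar idea (the image names
and paths are hypothetical, not actual TripleO tooling):

    # A "puppet-config" container owns the puppet/ruby bits and the
    # generated configs, exposed through its volumes:
    docker create --name puppet-config -v /var/lib/config-data \
        example/puppet-config:latest
    # Each service container would then mount those volumes instead
    # of carrying puppet itself:
    docker run --rm --volumes-from puppet-config \
        example/nova-compute:latest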
> > > > > 
> > > > > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > > > > redhat.com>
> > > > > wrote:
> > > > > > We add this to all images:
> > > > > > 
> > > > > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > > > > > 
> > > > > > 
> > > > > > /bin/sh -c yum -y install iproute iscsi-initiator-utils
> > > > > > lvm2 python socat sudo which
> > > > > > openstack-tripleo-common-container-base rsync cronie
> > > > > > crudini openstack-selinux ansible python-shade
> > > > > > puppet-tripleo python2-kubernetes && yum clean all &&
> > > > > > rm -rf /var/cache/yum
> > > > > > 
> > > > > > That layer is 276 MB. Is the additional 276 MB reasonable
> > > > > > here?
> > > > > > openstack-selinux <- this package runs relabeling; does
> > > > > > that kind of touching of the filesystem impact the size
> > > > > > due to docker layers?
> > > > > > 
> > > > > > Also: python2-kubernetes is a fairly large package
> > > > > > (18007990 bytes). Do we use that in every image? I don't
> > > > > > see any tripleo-related repos importing from it when
> > > > > > searching on Hound. The original commit message[1] adding
> > > > > > it states it is for future convenience.
> > > > > > 
> > > > > > On my undercloud we have 101 images; if we are downloading
> > > > > > an extra 18 MB per image, that's almost 1.8 GB for a
> > > > > > package we don't use? (I hope it's not like this? With
> > > > > > docker layers, we only download that 276 MB transaction
> > > > > > once? Or?)
> > > > > > 
> > > > > > 
> > > > > > [1] https://review.openstack.org/527927
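
On the layer question: docker layers are content-addressed and
shared, so a common base layer is downloaded once per host no matter
how many images reference it. That is easy to verify (the image names
below are placeholders):

    # List the layers (and their sizes) that make up an image:
    docker history tripleo-base:latest
    # Compare layer digests across two images; shared base layers
    # show identical IDs and are only pulled once per host:
    docker inspect --format '{{.RootFS.Layers}}' image-a image-b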
> > > > > 
> > > > > -- 
> > > > > Best regards,
> > > > > Bogdan Dobrelya,
> > > > > Irc #bogdando