[openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes
Dan Prince
dprince at redhat.com
Tue Nov 27 18:10:50 UTC 2018
On Tue, 2018-11-27 at 16:24 +0100, Bogdan Dobrelya wrote:
> Changing the topic to follow the subject.
>
> [tl;dr] it's time to rearchitect container images to stop including
> config-time only (puppet et al) bits, which are not needed at runtime
> and pose security issues, like CVEs, to maintain daily.
I think your assertion that we need to rearchitect the config images to
contain the puppet bits is incorrect here.
After reviewing the patches you linked to below it appears that you are
proposing we use --volumes-from to bind mount application binaries from
one container into another. I don't believe this is a good pattern for
containers. On baremetal if we followed the same pattern it would be
like using an /nfs share to obtain access to binaries across the
network to optimize local storage. Now... some people do this (like
maybe high performance computing would launch an MPI job like this) but
I don't think we should consider it best practice for our containers in
TripleO.
Each container should contain its own binaries and libraries as much
as possible. And while I do think we should be using --volumes-from
more often in TripleO it would be for sharing *data* between
containers, not binaries.
>
> Background:
> 1) For the Distributed Compute Node edge case, there are potentially
> tens of thousands of single-compute-node remote edge sites connected
> over WAN to a single control plane, with high latency, like 100ms or
> so, and limited bandwidth. Reducing the base layer size becomes a
> decent goal there. See the security background below.
The reason we put Puppet into the base layer was in fact to prevent it
from being downloaded multiple times. If we were to re-architect the
image layers such that the child layers all contained their own copies
of Puppet for example there would actually be a net increase in
bandwidth and disk usage. So I would argue we are already addressing
the goal of optimizing network and disk space.
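
To put rough numbers on the layer-sharing argument, here is a minimal
sketch. It assumes a simplified model where an image is just a list of
layer IDs and a host fetches each distinct layer once. The function
pull_size and the layer names are hypothetical; the image count (101)
and package size (18007990, assumed to be bytes) are the figures quoted
further down this thread, used only as stand-ins.

```python
# Hypothetical model: an image is a list of layer IDs, and a host with an
# empty cache downloads each distinct layer exactly once.

def pull_size(images, layer_sizes):
    """Total bytes fetched when pulling every image onto an empty host."""
    needed = set()
    for layers in images.values():
        needed.update(layers)
    return sum(layer_sizes[layer] for layer in needed)

NUM_IMAGES = 101          # image count quoted in the thread
PACKAGE_BYTES = 18007990  # package size quoted in the thread (assumed bytes)

# Case 1: the package lives in one base layer shared by every image.
shared_sizes = {"base": PACKAGE_BYTES}
shared_images = {f"svc{i}": ["base"] for i in range(NUM_IMAGES)}

# Case 2: every image carries its own private copy of the package.
dup_sizes = {f"pkg{i}": PACKAGE_BYTES for i in range(NUM_IMAGES)}
dup_images = {f"svc{i}": [f"pkg{i}"] for i in range(NUM_IMAGES)}

shared_total = pull_size(shared_images, shared_sizes)
dup_total = pull_size(dup_images, dup_sizes)
print(shared_total, dup_total)
```

Under these assumptions the shared layer is fetched once, while the
duplicated layout fetches the same bytes once per image.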
Moving it out of the base layer so that you can patch it more often
without disrupting other services is a valid concern. But addressing
this concern while also preserving our definition of a container (see
above, a container should contain all of its binaries) is going to cost
you something, namely disk and network space because Puppet would need
to be duplicated in each child container.
As Puppet is used to configure a majority of the services in TripleO
having it in the base container makes most sense. And yes, if there are
security patches for Puppet/Ruby those might result in a bunch of
containers getting pushed. But let Docker layers take care of this I
think... Don't try to solve things by constructing your own custom
mounts and volumes to work around the issue.
> 2) For a generic security (Day 2, maintenance) case, when
> puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to be
> updated, all layers on top have to be rebuilt, all of those layers
> have to be re-fetched by cloud hosts, and all containers have to be
> restarted... And all of that because of some fixes that have nothing
> to do with OpenStack. By the remote edge sites as well; remember the
> "tens of thousands", high latency and limited bandwidth?..
> 3) TripleO CI updates packages (including puppet*) in containers, not
> in a common base layer of those. So each CI job has to update puppet*
> and its dependencies, ruby/systemd, as well. Reducing the number of
> packages to update for each container makes sense for CI as well.
>
> Implementation related:
>
> WIP patches [0],[1] for early review use a config "pod" approach and
> do not require maintaining two sets of config vs runtime images.
> Future work: a) cronie requires systemd, we'd want to fix that also
> off the base layer. b) rework to podman pods for docker-puppet.py
> instead of --volumes-from a side car container (can't be backported
> to Queens then, which would still be nice to support for the Edge DCN
> case, at least downstream only perhaps).
>
> Some questions raised on IRC:
>
> Q: does having a service be able to configure itself really need to
> involve a separate pod?
> A: Highly likely yes, removing not-runtime things is a good idea and
> pods are an established PaaS paradigm already. That will require some
> changes in the architecture though (see the topic with WIP patches).
I'm a little confused on this one. Are you suggesting that we have 2
containers for each service? One with Puppet and one without?
That is certainly possible, but to pull it off would likely require you
to have things built like this:
|base container| --> |service container| --> |service container w/
Puppet installed|
The end result would be Puppet being duplicated in a layer for each
service's "config image". Very inefficient.
Again, I'm answering this assuming we aren't violating our container
constraints and best practices where each container has the binaries
it needs to do its own configuration.
>
> Q: that's (fetching a config container) actually more data than we'd
> download otherwise
> A: It's not, if thinking of Day 2, when we have to re-fetch the base
> layer and top layers after some CVEs unrelated to OpenStack got fixed
> there for ruby/puppet/systemd. Avoiding the need to restart service
> containers because of those minor updates being pushed is also a nice
> thing.
Puppet is used only for configuration in TripleO. While security issues
do need to be addressed at any layer I'm not sure there would be an
urgency to re-deploy your cluster simply for a Puppet security fix
alone. Smart change management would help eliminate blindly deploying
new containers in the case where they provide very little security
benefit.
I think the focus on Puppet, and Ruby here is perhaps a bad example as
they are config time only. Rather than just think about them we should
also consider the rest of the things in our base container images as
well. This is always going to be a "balancing act". There are pros and
cons of having things in the base layer vs. the child/leaf layers.
>
> Q: the best solution here would be using packages on the host,
> generating the config files on the host. And then having an
> all-in-one container for all the services which lets them run in an
> isolated manner.
> A: I think for Edge cases, that's a no go as we might want to consider
> tiny low footprint OS distros like the formerly known Container Linux
> or Atomic. Also, an all-in-one container looks like an anti-pattern
> from the world of VMs.
This was suggested on IRC because it likely gives you the smallest
network/storage footprint for each edge node. The container would get
used for everything: running all the services, and configuring all the
services. Sort of a golden image approach. It may be an anti-pattern
but initially I thought you were looking to optimize these things.
I think a better solution might be to have container registries, or
container mirrors (reverse proxies or whatever) that allow you to cache
things as you deploy to the edge and thus optimize the network traffic.
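
As a rough illustration of why a caching mirror helps, here is a toy
sketch of a pull-through cache. The PullThroughCache class, the layer
names, the sizes, and the site count are all hypothetical and do not
correspond to any real registry API; the only point is that bytes cross
the WAN once per layer rather than once per site.

```python
class PullThroughCache:
    """Toy mirror: fetches a layer upstream on first request, then serves
    it from its local cache."""

    def __init__(self, upstream_sizes):
        self.upstream_sizes = upstream_sizes  # layer ID -> size in MB
        self.cached = set()
        self.upstream_mb = 0  # traffic pulled over the WAN so far

    def fetch(self, layer):
        if layer not in self.cached:
            # Cache miss: this is the only time the WAN link is used.
            self.upstream_mb += self.upstream_sizes[layer]
            self.cached.add(layer)
        return self.upstream_sizes[layer]

# Hypothetical layer sizes in MB and a hypothetical number of edge sites.
sizes = {"base": 300, "nova": 50}
mirror = PullThroughCache(sizes)

served_mb = 0
for _site in range(10):
    for layer in ("base", "nova"):
        served_mb += mirror.fetch(layer)

print(mirror.upstream_mb, served_mb)
```

In this toy setup the mirror serves ten sites but pulls each layer from
upstream only once.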
>
> [0] https://review.openstack.org/#/q/topic:base-container-reduction
> [1]
> https://review.rdoproject.org/r/#/q/topic:base-container-reduction
>
> > Here is a related bug [0] and implementation [1] for that. PTAL
> > folks!
> >
> > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > [1] https://review.openstack.org/#/q/topic:base-container-reduction
> >
> > > Let's also think of removing puppet-tripleo from the base
> > > container. It really brings the world in (and yum updates in CI!)
> > > to each job and each container!
> > > So if we did so, we should then either install puppet-tripleo and
> > > co on the host and bind-mount it for the docker-puppet deployment
> > > task steps (bad idea IMO), OR use the magical
> > > --volumes-from <a-side-car-container> option to mount volumes from
> > > some "puppet-config" sidecar container inside each of the
> > > containers being launched by docker-puppet tooling.
> >
> > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås
> > <hjensas at redhat.com> wrote:
> > > We add this to all images:
> > >
> > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > >
> > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > python socat sudo which openstack-tripleo-common-container-base
> > > rsync cronie crudini openstack-selinux ansible python-shade
> > > puppet-tripleo python2-kubernetes && yum clean all && rm -rf
> > > /var/cache/yum 276 MB
> > >
> > > Is the additional 276 MB reasonable here?
> > > openstack-selinux <- This package runs relabeling; does that kind
> > > of touching the filesystem impact the size due to docker layers?
> > >
> > > Also: python2-kubernetes is a fairly large package (18007990); do
> > > we use that in every image? I don't see any tripleo related repos
> > > importing from that when searching on Hound. The original commit
> > > message[1] adding it states it is for future convenience.
> > >
> > > On my undercloud we have 101 images; if we are downloading 18 MB
> > > for every image, that's almost 1.8 GB for a package we don't use?
> > > (I hope it's not like this? With docker layers, we only download
> > > that 276 MB transaction once? Or?)
> > >
> > >
> > > [1] https://review.openstack.org/527927
> >
> >
> > --
> > Best regards,
> > Bogdan Dobrelya,
> > Irc #bogdando
>
>