[TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya bdobreli at redhat.com
Wed Nov 28 14:12:08 UTC 2018


On 11/28/18 2:58 PM, Dan Prince wrote:
> On Wed, 2018-11-28 at 12:45 +0100, Bogdan Dobrelya wrote:
>> To follow up and explain the patches for code review:
>>
>> The "header" patch https://review.openstack.org/620310 -> (requires)
>> https://review.rdoproject.org/r/#/c/17534/, and also
>> https://review.openstack.org/620061 -> (which in turn requires)
>> https://review.openstack.org/619744 -> (Kolla change, the 1st to go)
>> https://review.openstack.org/619736
> 
> This email was cross-posted to multiple lists and I think we may have
> lost some of the context in the process as the subject was changed.
> 
> Most of the suggestions and patches are about making our base
> container(s) smaller in size, and the means by which the patches do
> that is to share binaries/applications across containers with custom
> mounts/volumes. I've -2'd most of them. What concerns me, however, is
> that some of the TripleO cores seemed open to this idea yesterday on
> IRC. Perhaps I've misread things, but what you appear to be doing here
> is quite drastic, and I think we need to consider it carefully before
> proceeding with any of it.
> 
> 
>>
>> Please also read the commit messages; I tried to explain all the
>> "whys" very carefully. Just to sum it up here as well:
>>
>> The current self-contained (config and runtime bits) architecture of
>> containers badly affects:
>>
>> * the size of the base layer and of all container images, as an
>>     additional 300MB (an extra 30% of size).
> 
> You are accomplishing this by removing Puppet from the base container,
> but you are also creating another container in the process. This would
> still be required on all nodes, as Puppet is our config tool, so you
> would still be downloading some of this data anyway. I understand your
> reasons for doing this are that it avoids rebuilding all containers
> when there is a change to any of these packages in the base container.
> What you are missing, however, is how often it is the case that Puppet
> is updated while something else in the base container isn't.

For CI jobs updating all containers, it is quite common to have changes 
in openstack/tripleo puppet modules to pull in. IIUC, that automatically 
picks up any updates for all of their dependencies, and for the 
dependencies of those dependencies, all of it multiplied by the roughly 
one hundred containers that need updating. That is a *pain* we are used 
to these days, in the form of CI jobs quite often timing out... Of 
course, the main cause is delayed promotions, though.

For real deployments, I have no data on the cadence of minor updates to 
puppet and to the tripleo & openstack puppet modules, so let's ask 
operators (as we now happen to be on the merged openstack-discuss 
list)? For its dependencies though, like systemd and ruby, I'm pretty 
sure CVEs are fixed there quite often. So I expect that delivering "in 
the field" security fixes for those might bring some unwanted hassle 
for the long-term maintenance of LTS releases. As Tengu noted on IRC:
"well, between systemd, puppet and ruby, there are many security 
concerns, almost every month... and also, what's the point of keeping 
them in runtime containers when they are useless?"
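
For whoever wants to gather that data, one rough way I can think of is
to list the pending security errata inside the base container. The
image name below is just a placeholder, and this assumes the configured
repos ship updateinfo metadata:

  # list the security advisories (CVE fixes) the base layer would pick
  # up on its next rebuild; run periodically, this gives the actual
  # cadence for systemd, ruby, puppet and friends
  docker run --rm <registry>/centos-binary-base:current-tripleo \
      yum -q updateinfo list security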

> 
> I would wager that it is more rare than you'd think. Perhaps looking at
> the history of an OpenStack distribution would be a valid way to assess
> this more critically. Without this data to back up the numbers, I'm
> afraid what you are doing here falls into "pre-optimization" territory
> for me, and I don't think the means used in the patches warrant the
> benefits you mention here.
> 
> 
>> * Edge cases, where we have container images to be distributed, at
>>     least once to hit local registries, over high-latency, limited-
>>     bandwidth, highly unreliable WAN connections.
>> * the number of packages to update in CI for all containers for all
>>     services (CI jobs do not rebuild containers, so each container gets
>>     updated for those 300MB of extra size).
> 
> It would seem to me there are other ways to solve the CI container
> update problem. Rebuilding the base layer more often would solve this,
> right? If we always build our service containers off of a base layer
> that is recent, there should be no updates to the system/puppet
> packages there in our CI pipelines.
> 
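
(Sketching the above suggestion with plain Dockerfiles, purely for
illustration -- this is not our actual kolla/tripleo build tooling, and
the image names are made up:)

  # base/Dockerfile -- rebuilt frequently so system/puppet packages are
  # fresh:
  #   FROM centos:7
  #   RUN yum -y update && yum clean all
  docker build -t local/tripleo-base:latest base/

  # each service Dockerfile then starts FROM local/tripleo-base:latest,
  # so a later "yum update" in the CI job finds nothing to do for those
  # packages:
  docker build -t local/tripleo-nova-compute:latest nova-compute/
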
>> * security and the attack surface, by introducing systemd et al as
>>     additional subjects for CVE fixes to maintain for all containers.
> 
> We aren't actually using systemd within our containers. I think those
> packages are getting pulled in by an RPM dependency elsewhere. So
> rather than using 'rpm -ev --nodeps' to remove it, we could create a
> sub-package for containers in those cases and install it instead. In
> short, rather than hacking this to remove them, why not pursue a proper
> packaging fix?
> 
> In general I am a fan of getting things out of the base container we
> don't need... so yeah, let's do this. But let's do it properly.
> 
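
(For reference, a quick way to see which RPM dependency is actually
dragging systemd into the images -- run inside the base container;
repoquery comes from the yum-utils package:)

  # installed packages that require systemd
  rpm -q --whatrequires systemd
  # or, resolving through the configured repos instead
  repoquery --whatrequires systemd
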
>> * service uptime, through additional restarts of services caused by
>>     the security maintenance of components irrelevant to openstack that
>>     sit as dead weight in container images forever.
> 
> Like I said above, how often is it that these packages actually change
> while something else in the base container doesn't? Perhaps we should
> get more data here before blindly implementing a solution we aren't
> sure really helps out in the real world.
> 
>>
>> On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
>>> Changing the topic to follow the subject.
>>>
>>> [tl;dr] it's time to rearchitect container images to stop including
>>> config-time-only (puppet et al) bits, which are not needed at runtime
>>> and pose security issues, like CVEs, to maintain daily.
>>>
>>> Background: 1) For the Distributed Compute Node edge case, there are
>>> potentially tens of thousands of single-compute-node remote edge
>>> sites connected over WAN to a single control plane, which has high
>>> latency, like 100ms or so, and limited bandwidth.
>>> 2) For a generic security case,
>>> 3) TripleO CI updates all containers
>>>
>>> Challenge:
>>>
>>>> Here is a related bug [0] and implementation [1] for that. PTAL
>>>> folks!
>>>>
>>>> [0] https://bugs.launchpad.net/tripleo/+bug/1804822
>>>> [1]
>>>> https://review.openstack.org/#/q/topic:base-container-reduction
>>>>
>>>>> Let's also think of removing puppet-tripleo from the base container.
>>>>> It really brings the world in (and yum updates in CI!) for each job
>>>>> and each container!
>>>>> So if we did so, we should then either install puppet-tripleo and co
>>>>> on the host and bind-mount it for the docker-puppet deployment task
>>>>> steps (bad idea IMO), OR use the magical --volumes-from
>>>>> <a-side-car-container> option to mount volumes from some
>>>>> "puppet-config" sidecar container inside each of the containers
>>>>> being launched by docker-puppet tooling.
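
(A rough sketch of the second option, with made-up image and container
names, just to make the idea concrete:)

  # a tiny side-car image that carries only the config-time bits
  # (puppet, puppet-tripleo and co, e.g. under
  # /usr/share/openstack-puppet/modules) and declares that path as a
  # VOLUME; create it once per node:
  docker create --name puppet-config <registry>/puppet-config:latest /bin/true

  # docker-puppet would then launch each config step with
  # --volumes-from, so the modules are visible in the service container
  # without being baked into every image:
  docker run --rm --volumes-from puppet-config \
      <registry>/centos-binary-keystone:latest \
      puppet apply --modulepath /usr/share/openstack-puppet/modules ...
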
>>>>
>>>> On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
>>>> redhat.com>
>>>> wrote:
>>>>> We add this to all images:
>>>>>
>>>>> https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
>>>>>
>>>>>
>>>>> /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2 python
>>>>> socat sudo which openstack-tripleo-common-container-base rsync cronie
>>>>> crudini openstack-selinux ansible python-shade puppet-tripleo
>>>>> python2-kubernetes && yum clean all && rm -rf /var/cache/yum   276 MB
>>>>>
>>>>> Is the additional 276 MB reasonable here?
>>>>>
>>>>> openstack-selinux <- This package runs relabeling; does that kind of
>>>>> touching of the filesystem impact the size due to docker layers?
>>>>>
>>>>> Also: python2-kubernetes is a fairly large package (18007990 bytes);
>>>>> do we use that in every image? I don't see any tripleo-related repos
>>>>> importing from it when searching on Hound. The original commit
>>>>> message[1] adding it states it is for future convenience.
>>>>>
>>>>> On my undercloud we have 101 images; if we are downloading an extra
>>>>> 18 MB per image, that's almost 1.8 GB for a package we don't use. (I
>>>>> hope it's not like this? With docker layers, we only download that
>>>>> 276 MB transaction once? Or?)
>>>>>
>>>>>
>>>>> [1] https://review.openstack.org/527927
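
(On that last question: since all the images are built from the same
base layer, I believe that 276 MB yum transaction is stored and pulled
only once, not once per image. An easy way to verify, with made-up
image names, is to compare the layer digests:)

  docker inspect -f '{{json .RootFS.Layers}}' \
      <registry>/centos-binary-nova-api:latest
  docker inspect -f '{{json .RootFS.Layers}}' \
      <registry>/centos-binary-neutron-server:latest
  # layers shared with the base image show up as identical sha256
  # digests in both lists; identical layers are pulled and stored once

The multiplication only hurts when the content ends up in per-image
layers rather than in the shared base layer.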
>>>>
>>>>
>>>> -- 
>>>> Best regards,
>>>> Bogdan Dobrelya,
>>>> Irc #bogdando
>>
>>
> 


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


