[openstack-dev] [devstack] Why do we apt-get install NEW files/debs/general at job time ?
David Moreau Simard
dms at redhat.com
Tue Sep 19 23:30:35 UTC 2017
On Tue, Sep 19, 2017 at 9:03 AM, Jeremy Stanley <fungi at yuggoth.org> wrote:
> In order to reduce image sizes and the time it takes to build
> images, once we had local package caches in each provider we stopped
> pre-retrieving packages onto the images. Is the time spent at this
> stage mostly while downloading package files (which is what that
> used to alleviate) or is it more while retrieving indices or
> installing the downloaded packages (things having them pre-retrieved
> on the images never solved anyway)?
At what point does it become beneficial to build more than one image per OS,
more aggressively tuned/optimized for a particular purpose?
We could take more liberties in a devstack-specific image, such as
pre-installing packages that are provided by the base OS, etc.
Different projects could take this kind of freedom to optimize build times
according to their needs as well.
Here's an example of something we once did in RDO:
1) Aggregate the list of every package installed (rpm -qa) at the end
of several jobs
2) From that sorted and uniq'd list, work out which repositories each
package came from
3) Blacklist every package that was not installed from a base
operating system repository
(i.e., blacklist every package and dependency from RDO, since
we'll be testing those)
4) Pre-install every package that was not blacklisted in our images
The end result was a list of >700 packages completely unrelated to
OpenStack that ended up being installed anyway throughout different jobs.
To give an idea of numbers, a fairly vanilla CentOS image has ~400 packages.
You can find the (rudimentary) script that achieves this filtering here .
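For illustration, the filtering in steps 1-4 could be sketched roughly like
this (a minimal Python sketch, not the actual RDO script; the repo names,
the base-repo whitelist, and the per-job package-to-repo mappings are all
illustrative assumptions — in practice the data would come from rpm -qa
plus something like repoquery):

```python
# Sketch of the blacklist/pre-install filtering described above.
# Assumption: each job reports a {package: repository} mapping,
# e.g. derived from "rpm -qa" and repo metadata.

# Assumed set of base OS repositories (illustrative).
BASE_REPOS = {"base", "updates", "extras"}

def preinstall_candidates(jobs):
    """Aggregate packages seen across jobs and keep only those that
    came from a base OS repository (everything else, e.g. RDO
    packages, is blacklisted since those are what the jobs test)."""
    repos = {}
    for job in jobs:
        for pkg, repo in job.items():
            repos[pkg] = repo
    # Sorted, unique list of non-blacklisted packages to pre-install.
    return sorted(pkg for pkg, repo in repos.items() if repo in BASE_REPOS)

# Hypothetical per-job package lists:
jobs = [
    {"bash": "base", "python-requests": "updates", "openstack-nova": "rdo"},
    {"bash": "base", "git": "base", "openstack-neutron": "rdo"},
]
print(preinstall_candidates(jobs))  # ['bash', 'git', 'python-requests']
```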
David Moreau Simard
Senior Software Engineer | OpenStack RDO
dmsimard = [irc, github, twitter]