This repository was archived by the owner on Oct 18, 2022. It is now read-only.

Docker images contain 20MB of deleted /var/lib/apt/lists/ files #90

Closed
@edmorley

Hi! Before I dive in, I just want to say thank you for maintaining these images :-)

So I happened to notice this section present in the Dockerfiles (generated here):

# delete all the apt list files since they're big and get stale quickly
RUN rm -rf /var/lib/apt/lists/*
# this forces "apt-get update" in dependent images, which is also good

I agree it's a good idea to remove these files to force a later "apt-get update". However, the comment about saving space is not correct: deleting files in a later layer doesn't free up the space, because they are still stored in the earlier layer that added them. The comment seems to have been copy-pasted from this script (which doesn't run across multiple layers, so it actually does save space).
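To make the layer point concrete, here is a rough simulation that doesn't need Docker at all. It assumes only that each layer is shipped as its own tar archive and that a deletion is recorded as an empty `.wh.` whiteout marker in the later layer. The file name `example_Packages` and the 1 MiB size are made up for illustration.

```shell
#!/bin/bash
set -euo pipefail

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# "Layer 1": adds a 1 MiB file, standing in for the APT lists in the base archive.
mkdir -p "$work/layer1/var/lib/apt/lists"
head -c 1048576 /dev/urandom > "$work/layer1/var/lib/apt/lists/example_Packages"
tar -C "$work/layer1" -cf "$work/layer1.tar" var

# "Layer 2": the later "rm -rf" only adds a whiteout marker; it cannot
# reach back and shrink the archive that layer 1 already produced.
mkdir -p "$work/layer2/var/lib/apt/lists"
: > "$work/layer2/var/lib/apt/lists/.wh.example_Packages"
tar -C "$work/layer2" -cf "$work/layer2.tar" var

layer1_bytes=$(wc -c < "$work/layer1.tar")
layer2_bytes=$(wc -c < "$work/layer2.tar")
total_bytes=$((layer1_bytes + layer2_bytes))

echo "layer 1 (adds file):    $layer1_bytes bytes"
echo "layer 2 (deletes file): $layer2_bytes bytes"
echo "image total:            $total_bytes bytes"
```

The total comes out slightly larger than layer 1 alone, which is the same effect as the "rm -rf" step in the Dockerfile: the deletion costs a little space rather than saving any.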

Rather than just correcting the comment, it would be best to avoid the 20MB wasted space in the first place.

The files in /var/lib/apt/lists/ come from the base image archive from Canonical, which is directly extracted using the ADD command's tar file support. This cannot be switched to the curl/untar/delete pattern used in downstream images, since until the base archive is extracted there are no binaries in the image to use. As such, the removal of /var/lib/apt/lists/ needs to occur prior to the Docker build process.

This example shows the Ubuntu 16.04 image being reduced from 118MB to 97.6MB by doing exactly that...

#!/bin/bash

# Fetch base archive and Dockerfile used for the existing Ubuntu 16.04 image
curl -fLO https://partner-images.canonical.com/core/xenial/current/ubuntu-xenial-core-cloudimg-amd64-root.tar.gz
curl -fLO https://raw.githubusercontent.com/tianon/docker-brew-ubuntu-core/dist-amd64/xenial/Dockerfile

# Prepare a slimmed down version
gzip -dc ubuntu-xenial-core-cloudimg-amd64-root.tar.gz | tar --delete --wildcards 'var/lib/apt/lists/*' | gzip > rootfs-minimised.tar.gz
sed 's/ubuntu-xenial-core-cloudimg-amd64-root\.tar\.gz/rootfs-minimised\.tar\.gz/' Dockerfile > Dockerfile-new

# Compare the before/after
docker build -t ubuntu-16.04-test:before .
docker build -t ubuntu-16.04-test:after -f Dockerfile-new .
docker images ubuntu-16.04-test

Output:

REPOSITORY          TAG                 IMAGE ID            CREATED                  SIZE
ubuntu-16.04-test   after               7b258205a6b1        Less than a second ago   97.6MB
ubuntu-16.04-test   before              65cb86c05710        13 seconds ago           118MB
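For anyone wanting to sanity-check the filtering step without downloading the real archive, the sketch below runs the same kind of filter over a tiny stand-in tarball and counts the entries under var/lib/apt/lists/ before and after. It assumes GNU tar (for `--delete` and `--wildcards`); the stand-in contents and the `count_list_files` helper are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a tiny stand-in for the base archive (made-up contents).
mkdir -p "$work/rootfs/var/lib/apt/lists" "$work/rootfs/etc"
echo "dummy index" > "$work/rootfs/var/lib/apt/lists/example_Packages"
echo "demo" > "$work/rootfs/etc/hostname"
tar -C "$work/rootfs" -czf "$work/rootfs.tar.gz" var etc

# Same filter as in the script above, but with "-f -" so that stdin/stdout
# is named explicitly instead of relying on tar's default archive.
gzip -dc "$work/rootfs.tar.gz" \
  | tar -f - --delete --wildcards 'var/lib/apt/lists/*' \
  | gzip > "$work/rootfs-minimised.tar.gz"

# Hypothetical helper: count archive members below var/lib/apt/lists/.
count_list_files() {
  tar -tzf "$1" | grep -c '^var/lib/apt/lists/.' || true
}

before_count=$(count_list_files "$work/rootfs.tar.gz")
after_count=$(count_list_files "$work/rootfs-minimised.tar.gz")
kept_count=$(tar -tzf "$work/rootfs-minimised.tar.gz" | grep -c '^etc/hostname$' || true)

echo "list files before: $before_count"
echo "list files after:  $after_count"
echo "etc/hostname kept: $kept_count"
```

Running the same `count_list_files` check against the real rootfs-minimised.tar.gz would confirm that the lists are gone before the image is built.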

I guess the question will be whether to store both the original base archive and the processed one in this repo (so people can still use the hashes and compare), or whether to just store the processed one.

Also, I think it's worth asking the upstream maintainers of these base images to remove the APT lists from them, which would avoid all of this busywork. Perhaps this size-reduction use case is more compelling to them than the one outlined here:
https://bugs.launchpad.net/cloud-images/+bug/1685399

Many thanks!
