In production environments, it is often necessary to slim down container images, that is, to make the images built from a Dockerfile as small as practical. This article introduces several ways to reduce the size of Docker images.
Benefits of Slimming Images
- Reduces build time.
- Reduces disk usage.
- Reduces download time, which speeds up container startup. This is particularly important in scenarios requiring rapid scaling.
- Reduces bandwidth pressure. In large-scale scaling scenarios, many concurrent image pulls can create a bandwidth bottleneck at the image repository or on any node, which slows down scaling. Slimmed images relieve this pressure, reducing both the probability and the duration of such bottlenecks.
- Improves security by reducing the attack surface. Fewer unnecessary dependencies mean fewer potential attack targets.
Using Slim Base Images
alpine
Alpine is a lightweight Linux distribution based on musl libc and busybox, designed with security in mind, with a compressed image size of around 3MB. Many popular images provide Alpine-based variants.
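As a rough illustration of using Alpine as a base, the sketch below assumes a hypothetical binary named app that was either statically linked or built against musl libc. The ca-certificates package is only an example of a runtime dependency, and in production you would likely pin a specific Alpine version instead of latest.

```dockerfile
FROM alpine:latest
# ca-certificates is shown only as an example of a runtime package.
RUN apk add --no-cache ca-certificates
# "app" is a hypothetical binary, statically linked or built against musl.
COPY app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```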
scratch
Scratch is an empty image. If your application is a binary that contains all dependencies (not relying on dynamic libraries), you can use scratch as the base image, making the image size almost equal to that of the binary you COPY into it.
Example:
FROM scratch
COPY app /app
CMD ["/app"]
busybox
If you want the image to contain some common Linux tools, the busybox image is a good choice. BusyBox combines a large set of the most commonly used Linux commands and tools into a single toolbox, and its compressed image is only about 1 to 2MB depending on the variant, making it very convenient for building small images.
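A minimal sketch of a busybox-based image might look like the following. It assumes a hypothetical, statically linked binary named app. The difference from scratch is that a shell and basic tools remain available inside the container.

```dockerfile
FROM busybox:latest
# "app" is a hypothetical statically linked binary.
COPY app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```

With an image like this, commands such as sh, ls, or wget can still be run inside the container (for example via docker exec) when inspecting a problem.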
distroless
Distroless images contain only your application and its runtime dependencies. They do not include package managers, shells, or any other programs you would expect to find in a standard Linux distribution. Because there is no shell in the container, an attacker who compromises the application and gains access to the container has far fewer tools to work with. In other words, the fewer programs an image contains, the smaller and safer it is. The trade-off is that debugging becomes more troublesome.
Note: We should not attach a shell to the container for debugging in production environments, but rely on proper logging and monitoring.
Example:
FROM node:8 AS build
WORKDIR /app
COPY package.json index.js ./
RUN npm install
FROM gcr.io/distroless/nodejs
COPY --from=build /app /
EXPOSE 3000
CMD ["index.js"]
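To soften the debugging trade-off outside production, the distroless project publishes :debug tags that add a busybox shell. The sketch below is a variation of the example above under stated assumptions: the node:20 build image and the gcr.io/distroless/nodejs20-debian12:debug tag are not from the original example and should be checked against the tags currently published.

```dockerfile
FROM node:20 AS build
WORKDIR /app
COPY package.json index.js ./
RUN npm install

# Assumed tag: the :debug variant adds a busybox shell for troubleshooting.
# Intended for development only, not for production images.
FROM gcr.io/distroless/nodejs20-debian12:debug
COPY --from=build /app /app
WORKDIR /app
EXPOSE 3000
CMD ["index.js"]
```

An image built this way can be started with the entrypoint overridden to sh (docker run --entrypoint=sh -it followed by the image name) to look around inside the container.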
Distroless vs Alpine
If running in a production environment and focusing on security, Distroless images may be more suitable.
Every additional binary in a Docker image adds some risk to the application as a whole. Shipping only the binaries the application actually needs reduces the overall risk.
For example, if an attacker finds a vulnerability in an application running on Distroless, they cannot spawn a shell in the container because none exists.
If size is a greater concern, you can switch to an Alpine base image.
Both are very small, but with Alpine the trade-off is compatibility. Alpine uses musl libc rather than glibc as its C standard library, which may occasionally lead to compatibility issues.
Vanilla base images (full distributions such as Ubuntu) are well suited to testing and development. They are larger, but they behave like the same distribution installed on your host, and you have access to all the binaries available in that operating system.
Cleaning Package Manager Cache
When using package managers to install software packages in the Dockerfile, some cache data is often generated, which can be cleaned up to reduce image size.
Alpine
If using an Alpine base image, you can add --no-cache when installing packages with apk add, so that the package index is not stored in the image:
FROM alpine:latest
RUN apk add --no-cache tzdata ca-certificates
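The same idea extends to packages that are needed only at build time. One common pattern, sketched here under assumptions, is to install them under a virtual package name and delete them in the same RUN instruction. The file hello.c and the name .build-deps are hypothetical and used only for illustration.

```dockerfile
FROM alpine:latest
# hello.c is a hypothetical source file used only for illustration.
COPY hello.c /src/hello.c
# Install the compiler under a virtual package name, build, then remove the
# compiler in the same RUN so it never persists in an image layer.
RUN apk add --no-cache --virtual .build-deps gcc musl-dev \
    && gcc -o /usr/local/bin/hello /src/hello.c \
    && apk del .build-deps
CMD ["/usr/local/bin/hello"]
```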
Ubuntu/Debian
FROM ubuntu:latest
# Install and clean up in a single RUN: files deleted in a later layer
# still occupy space in the layer that created them.
RUN apt-get update && \
    apt-get install -y curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
Using Multi-Stage Builds
Dockerfiles support multi-stage builds, which means a single Dockerfile can contain multiple FROM instructions. By default, the final image is produced by the last stage, that is, by the instructions after the last FROM. Typically, the earlier stages are used for compilation and the last stage for packaging. The packaging stage can copy files generated during the compilation stage, so the final image retains only what is necessary to run the program.
Below is an example of a Dockerfile that uses Golang to statically compile a binary and then COPY it into a scratch image:
FROM golang:latest AS build
WORKDIR /workspace
COPY . .
# Statically compile binary
RUN CGO_ENABLED=0 go build -o app -ldflags '-w -extldflags "-static"' .
FROM scratch
# Copy binary to empty image
COPY --from=build /workspace/app /usr/local/bin/app
CMD ["app"]
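One practical caveat with scratch is that it contains no files at all, including CA certificates. If the program makes outbound TLS connections, a possible variation is to copy the certificate bundle from the build stage as well. This sketch assumes the Debian-based golang image, where the bundle is located at /etc/ssl/certs/ca-certificates.crt.

```dockerfile
FROM golang:latest AS build
WORKDIR /workspace
COPY . .
# Statically compile binary
RUN CGO_ENABLED=0 go build -o app -ldflags '-w -extldflags "-static"' .

FROM scratch
# scratch has no CA bundle; copy the one from the build stage so that
# outbound TLS connections can verify certificates.
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
# Copy binary to empty image
COPY --from=build /workspace/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```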