Shrinking your Docker containers using Multistaging

Your average container image can waste a lot of space if you are not careful. Even if you follow some good practices, you can still end up with an image that weighs hundreds of megabytes, maybe more, even if all you want to run is a 6 MB binary. What can we do?

Note: all files used in this post are available here

Let’s begin with a basic HTTP server written in Go. This example is inspired by this golang tutorial.

package main

import (
   "fmt"
   "log"
   "net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
   fmt.Fprintf(w, "I'm a web server running in go!")
}

func main() {
   http.HandleFunc("/", handler)
   log.Fatal(http.ListenAndServe(":8080", nil))
}
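
Before putting the server in a container, it can help to confirm that the handler answers as expected. The following is a minimal sketch, not part of the original files: it repeats the handler from above and calls it in-process with the standard net/http/httptest package, so no port or container is needed. The checkHandler helper name is my own, hypothetical choice.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// handler is the same handler used by the server above.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "I'm a web server running in go!")
}

// checkHandler (hypothetical helper) calls the handler in-process
// and returns the status code and body it produced.
func checkHandler() (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := checkHandler()
	fmt.Println(status, body) // prints: 200 I'm a web server running in go!
}
```

If this prints a 200 status and the greeting, any problem that shows up later is more likely to come from the image than from the Go code.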

The dystopian approach

Let’s begin with a classic example that works, but produces an image that is far too large. The Dockerfile is as follows:

FROM ubuntu
RUN apt-get update -y -q && apt-get upgrade -y -q
RUN DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y -q curl build-essential \
    ca-certificates git && rm -fr /var/lib/apt/lists/*
RUN curl -s https://storage.googleapis.com/golang/go1.10.1.linux-amd64.tar.gz| tar -v -C /usr/local -xz
ENV PATH $PATH:/usr/local/go/bin
RUN adduser --system --home /app user
ADD go-http-server.go /app/
WORKDIR /app
RUN go build -o main .
USER user
CMD ["./main"]

There are some good practices here, like installing the dependencies in a single layer and running the Go binary as an unprivileged user. The overall strategy is still a disaster: you don’t need a full Ubuntu distribution, a system update and upgrade, all the build libraries, and the whole Go toolchain just to run a 6 MB executable. If you build the previous example you will end up with a 705MB image.

$ docker build -t golang:ubuntu . -f Dockerfile.ubuntu
$ docker images
REPOSITORY    TAG        IMAGE ID       CREATED          SIZE
golang        ubuntu     fb558c6a0209   2 minutes ago    705MB

We can do better, right?

The Alpine Linux Approach

Replacing the base image with Alpine Linux usually makes your containers much lighter. Some of your initial commands may change, since Alpine uses a different package manager (apk) and BusyBox versions of tools such as adduser. Many popular official images also ship an Alpine variant. For this example, I will use golang:alpine, a small image with the Go toolchain already installed.

FROM golang:alpine
RUN mkdir /app
ADD go-http-server.go /app/
WORKDIR /app
# Name the source file explicitly so the build also works without a go.mod file
RUN go build -o main go-http-server.go
RUN adduser -S -D -H -h /app user
USER user
CMD ["./main"]

Since the Go toolchain is already present in the image, we just compile the HTTP server, create the user that will run the app, and we are done. If we build the image we end up with a 357MB image: about half the size of the previous approach, but still large if we just want to run a 6MB HTTP server, right?

$ docker build -t golang:standard . -f Dockerfile.standard
$ docker images
REPOSITORY    TAG        IMAGE ID       CREATED          SIZE
golang        standard   4121327179ac   3 seconds ago    357MB

And then, there is…

Multistage container building

Imagine that you could get rid of everything you don’t need in your container. In this case, I’m talking about everything needed to build the binary. Go compiles to a single self-contained executable, so it can run on a plain Alpine image. But you still need the Go toolchain to compile the code, right? Take a look at the next Dockerfile:

FROM golang:alpine as builder
RUN mkdir /build
ADD . /build/
WORKDIR /build
# Name the source file explicitly so the build also works without a go.mod file
RUN go build -o main go-http-server.go
FROM alpine
RUN adduser -S -D -H -h /app user
USER user
COPY --from=builder /build/main /app/
WORKDIR /app
CMD ["./main"]

This Dockerfile uses the golang:alpine image as a first stage (named builder) to compile the binary in the /build directory. It then defines a second stage based on plain Alpine Linux, which copies the resulting binary (we already know it will be called main) from /build in the builder stage into its own /app directory. Only this last stage ends up in the final image. How good is that solution?

$ docker build -t golang:multistage . -f Dockerfile.multistage
$ docker images
REPOSITORY TAG        IMAGE ID     CREATED       SIZE
golang     multistage aab982a2c2b2 3 seconds ago 12.9MB

We have reduced our image from 705MB to 12.9MB! But we can do even better!

FROM scratch

Well, we have removed all the build dependencies from our container so it is lighter. What can we do next?

We can use scratch, a reserved, empty image, as the starting point for our containers:

FROM golang:alpine as builder
RUN mkdir /build
ADD go-http-server.go /build/
WORKDIR /build
# Name the source file explicitly so the build also works without a go.mod file
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-w -s' -o main go-http-server.go
FROM scratch
COPY --from=builder /build/main /app/
WORKDIR /app
CMD ["./main"]

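As you can see, the go build command optimizes the compilation by stripping debug information (-ldflags '-w -s'). It also disables cgo (CGO_ENABLED=0), so the binary is statically linked and does not depend on a C library, which scratch does not provide. The result will be even smaller than using Alpine Linux!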

$ docker build -t golang:scratch . -f Dockerfile.scratch 
$ docker images 
REPOSITORY TAG      IMAGE ID     CREATED        SIZE
golang     scratch  52cd30af14f1 39 seconds ago 5.34MB

The choice is yours!

$ docker images 
REPOSITORY TAG        IMAGE ID     CREATED           SIZE
golang     scratch    52cd30af14f1 36 minutes ago    5.34MB
golang     multistage aab982a2c2b2 42 minutes ago    12.9MB
golang     standard   4121327179ac About an hour ago 357MB
golang     ubuntu     fb558c6a0209 About an hour ago 705MB
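
To put the listing above in perspective, here is a small sketch that computes how much smaller each image is compared with the Ubuntu-based one. The sizes are copied from the docker images output; the image type and the reductionPercent helper are illustrative names of my own.

```go
package main

import "fmt"

// image pairs a tag from the `docker images` listing with its reported size.
type image struct {
	tag    string
	sizeMB float64
}

// reductionPercent returns how much smaller size is than baseline, in percent.
func reductionPercent(baseline, size float64) float64 {
	return (1 - size/baseline) * 100
}

func main() {
	// Sizes copied from the listing above; the first entry is the baseline.
	images := []image{
		{"ubuntu", 705},
		{"standard", 357},
		{"multistage", 12.9},
		{"scratch", 5.34},
	}
	baseline := images[0].sizeMB
	for _, img := range images[1:] {
		fmt.Printf("%-10s %6.2f MB  %.1f%% smaller than ubuntu\n",
			img.tag, img.sizeMB, reductionPercent(baseline, img.sizeMB))
	}
}
```

With these numbers, the Alpine-based image is roughly 49% smaller, the multistage image roughly 98% smaller, and the scratch image roughly 99% smaller than the Ubuntu-based one.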

 
