Docker Learnings

May 21, 2024

I recently worked on a project to optimize a Docker image for a node app and address some problems we'd been experiencing. We use Buildkite, and the Buildkite agents were having out of memory problems during the step that builds the image. Also, the image seemed too big at 2.7GB.

Briefly, here are some of the things I learned during my investigation:

  • Use layering to avoid unnecessarily re-running steps. For example, if you copy over the package.json file and run install before copying over everything else and running yarn build, Docker will only re-install them if the package.json has changed. If, on the other hand, you first copy over everything, Docker will run install every time any file changes, regardless of whether it's actually needed.
  • Use multistage builds: the Docker image will only be as large as it needs to be to contain the precise files that you need.
  • Consider using turbo prune Assuming you work in a monorepo, this will generate a folder with just the files you need.
  • Consider using the output: 'standalone' option in next.config.js, which creates a folder with just the files you need (see docs)

I didn't use Turbo or the output: 'standalone' option, although I'd like to explore these in a follow-up phase. Instead, I built the app in a seperate buildkite pipeline step prior to the Docker build step. While I'm not positive this is the best solution, it does seem to be correlated with reducing memory usage of the Docker build step (see images below).

After building the app, we use a multistage build to, first, import the app, then remove all the parts we don't need for production (src files, etc), then copy just what we need into the final stage.

At the end of this process, our Docker image was down from nearly 2.7GB to more like 1.8GB. Here's a Dockerimage that shows our basic approach:

# First of two stages of the multistage Dockerfile
FROM nodeBaseImage as installer

WORKDIR /home/node/app

# Copy package.json and install in a layer before other layers
COPY ./package.json package.json
RUN yarn --immutable

COPY . .

# The tools and libs folders contain many sub-folders that we don't need; this command removes themn
RUN find tools/ libs/ -type d \( -name src -o -name translations -o -name cypress -o -name .storybook -o -name .turbo \) ! -path "*/dist/*" -exec rm -rf {} +
# Delete all the files in tools and libs except those in the dist folders
RUN find tools/ libs/ -type f ! -path "*/dist/*" -exec rm -f {} +

# Set the user to be node for security reasons.
# This is a best practice per https://github.com/nodejs/docker-node/blob/main/docs/BestPractices.md#non-root-user
USER node

# Make a second stage
FROM nodeBaseImage

WORKDIR /home/node/app

COPY --from=installer home/node/node_modules node_modules
COPY --from=installer home/node/apps/my-app apps/my-app

EXPOSE 3000

CMD [ "yarn", "workspace", "@namespace/app-name", "next"]

Some useful resources:

Screenshots

docker-phobia Docker phobia

memory Memory usage on Builkite agent