Building multi-architecture Docker images: why and how

Docker is quite a famous piece of tech. It solves the problem of making software that runs easily across machines (mostly for Linux servers). I won’t get into details, but it uses kernel-level features to isolate containerized programs so that we don’t need to bother about installing the right dependencies on every machine we want our software to run on. Instead, we build a Docker image, publish it, and whoever has Docker installed on their machine can then run it.

What I’ve said above is true. Well, most of the time. If the machines you want to run your software on have different architectures, some extra work is needed to make it seamless.

This article will show you when such a problem happens, what computer architectures are (in a nutshell), and finally how to build docker images that work for multiple architectures. Enjoy!

When `docker run` did not work on my shiny new computer

Early in 2021 I sold my 2017 15″ MacBook Pro and got myself a MacBook Air M1. As it was my sole computer, I had to use it to run Docker, which was still in beta for the M1 at the time. A co-worker published a Docker image in our private GitLab registry, which I then tried to run. It was along the lines of:

$ docker run some-image
... exec user process caused "exec format error".

The first time I saw the error above, I wondered what the heck was going on. After using my favorite search engine, I found out the error was saying “we have an incompatible binary”. The image was built for another architecture, in this case x86_64 (popular due to Intel and AMD CPUs), while my new toy (the M1) was aarch64 (64-bit ARM).
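If you are curious which architecture a given Linux binary was built for, the answer sits in its ELF header, in a field called e_machine. Below is a minimal sketch, assuming a little-endian ELF file (which covers x86_64 and aarch64 Linux binaries). The function name elf_machine is made up for this example and only two architectures are mapped.

```shell
#!/bin/sh

# Print the architecture an ELF binary targets, based on the 2-byte
# e_machine field stored at offset 18 of the ELF header.
# Assumes a little-endian ELF file; only two values are mapped here.
elf_machine() {
  file=$1
  # The first four bytes of every ELF file are 0x7f 'E' 'L' 'F'
  magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
  if [ "$magic" != "7f454c46" ]; then
    echo "not-elf"
    return 1
  fi
  # od prints the two bytes as unsigned decimals, e.g. " 62 0"
  set -- $(od -An -tu1 -j18 -N2 "$file")
  case $(( ${1:-0} + ${2:-0} * 256 )) in
    62)  echo "x86_64" ;;
    183) echo "aarch64" ;;
    *)   echo "unknown" ;;
  esac
}

# Try it on the shell binary itself (prints not-elf on macOS, where
# executables use the Mach-O format instead of ELF)
elf_machine /bin/sh || true
```

When the value printed for a binary inside an image does not match the machine you are on, and no emulation is configured, you get the “exec format error” shown above.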

What are computer architectures?

Computers do not understand each other by default. For one to run a piece of software, you need a binary program matching a specific Instruction Set Architecture (ISA).

An ISA specifies the behavior of machine code running on implementations of that ISA in a fashion that does not depend on the characteristics of that implementation, providing binary compatibility between implementations. - Wikipedia

For this article, the above is basically what we need to know about computer architectures.

Why do we have different computer architectures?

An architecture defines how a processing unit understands binary code. However, it also defines how the components of a computer system are organized, which by itself can impact the real world in so many ways. As we have applications with their own goals, different architectures naturally emerge.

Take Intel for instance. They build processing units that are backward compatible and most of the instructions supported 10 years ago or more are still there. This may limit how performant or efficient they can make a system, as they will avoid breaking it for customers.

On the other hand, when smartphones came out and efficiency was paramount, using an architecture with a lot of cruft and legacy components like amd64 was just not an option. Architectures like Arm, already leaner due to being RISC designs, were built with that purpose in mind and managed to be efficient enough to power smartphones whose batteries last the whole day (sometimes).

For similar reasons, we are starting to see Arm on the server. Big cloud providers are now offering Arm-based virtual machines. Amazon has its Graviton processors, which from what I recall are cheaper than their x86 counterparts. They even have Apple M1 Mac instances. Oracle has Ampere Altra Processors and they even offer a quite interesting always free tier that includes 4 Arm cores and 24GB of RAM per month.

One could even build an energy-efficient server using a Raspberry Pi or equivalent Single Board Computer.

As I am interested in deploying a server-side app, those options became quite endearing, which led me once again to needing Docker images that run on non-amd64 architectures.

What are multi-architecture docker images?

A docker image may support a single architecture or more. Many images support only one, and those are simpler to make. At the time of writing, the swift docker image only supports amd64 (x86_64).

On the other hand, the Ubuntu image supports a lot of architectures.

When you docker pull a multi-arch Docker image, Docker identifies which variant is needed for your computer architecture, or for the architecture you request with the --platform command-line argument. For instance, let’s try to pull ubuntu for arm64:

$ docker pull ubuntu --platform arm64
Using default tag: latest
latest: Pulling from library/ubuntu
5f3d23ccb99f: Pull complete
Digest: sha256:b5a61709a9a44284d88fb12e5c48db0409cfad5b69d4ff8224077c57302df9cf
Status: Downloaded newer image for ubuntu:latest
docker.io/library/ubuntu:latest

That worked nicely. I could have used amd64 and it would have worked as well, because the ubuntu image supports it:

$ docker pull ubuntu --platform amd64
Using default tag: latest
latest: Pulling from library/ubuntu
ea362f368469: Pull complete
Digest: sha256:b5a61709a9a44284d88fb12e5c48db0409cfad5b69d4ff8224077c57302df9cf
Status: Downloaded newer image for ubuntu:latest
docker.io/library/ubuntu:latest

As my computer is arm64, it may not be able to run the amd64 image unless some emulation technology like QEMU is used.
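To know which of the two images your own machine runs natively, you can translate what uname -m reports into the names Docker uses in platform strings. This is just a small sketch covering the two architectures discussed in this article. The function name docker_arch is my own and not part of Docker.

```shell
#!/bin/sh

# Map a machine name as reported by `uname -m` to the architecture
# name used in Docker platform strings such as linux/amd64.
# Only the two architectures from this article are covered.
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *)             echo "unknown" ;;
  esac
}

host_arch=$(docker_arch "$(uname -m)")
echo "This host natively runs linux/${host_arch} images"
```

Linux reports aarch64 while macOS on an M1 reports arm64, which is why both spellings map to the same Docker name.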

If we tried to pull an arm64 image for swift in the same way, it wouldn’t work:

$ docker pull swift --platform arm64
Using default tag: latest
latest: Pulling from library/swift
no matching manifest for linux/arm64 in the manifest list entries

We will soon see how to address the above by building our own multi-arch image with arm64 and amd64.
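The “manifest list entries” in that error are the per-platform entries a tag points to. The sketch below imitates the lookup on a hand-written, trimmed manifest list in the shape that docker manifest inspect prints. The digest is a placeholder and the helper names are made up. This is only an illustration of the idea, not how Docker implements it.

```shell
#!/bin/sh

# Illustrative, trimmed manifest list in the shape printed by
# `docker manifest inspect`; the digest is a placeholder, not a real value.
manifest_list='{
  "schemaVersion": 2,
  "manifests": [
    { "digest": "sha256:<amd64-digest>",
      "platform": { "architecture": "amd64", "os": "linux" } }
  ]
}'

# List the architectures named in a manifest list read from stdin.
list_architectures() {
  grep -o '"architecture": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed if the manifest list on stdin names the given architecture.
supports_arch() {
  list_architectures | grep -qx "$1"
}

for arch in amd64 arm64; do
  if printf '%s\n' "$manifest_list" | supports_arch "$arch"; then
    echo "linux/$arch: found"
  else
    echo "linux/$arch: no matching manifest"
  fi
done
```

With only an amd64 entry in the list, the arm64 lookup comes back empty, which is the situation the swift image was in.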

How to build multi-architecture docker images?

What do you need?

There are two ways to build multi-arch Docker images:

Having a machine for each architecture you want to build for.
Using QEMU.

The first option is faster and less error-prone. I had many issues when trying to build swift apps using QEMU, on both arm64 and amd64 hosts. It was very slow and it often crashed midway for reasons I don’t recall.

The rest of this article uses both options, each in the example it suits best.

Multi-arch swift image with QEMU

Earlier we mentioned the swift image does not support arm64 at the time of this writing. However, we do have swift docker images for arm made by swift-arm. In a Dockerfile, you could have a variable to indicate which image to use for the current architecture, but it is just neater if you have a multi-arch docker image, as you just write its name.

As the images are already built, the process is quite simple:

Create arch-specific tags
Create a manifest
Publish it

Create arch-specific tags

#!/usr/bin/env bash

export SWIFT_VERSION=5.5.2
# change "ataias" to your own username here to be able to publish it
export IMAGE="ataias/swift:${SWIFT_VERSION}-focal"

# re-tag the official amd64 image under our own name
echo "FROM swift:${SWIFT_VERSION}-focal" | \
  docker build -t "${IMAGE}-amd64" --platform linux/amd64 -

# re-tag the swift-arm arm64 image under our own name
echo "FROM swiftarm/swift:${SWIFT_VERSION}-ubuntu-focal" | \
  docker build -t "${IMAGE}-arm64" --platform linux/arm64 -

The code above uses the existing docker images from swift and swift-arm to create two local images:

ataias/swift:5.5.2-focal-amd64
ataias/swift:5.5.2-focal-arm64

We can run those commands from a computer that has Docker enabled with QEMU. On an Apple M1, those usually just work. On Linux, you may need to do some extra configuration, but it should be simple to find with your favorite search engine.
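On Linux, emulation is usually wired up through the kernel’s binfmt_misc mechanism, so a quick check is to look for a registered QEMU handler. The sketch below assumes the handlers are named qemu-aarch64 and qemu-x86_64, which is what the common qemu-user-static packages and the tonistiigi/binfmt image register as far as I know. Your setup may differ, and the function name has_qemu_handler is made up.

```shell
#!/bin/sh

# Check whether a QEMU binfmt_misc handler is registered for an
# architecture. The directory argument exists only to make the
# function easy to try out; it defaults to the kernel's location.
# Handler names are an assumption (see the text above).
has_qemu_handler() {
  arch=$1
  dir=${2:-/proc/sys/fs/binfmt_misc}
  case "$arch" in
    arm64) qemu_name=qemu-aarch64 ;;
    amd64) qemu_name=qemu-x86_64 ;;
    *)     return 2 ;;
  esac
  [ -e "$dir/$qemu_name" ]
}

for arch in arm64 amd64; do
  if has_qemu_handler "$arch"; then
    echo "$arch: QEMU handler registered"
  else
    echo "$arch: no QEMU handler found (or this is the native architecture)"
  fi
done
```

If a handler is missing, one commonly used way to register it is docker run --privileged --rm tonistiigi/binfmt --install arm64 (or amd64), but check the documentation for your distribution first.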

Create a manifest

What we did previously is nothing new. We just renamed the images, but they are still separate. We need to create a “manifest” which is what docker uses to define what different images a given tag may be connected to. That’s simple enough continuing our previous commands:

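#!/usr/bin/env bash

export SWIFT_VERSION=5.5.2
export IMAGE="ataias/swift:${SWIFT_VERSION}-focal"

# docker manifest create reads the images from the registry, so the
# arch-specific tags must be pushed first (requires docker login)
docker push "${IMAGE}-arm64"
docker push "${IMAGE}-amd64"

docker manifest create -a "$IMAGE" \
  "${IMAGE}-arm64" \
  "${IMAGE}-amd64"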

Publish it

If you are logged in to Docker Hub, you can publish it now:

docker manifest push $IMAGE

That’s it!

You can find the code for this section on the repo ataias/swift-docker-multi-arch on GitLab.
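The publishing steps can also be folded into one small helper that works for any list of architectures. This is a sketch of my own and not the code from the repo. The names run and publish_multi_arch are hypothetical, and it only prints the commands unless you set DRY_RUN=0. It pushes the per-arch tags first because docker manifest create looks the images up in the registry.

```shell
#!/bin/sh

# Print (DRY_RUN=1, the default here) or execute (DRY_RUN=0) a command.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: push the per-arch tags, then create and push
# the manifest list that groups them under a single tag.
publish_multi_arch() {
  image=$1
  shift
  refs=""
  for arch in "$@"; do
    run docker push "${image}-${arch}"
    refs="$refs ${image}-${arch}"
  done
  # $refs is left unquoted on purpose so each reference is one argument
  run docker manifest create -a "$image" $refs
  run docker manifest push "$image"
}

publish_multi_arch "ataias/swift:5.5.2-focal" arm64 amd64
```

Running it as is prints the four docker commands in order, which is a cheap way to review them before letting the script touch the registry.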

Multi-arch swift-format image built natively

The previous case was simple enough that we ran Docker on a single machine with QEMU. However, there are cases where building on different machines is preferred or necessary, for instance if you need to compile a non-trivial application.

The process could simply be like the previous one, something like:

Build an arm64 image on your arm64 machine
Build an amd64 image on your amd64 machine
Push both images to a registry (docker manifest reads them from there)
Use docker manifest

That’s it. Another option is to run the commands from the same machine, but delegate the building of the non-native architecture to another machine. That’s what I did in the repo docker-swift-format, which resulted in a 56-line script. The main difference from the approach I just listed is the docker command used to build:

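#!/usr/bin/env bash

# this is an amd64 machine on my local network, set up in ~/.ssh/config with this name
export HOST_AMD64=ssh://lucy

# PROD_IMAGE, BASE_DEV_IMAGE and BASE_PROD_IMAGE are assumed to be set earlier in the full script
docker --host "$HOST_AMD64" \
  build \
  -t "${PROD_IMAGE}-amd64" \
  --build-arg BUILDER_IMAGE="$BASE_DEV_IMAGE" \
  --build-arg RUNTIME_IMAGE="$BASE_PROD_IMAGE" \
  --platform linux/amd64 \
  .  # build context; assumed to be the current directory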

You can look at the repo for more info, and you can also look at the image on Docker Hub.
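The “build natively where you can, delegate the rest” idea can be sketched as a small decision function. This is an illustration and not the script from the repo. The variable names REMOTE_AMD64 and REMOTE_ARM64 and the image name example/swift-format:latest are placeholders, and the functions only print the command they would run.

```shell
#!/bin/sh

# Decide where to build a target architecture: an empty result means
# the local daemon, anything else is a remote Docker host.
# REMOTE_AMD64 and REMOTE_ARM64 are hypothetical variable names.
build_host_for() {
  target=$1
  native=$2
  if [ "$target" = "$native" ]; then
    echo ""
    return
  fi
  case "$target" in
    amd64) echo "${REMOTE_AMD64:-ssh://lucy}" ;;
    arm64) echo "${REMOTE_ARM64:-}" ;;
  esac
}

# Print the docker build command for one target architecture.
build_command() {
  target=$1
  native=$2
  image=$3
  host=$(build_host_for "$target" "$native")
  if [ -n "$host" ]; then
    echo "docker --host $host build -t ${image}-${target} --platform linux/${target} ."
  else
    echo "docker build -t ${image}-${target} --platform linux/${target} ."
  fi
}

# On an arm64 machine: amd64 goes to the remote host, arm64 stays local
build_command amd64 arm64 example/swift-format:latest
build_command arm64 arm64 example/swift-format:latest
```

When no remote host is configured for a non-native target, the sketch falls back to the local daemon, which then relies on QEMU emulation.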

Conclusion

We learned about:

what computer architectures are;
how to create multi-arch Docker images to make using different architectures seamless with docker;
how to create a swift amd64/arm64 image; and
how to create a swift-format amd64/arm64 image at a high level, with a repo link for more details.

That’s it. Hope you enjoyed it! Reach out if you have any feedback, positive or negative.
