At AnnounceKit we use Kubernetes for our cluster management. We first deployed on GCP using its fully managed Kubernetes service, and later switched to EKS, the managed Kubernetes service on AWS.
Given that we are now running on AWS, and that AWS has been heavily pushing its ARM-based server instances, we decided to give them a try and see how things work. (Our recent switch to M1 Macs also played a small role in this push.) Here’s how we approached it:
Docker Images
The obvious issue with this switch is that we need to build our Docker images for the ARM64 architecture. We have been using GitHub Actions to build our x86 images, so naturally that is where we looked for a solution.
On the GitHub Actions front, one option is a self-hosted action runner. Self-hosted runners support ARM architectures and might be the best solution if you only want to produce ARM64 builds.
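If you go that route, the workflow simply targets the runner by its labels. Here is a minimal sketch, assuming a self-hosted ARM64 runner is already registered (GitHub applies the self-hosted and ARM64 labels to such runners automatically; the job name and image tag are ours for illustration):

jobs:
  build-arm:
    # Runs only on a self-hosted runner carrying the ARM64 label
    runs-on: [self-hosted, linux, ARM64]
    steps:
      - uses: actions/checkout@v2
      - name: Build a native ARM64 image
        run: docker build -t announcekit:arm64 .   # hypothetical tag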
We wanted to create multi-arch images at first, to make the switch easier to handle and to be able to roll things back fast. So we needed to use the new buildx pipeline, which can leverage QEMU for ARM emulation. It is as simple as adding docker/setup-qemu-action and docker/setup-buildx-action, then specifying platforms on the build-push action.
Here’s our current GitHub Actions snippet for building both x86 and ARM images:
jobs:
  build:
    runs-on: ubuntu-latest   # a standard x86 runner is fine; QEMU handles the ARM emulation
    steps:
      - uses: actions/checkout@v2
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      - name: Build and push
        uses: docker/build-push-action@v2
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: |
            ${{ secrets.AWS_REPO }}:${{ env.GITHUB_SHA }}
            ${{ secrets.AWS_REPO }}:latest
          cache-from: type=registry,ref=${{ secrets.AWS_REPO }}:latest
          cache-to: type=inline
It is this simple. Now, whenever we build, each pushed tag is a multi-arch manifest with layers for both platforms; you can run the same image on an x86 machine or an ARM machine. So, we are getting close.
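If you want to verify that a tag really carries both platforms, you can inspect its manifest; the image reference below is a placeholder for your own repository:

# Lists the manifest entries; expect both linux/amd64 and linux/arm64
docker buildx imagetools inspect your-repo/announcekit:latest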
Kubernetes
As mentioned before, we use Kubernetes for cluster management, and until recently we had an EKS cluster with a single managed node group of good old x86 instances. EKS clusters (at least recent versions) are completely ready to be used with both ARM and x86 instances: you can simply add a Graviton-based node group and the nodes will start chugging along.
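With eksctl, for instance, adding such a group could look roughly like this. This is a sketch, and the cluster name, node group name, and instance type are placeholders rather than our actual setup:

# graviton-nodegroup.yaml -- create with: eksctl create nodegroup -f graviton-nodegroup.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster            # placeholder cluster name
  region: us-east-1
managedNodeGroups:
  - name: graviton-nodes      # placeholder node group name
    instanceType: m6g.large   # an ARM (Graviton2) instance type
    minSize: 1
    maxSize: 3
    desiredCapacity: 2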
That is what we would have done, except we have a couple of Kubernetes deployments that we cannot build for ARM: we may not have control over the base image or the binary blobs going into the build. In any case, we needed some auxiliary x86 capacity for our incompatible pods. This means we will be running a hybrid cluster, and that means we need to somehow schedule pods onto the correct architectures.
There are multiple ways to do this: node affinities, taints and tolerations, node selectors, and probably some other methods. We will go with the simplest, a nodeSelector on the pod spec.
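For comparison, the same constraint expressed as a required node affinity would look roughly like this (a sketch only; the nodeSelector version we actually use is shown further below):

# Node-affinity form of the architecture constraint (goes under the pod spec)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - arm64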
Each node in your cluster has a set of predefined labels provided by the platform. You can also add custom labels of your own, but the information we need is already there. Just by running kubectl describe nodes, we can see the built-in labels applied to our nodes:
Labels:  Name=announcekit-******************
         beta.kubernetes.io/arch=amd64
         beta.kubernetes.io/instance-type=******************
         beta.kubernetes.io/os=linux
         eks.amazonaws.com/capacityType=ON_DEMAND
         eks.amazonaws.com/nodegroup=******************
         eks.amazonaws.com/nodegroup-image=******************
         failure-domain.beta.kubernetes.io/region=us-east-1
         failure-domain.beta.kubernetes.io/zone=us-east-1a
         kubernetes.io/arch=amd64
         kubernetes.io/hostname=******************
         kubernetes.io/os=linux
         node.kubernetes.io/instance-type=******************
         topology.kubernetes.io/region=us-east-1
         topology.kubernetes.io/zone=us-east-1a
Using these labels, you can schedule your pods onto a specific OS, a specific region or zone, and, to our luck, a specific architecture. kubernetes.io/arch=amd64 tells us that this node is running on amd64; the Graviton instances will show up as arm64 here.
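A quick way to see the architecture of every node at a glance is to print that label as an extra column:

kubectl get nodes -L kubernetes.io/arch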
It is now as easy as targeting the specific label in your pod specs, as in:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: announcekit
spec:
  replicas: 1
  selector:
    matchLabels:
      app: announcekit
  template:
    metadata:
      labels:
        app: announcekit
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      containers:
        - image: cilium/echoserver
          imagePullPolicy: Always
          name: announcekit-cname
          ports:
            - containerPort: 80
      restartPolicy: Always
The important part here is the nodeSelector: it states that this deployment’s pods must run on the amd64 architecture. If there are no available nodes with the given label, the pods will fail to schedule and remain Pending.
From here on, it is as simple as pinning specific pods to specific architectures. And since we now build multi-architecture images, switching a workload over is as simple as flipping the selector configuration.
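In other words, once a deployment’s image is multi-arch, moving it onto Graviton is a one-line change in the pod spec:

nodeSelector:
  kubernetes.io/arch: arm64   # flipped from amd64; the multi-arch image runs unchanged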