Immutable containers and trash-talking tags

This is gonna get geeky

I have been weirdly obsessed with immutability recently. I’m going to do my best to explain why, and how I’ve solved it. Partly for me so I can re-read this in a year, but also partly because I keep seeing people not bothering and it’s bothering me.

I’m not going to explore why it’s bothering me. Y’all get tech or emotional labour, and right now it’s tech.

Write once, run anywhere. That was the promise Sun Microsystems made when it came up with the cross-platform Java Virtual Machine. That was way back in the 90s, and it promised an exciting new way to write code: code that was compiled once for a virtual machine and could therefore run on any operating system.

You can imagine how that went when I tell you that before long Java developers had repackaged that idea as “Write once, debug everywhere”. Still, we’ve had thirty years. Things must have got better.

No. Actually, I think in some places they’ve got worse. In particular, Docker. If I’m being particularly particular, Docker plus my absolute bugbear: a tag that just says ‘latest’.

When nerds like me talk about continuous integration and continuous deployment, we’re talking about two slightly different things:

Continuous integration is making sure that I don’t go off on a massive tangent: I have to combine my code back into the main branch, and get everyone else’s code off the main branch, at least once a day. It makes sure we’re all moving in the same direction.
Continuous deployment means that code is deployed without human interaction into the production environment. No touching, no tweaking.

One of the ways we should achieve this is through immutable builds. Immutable means ‘unchangeable’. Advances in technology over the last thirty years mean we can now deploy the infrastructure to support our code, as well as our code. In the old days, we had to put code onto servers. Now we can code servers into existence, and tear them down when we don’t need them any more.

Into this idyllic world came two things that threaten the bucolic atmosphere: docker build and :latest.

docker build is the command we use to turn a recipe of code and infrastructure (the Dockerfile) into an artefact ready to be deployed into a production environment and make users happy. :latest is the tag Docker adds by default to any artefact you don’t explicitly tag before it’s sent on its merry way. Combined, they make immutable builds very tricky. Let’s start with why :latest is a horrible idea:

Imagine you are putting stuff in a box, and you label every box with the name of your project:

[Image: a cardboard box that says 'important code']

and then at some point someone comes down and says it’s all gone wrong and you need to go back to how you were doing before it went wrong. At this point you’re in trouble because there’s no way of knowing which box marked ‘important code’ is the right one.

Tagging each box as you go with ‘latest’ does not help you at all.

[Image: a cardboard box. On one side it says 'important code'. On another side it says 'latest'.]

Perhaps software people should spend time working with committees, because I’ve done my time and I know for a stone-cold fact that two people reading `2022-01-01 business case -FINAL_FINAL_jk_edits-FINAL` are absolutely guaranteed to be reading different documents.

And the same goes for tagging things ‘staging’ and ‘dev’. That doesn’t help either! In fact, that’s worse, because then I suspect you’re re-building the box between stages. And when your Dockerfile looks like this:

RUN apk add --update --no-cache tzdata && \
    cp /usr/share/zoneinfo/Europe/London /etc/localtime && \
    echo "Europe/London" > /etc/timezone

RUN apk add --update --no-cache --virtual runtime-dependances \
 yarn postgresql-contrib libpq less postgresql-dev shared-mime-info

RUN apk add --update --no-cache --virtual build-dependances \
 build-base  && \
 bundle install --jobs=4 && \
 apk del build-dependances

…which, if I clean it up a bit, actually reads:

DOWNLOAD a bunch of random stuff from the internet, I dunno, whatever man

ALSO here I'd just like whatever some random bloke has added to his repo overnight while he was way too tired

DOWN HERE honestly buddy is there really that much difference between version 1 and version 2? Just lay it on me

I’m getting concerned that the immutable build object you’re supposedly passing along your deployment pipeline is actually three different objects that you’re just putting in the same clothes.

[Image: an elephant, a bus, and a cardboard box. All have 'important code' and 'latest' very badly photoshopped onto them.]

Pictured: your build pipeline.

The solution is to tag your artefact with something you can come back to. I like to use the hash of the commit that you’re building, because then you can mostly associate a build artefact with the code used to create it. Plus, it makes it easy peasy to deploy, even if you’re still using pull requests and haven’t ascended to the heights of trunk-based development.
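Here’s a rough sketch of what tagging by commit hash can look like, assuming git and docker are on your path. The names image_ref, build_and_push and DRY_RUN are made up for illustration, and my-awesome-code is a placeholder image name. It is one way to do it, not the only way.

```shell
#!/usr/bin/env bash
# Sketch only: image_ref, build_and_push and DRY_RUN are hypothetical names.

# Print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Join an image name and a commit hash into a full image reference.
image_ref() {
  echo "$1:$2"
}

# Build the image for the given commit hash, then push it to the registry.
build_and_push() {
  local ref
  ref="$(image_ref "$1" "$2")"
  run docker build -t "$ref" . &&
    run docker push "$ref"
}

# Usage, from the branch you are building:
#   build_and_push my-awesome-code "$(git rev-parse --short HEAD)"
```

One thing to watch: some CI systems check out a temporary merge commit for pull requests rather than the branch head, in which case you’d want to take the branch head’s hash from the CI environment instead of from HEAD. Full-length hashes are also a safer bet than short ones if you’re worried about ambiguity.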

When you make the PR, the head of that branch is used to create a build artefact. Let’s say the hash is 2f4e702, so you build my-awesome-code:2f4e702 and push it to a registry. Your test suite should now pull that exact artefact. If it passes all of its tests then your code will be merged. At this point a new merge commit is created: let’s say 152da0a.

Obviously you don’t want that commit, which will have a different hash: you want one of its parents. Assuming the PR landed as a true merge commit (not a squash or a rebase), you can access that through HEAD^2, the second parent of the current commit, which is the head of the branch you merged. Now you’ve got the commit that refers to the immutable object that you know has passed its tests. Grab it from your favourite container registry and you’re ready to deploy the immutable, known-good artefact into production.
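As a sketch of that lookup, assuming you’re sitting on the merge commit and the image was tagged with the short hash of the branch head: tested_commit, pull_tested_image and DRY_RUN are names I’ve invented for the example, and my-awesome-code is the example image name from above.

```shell
#!/usr/bin/env bash
# Sketch only: tested_commit, pull_tested_image and DRY_RUN are hypothetical names.

# Print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Resolve the short hash of the second parent of a merge commit
# (defaults to HEAD). Fails if there is no second parent, which is
# what you would see after a squash or rebase merge.
tested_commit() {
  git rev-parse --short --verify "${1:-HEAD}^2"
}

# Pull the image that was built and tested for the merged branch head.
pull_tested_image() {
  local sha
  sha="$(tested_commit "${2:-HEAD}")" || return 1
  run docker pull "$1:$sha"
}

# Usage, on the main branch right after the merge:
#   pull_tested_image my-awesome-code
```

The --verify flag is there so the script stops loudly when HEAD isn’t a merge commit, rather than quietly deploying something that was never tested.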

Right now this is my best effort. Can you do better? Make improvements? Congratulations! You might have learned something. But more than that, you’ve made a friend for life in me.

Caffeinated Punctuation

Words from my life
