Craig Andrews

Always Use Docker Image Digests

Docker image digests are unique, immutable identifiers for container images. This means that two images with different contents will have different digests, even if they have the same name and tag. When you pull an image by its digest, you are guaranteed to get the exact same image every time, regardless of when or by whom it was pushed to the registry. This reproducible, consistent, guaranteed behavior provides significant benefits to security, reliability, and performance. Therefore, docker image digests should always be used – never reference a docker image without a digest.

Improved Security

Digests address the integrity portion of the CIA Triad by providing a unique, immutable identifier for an image, computed using the SHA-256 cryptographic hash algorithm. In other words, if the image changes for any reason, the digest also changes, because finding two different images with the same SHA-256 digest is computationally infeasible. For example, if a malicious actor changes a file within an image, the digest will change. Even the smallest of changes, such as adjusting a file permission on a single file, results in a different digest. This property can be used to mitigate supply chain security concerns.
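To make the "smallest of changes" point concrete, here is a minimal sketch using Python's standard hashlib module. It hashes two made-up byte strings that differ by a single byte. Real image digests are computed over the image manifest rather than over a single file, so treat this only as an illustration of the SHA-256 property, not of how a registry works.

```python
import hashlib


def sha256_digest(content: bytes) -> str:
    """Return a digest in the same 'sha256:<hex>' form used by image digests."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


# Two made-up file contents that differ by exactly one byte.
original = b"#!/bin/sh\necho hello\n"
tampered = b"#!/bin/sh\necho hellp\n"

original_digest = sha256_digest(original)
tampered_digest = sha256_digest(tampered)

print(original_digest)
print(tampered_digest)
print("digests match:", original_digest == tampered_digest)
```

Hashing the same bytes again always produces the same digest, which is what makes a digest usable as a stable identifier.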

At Build Time

Without digest pinning, images can change. To demonstrate how that can happen, let’s use a very simple Dockerfile that doesn’t use digest pinning:

FROM python:3.11.3-alpine

RUN echo 'print ("Hello World")' > hello.py

CMD ["python", "hello.py"]

Build and run it, and it works great:

docker build . -t hi

docker run -it hi

Since the python version is specified (3.11.3 is in the image tag), and the distribution (alpine) is also specified, one would think this tag would be safe and reliable to use.

However, recently, Alpine 3.19-based python docker images started being published, causing the python:3.11.3-alpine tag to point to an image based on Alpine 3.19. In this case, the image upon which our docker image is based changed in a very significant way, and since the Dockerfile didn’t change, there is no evidence in source control of when, how, or why this change happened. If this image was built on May 9, it uses Alpine 3.18; if it was built on May 11, it may use Alpine 3.18 or 3.19 depending on the state of the docker image caches used by the system building the image.

Ineffective (or in this case, non-existent) change control is a significant security concern, introducing risks such as unexpected downtime, ineffective testing due to a lack of understanding of what changed, and an inability to respond to incidents because the project cannot be built. Imagine having to do a high-priority hotfix that requires a simple, one-line change, but for some reason experiencing unrelated problems, and eventually, after many hours, tracking them down to the base image being unexpectedly different. These concerns are key motivators for reproducible builds, and docker digest pinning is a key step toward achieving reproducibility. I discussed the importance of reproducible builds specifically with regard to Java in another article.

Another article which covers this problem is Mend’s Overcoming Docker’s Mutable Image Tags. It discusses a case where a mistake in a yarn release broke many node applications:

yarn support was broken in all latest Node.js Docker images due to a mistake in a Dockerfile refactor. The Node.js version hadn’t changed, so the image tags also remained unchanged, however the new image pushed for each tag was broken. This meant that Docker images that had previously worked (e.g. node:9 or node:8.10.0-alpine) suddenly stopped working the next time somebody or some machine such as CI pulled them.

Specific Tags Are Not Enough

In this specific case, a better image reference could have been used that also specifies the Alpine version: python:3.11.3-alpine3.18. However, using an insufficiently specific image tag is a very easy mistake to make. And not all images publish such highly specific tags as python does, so being so specific isn’t even always an option. Even when specific tags are an option and are used, image maintainers are not perfect and will sometimes make mistakes, publishing an image that surprisingly causes you trouble.

Using digest pinning, the Dockerfile above becomes:

FROM python:3.11.3-alpine@sha256:06a3f7b12a56ed051612416d3083814e145afc7fc3fc52a67aeffa348a20383c

RUN echo 'print ("Hello World")' > hello.py

CMD ["python", "hello.py"]

That digest is for the Alpine 3.18-based image. If the Alpine 3.19-based one should be used, the digest could be changed to sha256:4e8e9a59bf1b3ca8e030244bc5f801f23e41e37971907371da21191312087a07.

That change would go through change control, such as merge requests, peer reviews, automated security scans, approval processes, etc. Even if there are no such processes, at least there will be a record in source control that the change happened, making debugging and incident response vastly easier.

Moving on from accidental problems, let’s explore intentional ones in the form of supply chain attacks. What if a malicious actor creates a harmful version of the python:3.11.3-alpine image and manages to publish it to docker hub? Perhaps they managed to hijack the python project’s docker hub publishing credentials, or maybe they even compromised docker hub’s infrastructure. In this scenario, users who pull python:3.11.3-alpine will start getting the malicious image.

However, users who use digest pinning will not suddenly switch to this malicious image; they’ll continue using the same image as before until they intentionally change. python:3.11.3-alpine@sha256:06a3f7b12a56ed051612416d3083814e145afc7fc3fc52a67aeffa348a20383c will still point to the same, non-compromised docker image as it always has, regardless of how compromised the docker registry becomes. As before, if the project notices a new image is available, they will have control over when they use it, instead of being at the mercy of unpredictable factors such as when images are published or what state caches are in.
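The difference between a mutable tag and a content-addressed digest can be sketched with a toy, in-memory model. ToyRegistry below is entirely hypothetical (it is not Docker Hub's or any real registry's API) and the image contents are placeholder byte strings. It only assumes what was described above: a tag is a pointer that a push can move, while a digest is derived from the content itself.

```python
import hashlib


class ToyRegistry:
    """Hypothetical in-memory registry: tags are mutable, digests are content addresses."""

    def __init__(self):
        self._blobs = {}  # digest -> image content
        self._tags = {}   # "name:tag" -> digest

    def push(self, name_and_tag, content):
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        self._blobs[digest] = content
        self._tags[name_and_tag] = digest  # pushing again silently moves the tag
        return digest

    def pull(self, reference):
        if "@" in reference:
            # "name:tag@sha256:..." -> the digest decides; the tag is ignored.
            digest = reference.split("@", 1)[1]
            content = self._blobs[digest]
            # The client re-hashes what it received and rejects any mismatch.
            if "sha256:" + hashlib.sha256(content).hexdigest() != digest:
                raise ValueError("content does not match requested digest")
            return content
        return self._blobs[self._tags[reference]]


registry = ToyRegistry()
trusted_digest = registry.push("python:3.11.3-alpine", b"trusted image contents")
pinned_reference = "python:3.11.3-alpine@" + trusted_digest

# An attacker with publishing credentials overwrites the tag.
registry.push("python:3.11.3-alpine", b"malicious image contents")

by_tag = registry.pull("python:3.11.3-alpine")
by_digest = registry.pull(pinned_reference)
print(by_tag)     # b'malicious image contents'
print(by_digest)  # b'trusted image contents'
```

In this model, even a registry that served different bytes for the pinned reference would be caught by the re-hash check, which is the property the digest-pinned user relies on.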

The article Remind Engineering: Docker images digests provides another example of this supply chain attack scenario and also indicates that digests mitigate it.

Improved Security at Run Time

The same concerns apply at run time as at build time. Consider a Kubernetes Helm chart that references the nginx docker image to run a static web server in a container, using 1.25.0-alpine as the image reference. The Alpine version could change unexpectedly, or a supply chain vulnerability could allow a malicious actor to publish a dangerous image with that tag.

Compliance, Standards, and Industry Practice

Nothing operates in a vacuum – we all stand on the shoulders of giants, all members of a global community. Therefore, it’s important to consider the opinions of other authorities with regard to this topic.

GitHub’s Demonstrating end-to-end traceability with pull requests doesn’t directly mention implementation details such as docker images, but it does discuss the importance of change traceability, which is what using image digests is all about. The article also cites industry standards, including PCI/DSS, ISO 27001 (A.12.1.2, A.14.2.2, and A.14.2.3), and COBIT. Having software change without any records or change control process, as happens when docker image digests are not used, does not comply with those standards. And in many regulated environments, compliance with these standards is not optional.

GitHub Actions advises users of its services to Pin actions to a full-length commit SHA:

Pinning an action to a full length commit SHA is currently the only way to use an action as an immutable release. Pinning to a particular SHA helps mitigate the risk of a bad actor adding a backdoor to the action’s repository

Don’t use dynamic versions for your dependencies doesn’t discuss docker images specifically, but does cover the general topic of dependency versions. Here’s a particularly salient excerpt:

Bugfix updates should be safe and absorbed as quickly as possible.
In other words, if you’re using 23.0.+, it should never break because the patch version should only update for bug fixes.
There are two problems with this idea. First, you have to trust that the library is using versions to denote patches correctly. And second, you have to trust that the developer never writes a bug into a patch version. Two big ifs; not worth the risk.

IBM’s Choosing image tags or digests discusses the use of image digests:

Image tag mutability is useful and convenient in many scenarios, but it can also be dangerous if you are not aware and prepared to manage it.

Prisma Cloud’s Checkov/Bridgecrew has the Ensure images are selected using a digest rule that alerts users when an image reference is found that is missing a digest.
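To give a feel for what such a check does, here is a minimal sketch of a Dockerfile scan written in Python. It is not Checkov's implementation; the function name unpinned_base_images and the regular expression are my own assumptions, and it handles only straightforward FROM lines (a base image supplied through an ARG variable, for example, would simply be reported as unpinned).

```python
import re

# Matches "FROM [--flag=value ...] <image> [AS <stage>]".
FROM_LINE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<image>\S+)(?:\s+AS\s+(?P<stage>\S+))?",
    re.IGNORECASE,
)


def unpinned_base_images(dockerfile_text):
    """Return the base image references that do not include an '@sha256:' digest."""
    stages = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_LINE.match(line)
        if match is None:
            continue
        image = match.group("image")
        # "scratch" and earlier build stages are not pulled from a registry.
        is_local = image == "scratch" or image.lower() in stages
        if not is_local and "@sha256:" not in image:
            unpinned.append(image)
        if match.group("stage"):
            stages.add(match.group("stage").lower())
    return unpinned


tag_only = 'FROM python:3.11.3-alpine\nCMD ["python", "hello.py"]\n'
pinned = (
    "FROM python:3.11.3-alpine@sha256:"
    "06a3f7b12a56ed051612416d3083814e145afc7fc3fc52a67aeffa348a20383c\n"
)

print(unpinned_base_images(tag_only))  # ['python:3.11.3-alpine']
print(unpinned_base_images(pinned))    # []
```

A check like this could run in CI and fail the build whenever the returned list is not empty.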

The common thread across these standards documents and publications from industry giants is that having software change without any record or process, regardless of the mechanism (including through using mutable docker tags), is bad practice.

Reliability and Performance

In addition to security, reliability and performance can also be improved by using docker image digests.

Prisma Cloud’s Checkov/Bridgecrew’s Ensure images are selected using a digest rule gives one example of how digests improve reliability:

Pulling using a digest allows you to “pin” an image to that version, and guarantee that the image you’re using is always the same. Digests also prevent race-conditions; if a new image is pushed while a deploy is in progress, different nodes may be pulling the images at different times, so some nodes have the new image, and some have the old one.

If a digest is not specified, the client can choose to query the registry to see if the requested tag has been updated. Kubernetes’ documentation for container images (to which I submitted the improvement Clarify image pull policy regarding digests) provides a warning and a recommendation for digests to be used:

When using image tags, if the image registry were to change the code that the tag on that image represents, you might end up with a mix of Pods running the old and new code. An image digest uniquely identifies a specific version of the image, so Kubernetes runs the same code every time it starts a container with that image name and digest specified. Specifying an image by digest fixes the code that you run so that a change at the registry cannot lead to that mix of versions.

Another effect of not using digests is that when clients do reach out to the registry, the registry may be unavailable. For instance, the client may be offline, there could be a networking issue between the client and the registry, or the registry could be down. In any case, if the client tries to see if the requested tag has changed and cannot reach the registry, it may fail with an error. If a digest were used instead of just a tag, the client would not need to reach out to the registry (assuming the client already has the image).

Finally, reaching out to the registry takes time. When digests are used, fewer requests to the registry need to be made, reducing image pull time. In normal operations, the time saved may be on the order of milliseconds, but if the registry is down, even a client that can handle the registry being unavailable may still take many seconds to time out.
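The decision described in this section can be sketched as a small function. This is a simplified, hypothetical model of a client, not the actual logic of Docker or the kubelet. The function name, the cached_tags structure, and the recheck_tags flag are assumptions made for illustration, and the digest is a made-up placeholder rather than a real nginx digest.

```python
def must_contact_registry(reference, cached_tags, recheck_tags=True):
    """Return True if this hypothetical client needs the registry to start a container.

    cached_tags maps locally cached "name:tag" references to the digest
    each tag pointed to when it was pulled.
    """
    if "@" in reference:
        digest = reference.split("@", 1)[1]
        # A digest is a content address: a local copy with this digest is the right image.
        return digest not in cached_tags.values()
    if reference not in cached_tags:
        return True  # nothing cached for this tag, so it has to be pulled
    # The tag may have moved since it was cached; only the registry knows.
    return recheck_tags


PLACEHOLDER_DIGEST = "sha256:" + "0" * 64  # made-up value for illustration only
cache = {"nginx:1.25.0-alpine": PLACEHOLDER_DIGEST}

print(must_contact_registry("nginx:1.25.0-alpine", cache))                        # True
print(must_contact_registry("nginx:1.25.0-alpine@" + PLACEHOLDER_DIGEST, cache))  # False
```

In this model, the pinned reference is the only one that lets a client with a warm cache start a container without any dependency on registry availability while still being certain which image it runs.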

Arguments Against Digest Pinning

Sonar Formerly Advised Against the Use of Docker Image Digests

SonarQube has a rule which discourages the use of digests in Dockerfiles: S6497: Using a container image based on its digest is security-sensitive. The argument provided in the rule documentation is that using a digest “prevents the resulting container from being updated or patched in order to remove vulnerabilities or significant bugs.” I believe I thoroughly debunked that argument earlier in this article. I also raised these concerns about this rule in the Sonar Community with my post at “S6497: Pulling an image based on its digest is security-sensitive” is harmful to security. After a thoughtful, productive exchange, Sonar decided to remove rule S6497 from the default Sonar configuration.

Keeping Digests Up-to-Date is Challenging

Sonar’s rule expresses the concern that the use of digests prevents automatic updating, moving the onus of keeping image references up to date onto the developer. Dependency updating is not a new problem: developers have been keeping other types of dependencies up to date for many years. For example, Maven, Gradle, NPM, .NET, Go, Rust, and (more or less) every other build system require (or at least strongly recommend) using exact, reproducible dependency versions. Therefore, tools exist to solve this problem.

Mend Renovate is a tool that keeps dependencies up to date. It continuously scans projects, opening pull requests as it finds updates for dependencies. For instance, when it finds a docker image reference, it checks whether the tag has been updated to point to a new digest, and even whether a newer versioned tag exists. It supports many dependency management systems (including helm charts), and has a way to provide regular expressions so Renovate can find and update dependencies (such as docker image references) that aren’t expressed in supported dependency management systems. Renovate supports github.com, gitlab.com, GitHub Enterprise, and GitLab self-hosted. I previously discussed this topic in The How and Why Automating Dependency Updates.
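The core of a digest update is mechanical, which is why it automates well. The sketch below is not how Renovate is implemented. The refresh_pins function and its regular expression are hypothetical, the current_digests mapping stands in for a lookup against the registry, and only simple name:tag references are handled. The two digests are the ones used earlier in this article.

```python
import re

# Matches simple "name:tag@sha256:<64 hex chars>" references.
PINNED_REFERENCE = re.compile(
    r"(?P<name>[\w./-]+:[\w.-]+)@(?P<digest>sha256:[0-9a-f]{64})"
)


def refresh_pins(text, current_digests):
    """Rewrite each pinned reference to the digest its tag currently points to.

    current_digests maps "name:tag" to the digest reported by the registry;
    references whose tag is not in the mapping are left unchanged.
    """
    def replace(match):
        name = match.group("name")
        return name + "@" + current_digests.get(name, match.group("digest"))

    return PINNED_REFERENCE.sub(replace, text)


OLD = "sha256:06a3f7b12a56ed051612416d3083814e145afc7fc3fc52a67aeffa348a20383c"
NEW = "sha256:4e8e9a59bf1b3ca8e030244bc5f801f23e41e37971907371da21191312087a07"

dockerfile = "FROM python:3.11.3-alpine@" + OLD + "\n"
updated = refresh_pins(dockerfile, {"python:3.11.3-alpine": NEW})
print(updated)
```

The resulting one-line diff is exactly the kind of change that then goes through a pull request, review, and automated scans before it is merged.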

Dependabot is a similar tool provided by GitHub.

I reported that the Iron Bank UI should include image digest in all image references, which started an interesting, healthy debate. I believe this article and my response in the issue debunk all of their arguments; unfortunately, Iron Bank’s final response came down to what I interpret as their belief that it’s too much work, and they terminated further discussion with an “escalate to management” deflection.

Wrapping Up

Image digests are a powerful tool for ensuring the consistency of your container images, providing an easy way to improve security, reliability, and performance. The software industry learned many years ago that dependency versions must be locked; docker images are just another type of dependency that must be treated similarly.

If you’re aware of a validation tool that doesn’t report missing digests or a project that doesn’t use digests, then please report such issues to improve security, reliability, and performance for all users – and please let me know so I can follow along.

Always Use Docker Image Digests by Craig Andrews is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
