Chapter 16 Container Images
This chapter dives deeper into container images. Images enable quick sharing of Shiny applications by bundling all the configuration and dependencies required to run your Shiny application.
Images are shared through container registries, and only need to be built once, saving time from installing dependencies and configuring the Shiny application each time it is run on a new computer.
16.1 Working with Existing Images
Let’s learn how to work with an existing image. Such an image is stored in a container registry where we can pull it from if we know its name.
16.1.2 Pulling an Image
You can use the docker pull <image-name> command to pull an image from
a public registry. For example docker pull ubuntu:24.04 will pull the
24.04 version of the “official” Ubuntu image from the Docker Hub.
docker pull rocker/r-base:4.4.1 will pull the image with R version 4.4.1.
Pull the R Shiny version of the Old Faithful app as:
docker pull ghcr.io/h10y/faithful/r-shiny
# Using default tag: latest
# latest: Pulling from h10y/faithful/r-shiny
# Digest: sha256:12e[...]4ea

You can see from the messages that the latest tag was applied because
we did not specify the tag. We can also see the SHA256 digest, that is
a unique and immutable identifier. The name can change, or multiple names can
refer to the same image (i.e. a set of layers and their manifest). But the
image digest will be the same. To “pin” the exact version, you can use the
<image-name>@sha256:12e[...]4ea pattern (use the actual digest copied from
your screen without the [...]):
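For the image pulled above, pinning would look like the following (the digest is abbreviated here; paste the full value from your own output):

```shell
docker pull ghcr.io/h10y/faithful/r-shiny@sha256:12e[...]4ea
```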
To pull all images from a repository, you can use the --all-tags flag:
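For the Old Faithful image, that is:

```shell
docker pull --all-tags ghcr.io/h10y/faithful/r-shiny
```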
This will pull not only the latest tag, but also the image tagged as main,
named after the Git branch. Use the docker images command to list the
images.
16.1.3 Docker Login
You don’t need to authenticate for public images, but if you are trying to pull a private image from a private repository, you need to log into the container registry. Such private repositories are common and are available on Docker Hub and on the GitHub and GitLab container registries. More on the different container registries later.
To log in to the GitHub container registry, use:
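Using the GHCR host name:

```shell
docker login ghcr.io
```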
This command will ask for your credentials interactively. If you want, you can provide your username and password. However, it is usually recommended to use an access token instead of your password, because a token can have restricted scopes, e.g. read-only access to the container registry, which is a lot more secure. You can also set expiry dates and revoke these tokens at any time without having to change login passwords elsewhere.
Let’s say that you saved your GitHub token value in a file ~/my_token.txt in
the root of your home folder (~). You can pass the PAT value to the
docker login command via the standard input as:
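The command might look like this (using the token file path assumed above):

```shell
cat ~/my_token.txt | docker login ghcr.io -u <username> --password-stdin
```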
where <username> is your GitHub username.
16.1.4 Running a Container
The next command is docker run which runs a command in a new container.
It pulls the image if needed before starting the container.
Try the following command. It will pull the latest image for the Python build of the Old Faithful example app, then it will start a new container:
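The image name here follows the same naming scheme as the R Shiny image we pulled earlier; the py-shiny path is an assumption based on that pattern:

```shell
docker run -p 8080:3838 ghcr.io/h10y/faithful/py-shiny
```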
The -p is a shorthand for --publish that instructs Docker to
publish a container’s port to the host port. In our example, 3838 is the
container’s port which is mapped to port 8080 of the host machine. As a
result, you can visit http://127.0.0.1:8080 in your browser to see the
Python Shiny app. Hit CTRL+C in the terminal to stop the container. We will
learn about container ports in a bit, but in essence it is just a channel that
is used to send information back and forth.
16.2 Building a New Image
So far you saw how to use the basic Docker commands to pull and run images. Now you’ll build a Docker image by recreating the Old Faithful Shiny app that we worked with before.
In our examples, we will use the following setup: a file named Dockerfile
sits next to a folder named app, and the Shiny app files like
app.R or app.py are in this folder. This setup is convenient
because we can copy all the files from the app folder without
having to worry about copying files that should not be there.
You can follow along the examples by downloading or cloning the GitHub repository
with git clone https://github.com/h10y/faithful.git. All the different builds
of the Old Faithful app from Chapter 5
will have a Dockerfile and instructions in the README.md files within each
folder.
16.2.1 R for Shiny
For our R Shiny example within the r-shiny folder,
this is what is inside the Dockerfile:
FROM rocker/r2u:24.04
RUN R -q -e "install.packages('shiny')"
RUN groupadd app && useradd -g app app
WORKDIR /home/app
COPY app .
RUN chown app:app -R /home/app
USER app
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp(host='0.0.0.0', port=3838)"]

We will explain the Dockerfile instructions in the next section.
For now, you can use the docker build command to build the image from the
Dockerfile. You will have to be in the same directory as the Dockerfile;
this place is what we call the build context.
This is what the . at the end of the command stands for:
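A minimal build command, using the image name and tag discussed below:

```shell
docker build -t r-shiny-test:v1 .
```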
The context here specifies the current directory (.), but it can be
any relative or absolute filepath. Files and directories inside the
context directory are available to the builder, so it can load them when needed.
You can use a .dockerignore file to list files and directories that should
be ignored within the build context. It is similar to the .gitignore file.
The instructions are taken from the Dockerfile at the root of the build
context. If you want to specify a different file, do so by providing the
path to the file using the -f (or --file) option as docker build -f Dockerfile2 ..
The -t argument (same as --tag) is followed by the image name
(r-shiny-test) and the tag (v1). If you do not specify the image
name/tag at image build (i.e. docker build .), Docker will not tag the image
but it will have an image ID that you can use later to tag the image with
docker tag <image-id> r-shiny-test:v1.
You can apply multiple tags as:
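For example, repeating the -t option once per tag:

```shell
docker build -t r-shiny-test:v1 -t r-shiny-test:latest .
```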
16.2.2 Buildx and BuildKit
While the builder is running, you’ll see lots of messages printed
as Docker goes through the instructions from the Dockerfile.
As of Docker Engine 23.0 and Docker Desktop 4.19, Buildx is the default build
client and user interface. Buildx brings extended build capabilities with
BuildKit. BuildKit is the server that handles the build execution, e.g.
it communicates with registries, instructs the Docker Engine and accesses
the local file system. You can enable the use of BuildKit on older Docker
systems by setting the environment variable DOCKER_BUILDKIT=1.
The Buildx output is nicer and it provides you with timings for every step
of your Dockerfile:
[+] Building 32.4s (12/12) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 282B 0.0s
=> [internal] load metadata for docker.io/rocker/r2u:24.04 1.2s
=> [auth] rocker/r2u:pull token for registry-1.docker.io 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [1/6] FROM docker.io/rocker/r2u:24.04@sha256:f327[...]dd73 9.2s
=> => resolve docker.io/rocker/r2u:24.04@sha256:f327[...]dd73 0.0s
[...]
=> [internal] load build context 0.0s
=> => transferring context: 845B 0.0s
=> [2/6] RUN groupadd app && useradd -g app app 0.7s
=> [3/6] RUN R -q -e "install.packages('shiny')" 20.9s
=> [4/6] WORKDIR /home/app 0.0s
=> [5/6] COPY app . 0.0s
=> [6/6] RUN chown app:app -R /home/app 0.1s
=> exporting to image 0.3s
=> => exporting layers 0.3s
=> => writing image sha256:4d10[...]bab7 0.0s
=> => naming to docker.io/library/r-shiny:v1 0.0s
Sometimes you want to inspect the full output rather than the collapsed
view. Add the --progress=plain option to the build command to see all of the
output. This comes in handy when troubleshooting the build.
BuildKit also offers other nice features, for example setting the target
platform(s) for the build via the --platform option.
The default value is the platform of the BuildKit daemon where the build runs,
i.e. your laptop or a server. This can be important for Mac OS X users
on Apple Silicon (M1 and above), because the default ARM64 build will
have poor performance or might fail when run on AMD64 machines.
Use --platform=linux/amd64 to build the image for the AMD64 architecture.
You can also build for multiple architectures at once with
docker build --platform linux/amd64,linux/arm64 ..
See 3.1.4 for enabling virtualization on Mac OS X
so that you can build for multiple platforms.
16.2.3 Inspecting the Image
The output of the build is an image that has a SHA256 hash that can be used
as a unique identifier. The image is made up of image layers. These layers are
created by the instructions from the Dockerfile. If you run the build command
again you will notice that instead of 32 seconds, it will take almost no time
to build the image. This is because the layers are cached by default and Docker
smartly evaluates which instructions and files have changed since the last
build. Sometimes the cache gets tangled, or you just want to make sure that
the error is not a caching issue. In this case use the --no-cache flag
with docker build.
You can use the docker history r-shiny:v1 command to see how the image was
built and you can see the sizes for every layer. Intermediate layers
have a size of 0B and these do not contribute to the overall image size.
The layers created 2 hours ago are the layers we created, the layers
created 2 weeks ago are the layers from the parent image rocker/r2u:24.04,
whereas the layers created 2 months ago are the official ubuntu:24.04 image
layers that form the parent image of the rocker/r2u:24.04 one:
IMAGE CREATED CREATED BY SIZE
4d[...]52 2 hours ago CMD ["R" "-e" "shiny::runApp(host='0.0.0.0', 0B
<missing> 2 hours ago EXPOSE map[3838/tcp:{}] 0B
<missing> 2 hours ago USER app 0B
<missing> 2 hours ago RUN /bin/sh -c chown app:app -R /home/app # 780B
<missing> 2 hours ago COPY app . # buildkit 780B
<missing> 2 hours ago WORKDIR /home/app 0B
<missing> 2 hours ago RUN /bin/sh -c R -q -e "install.packages('sh 109MB
<missing> 2 hours ago RUN /bin/sh -c groupadd app && useradd -g ap 5.14kB
<missing> 2 weeks ago RUN /bin/sh -c apt-get update && apt 642MB
<missing> 2 weeks ago ENV TZ=UTC 0B
<missing> 2 weeks ago ENV DEBIAN_FRONTEND=noninteractive 0B
<missing> 2 weeks ago ENV LANG=en_US.UTF-8 0B
<missing> 2 weeks ago ENV LC_ALL=en_US.UTF-8 0B
<missing> 2 weeks ago RUN /bin/sh -c useradd -s /bin/bash -m docke 81.6MB
<missing> 2 weeks ago LABEL org.label-schema.license=GPL-2.0 org.l 0B
<missing> 2 months ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 2 months ago /bin/sh -c #(nop) ADD file:ac9d5a9d5b9b1217 76.2MB
<missing> 2 months ago /bin/sh -c #(nop) LABEL org.opencontainers. 0B
<missing> 2 months ago /bin/sh -c #(nop) LABEL org.opencontainers. 0B
<missing> 2 months ago /bin/sh -c #(nop) ARG LAUNCHPAD_BUILD_ARCH 0B
<missing> 2 months ago /bin/sh -c #(nop) ARG RELEASE 0B
The docker inspect r-shiny:v1 command returns a long JSON output with the
metadata of the image, including its SHA256 hash.
Here is a greatly simplified output:
[
{
"Id": "sha256:4d10[...]bab7",
"RepoTags": ["r-shiny:v1"],
"Created": "2024-07-05T04:59:01.123398172Z",
"Config": {
"User": "app",
"ExposedPorts": {"3838/tcp": {}},
"Cmd": ["R","-e",
"shiny::runApp(host='0.0.0.0', port=3838)"],
"Volumes": null,
"WorkingDir": "/home/app",
"Entrypoint": null
},
"Architecture": "amd64",
"Os": "linux",
"Size": 909132976,
"Metadata": {
"LastTagTime": "2024-07-05T06:20:22.2764725Z"
}
}
]
Once the docker image is built, you can run the container to make sure the app is working as expected:
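Using the r-shiny:v1 name seen in the build output above, and mapping the container port to the host as before:

```shell
docker run -p 8080:3838 r-shiny:v1
```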
16.2.4 Python for Shiny
You can find the Python for Shiny example in the py-shiny folder of the
Old Faithful example repository. The Dockerfile for the Python version looks
like this:
FROM python:3.9
COPY app/requirements.txt .
RUN pip install --no-cache-dir --upgrade -r requirements.txt
RUN groupadd app && useradd -g app app
WORKDIR /home/app
COPY app .
RUN chown app:app -R /home/app
USER app
EXPOSE 3838
RUN mkdir .config
ENV MPLCONFIGDIR=/home/app/.config
ENV HOME=/home/app
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "3838"]

We’ll explain each line shortly. To build and check the Docker image, use the following commands:
export DOCKER_DEFAULT_PLATFORM=linux/amd64
docker build -t py-shiny:v1 .
docker run -p 8080:3838 py-shiny:v1

The DOCKER_DEFAULT_PLATFORM environment variable is not strictly necessary,
but it can save you some headaches on Mac OS X when the platform of the parent
image does not match the local ARM64 architecture of your Apple Silicon machine.
16.3 Managing Images
There are a few commands that you need to know to manage your Docker images in the absence of the Docker Desktop graphical user interface. This will pay off later when you have no such luxuries on a server.
To list the Docker images, use the docker images command. It will give you
a quick summary of the images:
REPOSITORY TAG IMAGE ID CREATED SIZE
py-shiny v1 ed11a2980c07 5 seconds ago 1.24GB
r-shiny      v1    4d10f42d6a52   About an hour ago   909MB

Size is the space taken up by the image and all its parent images.
You can filter the output, for example docker images --filter=reference="py-*"
will give you images whose name starts with py-, whereas
docker images --filter=reference="*:v1" will list images that are tagged with
v1.
Use the docker rmi <image-name-or-id> to remove an image based on its name or
the image ID.
The docker system df command will give you a concise summary of disk usage
by the Docker daemon including images, containers, and volumes:
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 2 1 1.457GB 1.355GB (98%)
Containers 1 1 0B 0B
Local Volumes 0 0 0B 0B
Build Cache     117       0         3.005GB   3.005GB

If you build images during development while keeping the image name and tag the
same you will end up with “dangling” images that are untagged and are not used
any longer. Dangling images can accumulate over time and can fill up the
available space that Docker Desktop is allocating for images.
Use docker system prune to clean up these dangling images.
The command docker system prune --all will remove all unused images and containers.
The Docker Desktop uses a finite amount of disk space that can fill up. Do the cleanup or go to the Docker Desktop settings and under Resources you should be able to change the virtual disk limit. You can check the RAM, CPU, and disk usage by looking at the bottom of the Docker Desktop window.
16.4 Sharing Images
As we saw, Docker images are just compressed files linked by metadata.
You should be able to copy these files and move them around.
The docker save command lets you save an image to a compressed tar file:
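For example, saving the r-shiny image built earlier (the output file name is an assumption; gzip provides the compression):

```shell
docker save r-shiny:v1 | gzip > r-shiny.tar.gz
```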
Next, you take this tar file, copy it to another server and load it with:
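Assuming the file name used above:

```shell
docker load -i r-shiny.tar.gz
```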
This restores both the image and the tags.
Now imagine that you are managing more than two machines, or you want to share the Docker image with others so that they can use it or to serve as a parent image. The save/copy/load workflow becomes cumbersome quickly. In this case, using a registry might be a much better idea. There are many options to choose from, and you can even host your own registry.
16.4.1 Pushing Images
Let’s tag the r-shiny image so that it has a host defined:
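For example, tagging it for the GitHub Container Registry repository used earlier:

```shell
docker tag r-shiny:v1 ghcr.io/h10y/faithful/r-shiny:v1
```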
Now we can push the locally built Docker image to a container registry:
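Using the tag applied above:

```shell
docker push ghcr.io/h10y/faithful/r-shiny:v1
```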
Note that this command will not work on your machine, because you do not
have write access to the ghcr.io/h10y/faithful repository.
As an exercise, pick an image name that lets you push to your own
personal Docker Hub account.
The image name should start with the registry host unless you are pushing
to Docker Hub. When the image tag is not specified, Docker will treat
the new image as :latest automatically.
16.4.2 Docker Registries
A Docker registry stores Docker images. This is where we push images to and pull
images from. Docker Hub
is a public registry and Docker is configured to look for images on Docker Hub by
default. Docker Hub is a service provided by Docker for finding and sharing
container images. The canonical host name for Docker Hub is docker.io. This
is the default registry when you don’t specify a registry host as part of the
image name.
There are many other registries out there besides Docker Hub. Here is a non-exhaustive list of options.
The GitHub Container Registry (GHCR) is available as part of GitHub Packages
for free and paid plans, even for private repositories under the free
plan. This registry requires no authentication for public images, otherwise
you have to authenticate using your GitHub token. The visibility of the
images inherits the repository visibility but can be changed by the owner.
The host name for GHCR is ghcr.io.
An alternative to GitHub is GitLab (host name registry.gitlab.com),
that has provided registry support for its free (public and private) repositories
long before GitHub. The registry is tightly integrated with GitLab’s CI/CD
pipelines. This registry also needs login with a token for private images.
Heroku is a platform provider that also comes with a Docker
registry (host name registry.heroku.com) to which Docker-based deployments
push their images.
Every major cloud provider offers a Docker container registry that is integrated with their other offerings. Latency should be minimal due to network proximity to the servers:
- Amazon Elastic Container Registry
- Azure Container Registry
- Google Container Registry
- DigitalOcean Container Registry
Other common alternatives for container registries include the JFrog Container Registry, Harbor, and Scaleway.
Although these services are called “container registries”, strictly speaking they store container images.
16.4.3 Log In to a Registry
When you work with private registries or private images, you need to log
in with the docker login command. For Docker Hub, just type
docker login. For all other registries, type in the registry URL as
well, e.g. docker login ghcr.io.
The Docker CLI then will prompt you for your username and password (or access token).
You can log in programmatically by providing your username and the password through standard input from a file:
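For example, for the GitHub Container Registry (the file path is explained below):

```shell
cat ~/my_password.txt | docker login ghcr.io -u <username> --password-stdin
```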
The my_password.txt in this example is a simple text file with the token
inside, and it can be found in the root of your home folder (~). Change the
file path and file name as needed.
You can also use an environment variable to store your token value that you can pass to the login command as:
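A sketch of this approach; CR_PAT is an assumed variable name, and note the leading spaces before the export statement:

```shell
  export CR_PAT=<your-github-token>
echo $CR_PAT | docker login ghcr.io -u <username> --password-stdin
```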
Notice the white space before the export statement: in many shells, commands
that start with a space are not saved to the shell history (e.g. when
HISTCONTROL includes ignorespace in Bash). The history allows you to recall
previous commands by pressing the up arrow key. The shell history is really
just a text file, so copy-pasting secrets into the terminal will leave a
trace. Use this trick for sensitive information.
With one of these approaches you can log into any public or private repository
for which you have credentials. The credentials will be stored
locally in $HOME/.docker/config.json on Linux and Mac or in
%USERPROFILE%/.docker/config.json on Windows. After login, there is no
need to re-authenticate until you log out with docker logout.
It is always a good idea to use a token instead of your password. Tokens can have limited scope (i.e. only for pulling images), and can be revoked at any time without it impacting other areas of your life.
Note that on Linux docker login requires sudo or root privileges unless your user is a member of the docker group.
16.4.4 Local Registry
You might not want the Docker images to leave your computer because you need an air gapped environment, or you are setting up a registry within your virtual private network (VPN). In these situations, you can host your own container registry.
If you want a registry hosted on your machine, just pull the registry image. The next command will pull the registry image, and run the similarly named container in the background on port 5000:
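A typical invocation, using the official registry image with the version 2 tag:

```shell
docker run -d -p 5000:5000 --restart=always --name registry registry:2
```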
Giving a container a name makes it easier to remove the container later; this
way you don’t have to find the container ID. The always restart policy
restarts the container if it stops, but not when it is manually stopped.
The -d flag will start the container in a background process, so you get back
the shell prompt, and you will not see the container log messages.
Tag an image with the host name of your local registry, localhost:5000, and
push the image:
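Using the r-shiny image built earlier as an example:

```shell
docker tag r-shiny:v1 localhost:5000/r-shiny
docker push localhost:5000/r-shiny
```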
To test if it worked, remove the images from your local Docker system.
If you use the -f flag and specify the image ID then the docker rmi
command untags and removes all images that match that ID (get the image ID
from docker images):
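Substitute the image ID from your own docker images output:

```shell
docker rmi -f <image-id>
```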
Now you can pull the image from your local registry:
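Using the name pushed above:

```shell
docker pull localhost:5000/r-shiny
```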
The next command stops and removes the registry container. It is a daemonized
(background) process, so CTRL+C won’t work. The -v option removes the
anonymous volumes associated with the container; the registry container uses
such a volume to store the pushed images:
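Using the container name given above:

```shell
docker container stop registry && docker container rm -v registry
```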
If you want your registry to be accessed over a public network, then you need to think about security and access control. You’ll have to set up transport layer security (TLS) for HTTPS as well as user authentication; these are advanced topics, so we recommend using one of the commercial container registries listed above and controlling access to your images with private repositories.