Enabling GPUs with NVIDIA Docker Container Runtime

Nikola Tomic
Published in AVA Information · 4 min read · Aug 4, 2021

In this post, we will go through one of the easiest ways to enable NVIDIA GPU CUDA capabilities from Docker containers and Docker Compose, to boost AI/ML model performance.

Image Copyright Nvidia

AVA’s Data Science team trains Machine Learning models over large volumes of data on a daily basis. This is a time- and resource-intensive process that requires a lot of computing power.

When it comes to this task, Docker is one of the tools that can make a developer’s life so much easier. For developing, deploying, and running applications in containers, Docker has become a standard requirement for almost all of the projects I work on. Docker containers are platform-agnostic, but also hardware-agnostic. This presents a problem when using specialised hardware such as NVIDIA GPUs, which require kernel modules and user-level libraries to operate. Other aspects, like CPU drivers, are pre-configured, but using GPUs requires some additional setup. Although GPU support in Docker can be enabled in different ways, here we will go through one that is, in my experience, definitely among the easiest.

The tutorial that follows is linked to a previous article we published about the dockerization of Airflow 2.0, which you can use to schedule and build Machine Learning pipelines or any other job/task. Add GPU support on top of that and you get a powerful platform for managing various workflows.

Prerequisites

Your host machine must satisfy the following prerequisites to begin using NVIDIA Container Runtime with Docker. To avoid any unwanted errors, make sure to double-check versions. This setup is tested on a Linux host machine.

Now let’s quickly go through the setup:

1. Docker (version 1.12 or higher)

  • To check the docker version:
docker -v

2. Docker Compose (version 1.19.0 or higher)

  • To check the docker compose version:
docker-compose --version
  • In case you need to upgrade the docker compose version (1.27.4 in this example), run the following commands:
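A sketch of the upgrade, following the install method documented by Docker Compose (downloading the release binary from GitHub); double-check the version and install path against the official instructions before running:

```shell
# Download the docker-compose 1.27.4 release binary for this platform
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
# Make it executable
sudo chmod +x /usr/local/bin/docker-compose
# Confirm the new version
docker-compose --version
```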

3. NVIDIA driver (latest version is preferred, but not always mandatory)

  • To check whether the driver is present, and its version, run:
nvidia-smi
  • If the driver is not available, use the installer from Nvidia’s official driver download site.

The Setup

1. Installing NVIDIA Docker Container Runtime on a host machine

  • First, we have to set up the nvidia-container-runtime repository for your OS. Commands differ depending on the distribution, as described in the official installation instructions; a tip for Linux Mint installation is also available. I guess most readers are using Ubuntu — if so, then run:
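A sketch of the repository setup for Ubuntu, based on NVIDIA's documented instructions for the nvidia-docker repository (verify the exact commands against the official docs for your distribution and version):

```shell
# Detect the distribution identifier, e.g. ubuntu20.04
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
# Add NVIDIA's GPG key
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
# Register the repository for this distribution
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Refresh the package index
sudo apt-get update
```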
  • Next up, we need to install the nvidia-docker2 package and reload the Docker daemon configuration. nvidia-docker is essentially a wrapper around the docker command that transparently provisions a container with the necessary components to execute code on the GPU.
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
  • Now to install NVIDIA Container runtime, simply run:
sudo apt-get install nvidia-container-runtime
  • Finally, to verify that the NVIDIA driver and runtime have installed correctly, run:
sudo nvidia-container-cli --load-kmods info

The output should look similar to the following:

NVRM version:   396.26
CUDA version: 9.2

Device Index: 0
Device Minor: 2
Model: Tesla V100-SXM2-16GB
GPU UUID: GPU-e354d47d-0b3e-4128-74bf-f1583d34af0e
Bus Location: 00000000:00:1b.0
Architecture: 7.0

Running GPU docker containers

docker run -it --runtime=nvidia --shm-size=1g -e NVIDIA_VISIBLE_DEVICES=0 <YOUR_IMAGE_NAME>
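If you don't have an image of your own at hand, one way to sanity-check GPU access is to run nvidia-smi inside one of NVIDIA's public CUDA base images (the image tag here is just an example — pick one compatible with your driver's CUDA version):

```shell
# Run nvidia-smi inside a CUDA base image to confirm the container sees the GPU.
# The tag 11.0-base is an example; choose one matching your driver.
docker run --rm --runtime=nvidia nvidia/cuda:11.0-base nvidia-smi
```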

2. Docker compose setup for a host machine

  • Modify the daemon.json file at /etc/docker/daemon.json. If the file does not exist, it should be created. With no additional config, the file would look like below:
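This is the runtime registration documented by NVIDIA for nvidia-docker2; a minimal daemon.json would be:

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```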
  • Optionally, you can set the Nvidia runtime as the default by adding "default-runtime": "nvidia" to daemon.json, so you never have to worry about forgetting to specify the runtime again.
  • To check if nvidia runtime is present, run:
docker info | grep -i runtime

The output should look something similar to the following, depending on whether you have set Nvidia as the default runtime:

Runtimes: nvidia runc
Default Runtime: runc

3. Docker Compose file setup

  • First, make sure the Docker Compose file format version is at least '2.3', which is required to support specifying a runtime
  • The following runtime and environment variables should be added to any service that needs access to CUDA functionality:
runtime: nvidia
environment:
- NVIDIA_VISIBLE_DEVICES=all
- NVIDIA_DRIVER_CAPABILITIES=all

A snippet of how the docker compose file should look after adding the above:
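A minimal sketch, assuming a hypothetical service named ml-worker (substitute your own service and image names):

```yaml
version: "2.3"
services:
  ml-worker:                  # hypothetical service name
    image: <YOUR_IMAGE_NAME>  # replace with your image
    shm_size: "1gb"
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
```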

Running GPU docker compose

You should now be able to start docker compose as usual, e.g. with docker-compose up.

Additional notes and troubleshooting

Restarting the Docker daemon might be necessary if errors occur:

sudo systemctl daemon-reload
sudo systemctl restart docker

Make sure the Docker container has access to the NVIDIA drivers:

  1. Connect to the container in interactive mode: docker exec -it <container name> sh
  2. Run nvidia-smi; you should see the GPU name, driver version, etc.

For more information and updates from us visit AVA.info

Follow us on LinkedIn & Twitter
