dl-hub

A Dockerised JupyterHub environment for Deep Learning with GPUs

JupyterHub is a customisable, flexible, scalable, portable system for bringing Jupyter notebooks (labs) to groups of users. It gives users access to computing resources (including GPUs!) through a browser without them needing to install, configure or maintain the computing environment.

JupyterHub schematic, from the official documentation.

This repository builds a hub which spawns isolated, dockerised JupyterLab environments with mounted GPUs for deep learning acceleration. The containers are spawned from images based on the Jupyter Docker Stacks but built using an NVIDIA CUDA base image. Note that GPUs are currently shared between all spawned JupyterLab environments, although it may be possible to allocate them in a round-robin fashion (PRs accepted!).
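How the GPUs reach the containers is configured in jupyterhub/jupyterhub_config.py. As a rough sketch (not necessarily this repository's exact settings), DockerSpawner can pass a GPU device request to Docker, much like docker run --gpus all:

# Hypothetical excerpt from jupyterhub_config.py; the image tag is a placeholder
import docker.types

c = get_config()  # provided by JupyterHub when it loads this file

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
c.DockerSpawner.image = 'cuda-dl-lab:11.4.2-cudnn8'

# Request all host GPUs for every spawned container (equivalent to `docker run --gpus all`)
c.DockerSpawner.extra_host_config = {
    'device_requests': [
        docker.types.DeviceRequest(count=-1, capabilities=[['gpu']]),
    ],
}

Because every container would receive the same device request, a configuration like this gives the shared-GPU behaviour described above.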

Setup

These instructions assume you are using the latest Ubuntu LTS on your server. To install and set up the required packages, run the following commands.

Install NVIDIA drivers

First update the system and blacklist the nouveau drivers.

# Update the installed packages
sudo apt-get update && sudo apt-get upgrade && sudo apt-get install curl

# Blacklist nouveau
sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"

# If the `nouveau` driver was already in use, it is necessary to rebuild the kernel and reboot
# Regenerate initramfs
sudo update-initramfs -u

# Reboot
sudo reboot

Check the installed graphics card with: sudo lshw -C display.

# Set the driver version (the value below is the last tested working version)
# Search here: https://www.nvidia.com/Download/index.aspx?lang=en-uk
export NVIDIA_DRIVER_VERSION=545.29.06

# [Optional] Stop X-server if a GUI is installed
# sudo service lightdm stop  # Assuming a lightdm desktop. Alternative: gdm | kdm
# sudo init 3  # This may also be necessary

# Install NVIDIA drivers
sudo apt-get install build-essential gcc-multilib dkms
curl -o nvidia-drivers-$NVIDIA_DRIVER_VERSION.run https://uk.download.nvidia.com/XFree86/Linux-x86_64/$NVIDIA_DRIVER_VERSION/NVIDIA-Linux-x86_64-$NVIDIA_DRIVER_VERSION.run
chmod +x nvidia-drivers-$NVIDIA_DRIVER_VERSION.run
sudo ./nvidia-drivers-$NVIDIA_DRIVER_VERSION.run --dkms --no-opengl-files
# When prompted to run nvidia-xconfig, answer Y

# Verify installation
nvidia-smi
# read -p "Press any key to reboot..." -n1 -s
sudo reboot  # Alternative: sudo service lightdm start

Install Docker Engine, Docker Compose and NVIDIA Container Toolkit

# Install Docker Engine and Docker Compose plugin
# https://docs.docker.com/engine/install/ubuntu/
# https://docs.docker.com/compose/install/linux/
sudo apt-get install \
    ca-certificates \
    curl \
    gnupg
    # apt-transport-https \
    # gnupg-agent \
    # software-properties-common

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update && sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Verify installation
sudo docker run hello-world

# Install NVIDIA Container Toolkit
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker
# Alternatively use containerd or crio as the runtime

sudo systemctl restart docker

Alternatively, install the deprecated `nvidia-docker2` package (still required for Kubernetes):

# NOTE: nvidia-docker2 is still required for Kubernetes but otherwise only nvidia-container-toolkit
# https://docs.nvidia.com/datacenter/cloud-native/kubernetes/install-k8s.html
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# NOTE: I had to manually edit /etc/apt/sources.list.d/nvidia-docker.list to change 18.04 to 20.04
# Install nvidia-docker2 to provide the legacy runtime=nvidia for use with docker-compose (see: https://github.com/NVIDIA/nvidia-docker/issues/1268#issuecomment-632692949)
sudo apt-get update && sudo apt-get install -y nvidia-docker2
# sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

# Verify installation
docker run --rm --gpus all nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04 nvidia-smi

Post-Installation Steps

# Add users to the `docker` group to let them use docker on the server without `sudo`
# https://docs.docker.com/engine/install/linux-postinstall/
sudo groupadd docker
sudo usermod -aG docker $USER

# Activate changes
newgrp docker

# Verify
docker run hello-world

# Configure Docker to start on boot with systemd
sudo systemctl enable docker.service
sudo systemctl enable containerd.service

Create .env file for sensitive configuration details

In addition to the repository's configuration files, create a .env file with the necessary secrets set as variables, e.g.:

COMPOSE_PROJECT_NAME=dl_hub
AUTH_SERVER_ADDRESS=authenticator.uni.ac.uk
ADMIN_USERS='user1 user2 user3'  # A string of user names separated by spaces
# DOCKER_NETWORK_NAME=${COMPOSE_PROJECT_NAME}_default

See the Docker Compose documentation for details on setting and passing environment variables to docker compose.
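As an illustration (the parsing below is an assumption rather than this repository's exact code), these variables can be read inside jupyterhub_config.py once docker compose has passed them into the hub container:

# Hypothetical excerpt from jupyterhub_config.py
import os

c = get_config()

# ADMIN_USERS is a space-separated string, e.g. 'user1 user2 user3'
c.Authenticator.admin_users = set(os.environ.get('ADMIN_USERS', '').split())

# Docker network that DockerSpawner uses to reach the spawned containers
c.DockerSpawner.network_name = os.environ.get('DOCKER_NETWORK_NAME', 'dl_hub_default')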

Authentication

Depending on your environment, you will probably want to configure a more sophisticated authenticator, e.g. the PAMAuthenticator or ldapauthenticator. To hook into your institution's existing user authentication systems you will need configuration details from the university system administrators. These details should be set in jupyterhub/jupyterhub_config.py (with secrets in .env as necessary).
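For example, a hedged ldapauthenticator configuration might look like the following; the server address reuses AUTH_SERVER_ADDRESS from the .env example above, and the DN template is a placeholder your administrators would supply:

# Hypothetical LDAP configuration for jupyterhub_config.py; all values are placeholders
import os

c.JupyterHub.authenticator_class = 'ldapauthenticator.LDAPAuthenticator'
c.LDAPAuthenticator.server_address = os.environ.get('AUTH_SERVER_ADDRESS', 'authenticator.uni.ac.uk')
c.LDAPAuthenticator.bind_dn_template = ['uid={username},ou=people,dc=uni,dc=ac,dc=uk']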

Your organisation may also be able to issue and sign SSL certificates for the server. This repository currently assumes they are in jupyterhub/cert/. Appropriate configuration settings then need to be set in jupyterhub/jupyterhub_config.py e.g.:

# Configure SSL
c.JupyterHub.ssl_key = '/srv/jupyterhub/hub.key'
c.JupyterHub.ssl_cert = '/srv/jupyterhub/chain.crt'
c.JupyterHub.port = 443

# Configure configurable-http-proxy to redirect http to https
c.ConfigurableHTTPProxy.command = ['configurable-http-proxy', '--redirect-port', '80']

The corresponding lines where the certificates are installed in jupyterhub/Dockerfile will also need to be edited.

Optional additional steps

  • Customise JupyterHub
    • Edit jupyterhub_config.py
  • Automatically add new users to the docker group to let them use docker on the server without sudo
    • sudo nano /etc/adduser.conf then add the following lines
      • EXTRA_GROUPS="docker" # Separate groups with spaces e.g. "docker users"
      • ADD_EXTRA_GROUPS=1
  • Mount additional partitions
  • Move Docker disk to separate partition
    • sudo systemctl stop docker
    • Copy or move the data e.g.: sudo rsync -aP /var/lib/docker/ /path/to/your/docker_data
    • Edit /etc/docker/daemon.json to add "data-root": "/path/to/your/docker_data"
    • sudo systemctl start docker
  • Set up build target of jupyter/docker-stacks with --build-arg
  • Install extras, e.g.:
    • screen
    • tmux
    • htop
    • nvtop
  • Create a list or dictionary of allowed images which will be presented as a dropdown list of options for users at logon e.g.:
    • c.DockerSpawner.allowed_images = {"Latest": "cuda-dl-lab:11.4.2-cudnn8", "Previous": "cuda-dl-lab:11.2.2-cudnn8"}
    • c.DockerSpawner.allowed_images = ["cuda-dl-lab:11.4.2-cudnn8", "cuda-dl-lab:11.2.2-cudnn8"]
  • Schedule a backup!

Updating

NVIDIA drivers

# sudo service lightdm stop  # or gdm or kdm depending on your display manager
# Set NVIDIA_DRIVER_VERSION to the new driver version first
curl -o nvidia-drivers-$NVIDIA_DRIVER_VERSION.run https://uk.download.nvidia.com/XFree86/Linux-x86_64/$NVIDIA_DRIVER_VERSION/NVIDIA-Linux-x86_64-$NVIDIA_DRIVER_VERSION.run
chmod +x nvidia-drivers-$NVIDIA_DRIVER_VERSION.run
sudo ./nvidia-drivers-$NVIDIA_DRIVER_VERSION.run --dkms --no-opengl-files
nvidia-smi
sudo reboot
  • Confirm the drivers work: docker run --rm --gpus all nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04 nvidia-smi

Docker, Docker Compose and nvidia-container-toolkit

  • sudo apt update && sudo apt upgrade
  • Update JUPYTERHUB_VERSION=4.0.2 in:

    • docker-compose.yml
    • jupyterhub/Dockerfile (optional)
  • Edit jupyterhub/jupyterhub_config.py for any additional volumes
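For the last point, a minimal sketch of mounting extra directories via DockerSpawner's volumes option (host paths and mount points are placeholders, not the repository's defaults):

# Hypothetical excerpt from jupyterhub_config.py
c.DockerSpawner.volumes = {
    'jupyterhub-user-{username}': '/home/jovyan/work',  # per-user named volume
    '/data/shared': {'bind': '/home/jovyan/shared', 'mode': 'ro'},  # shared read-only directory
}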

Restart the Hub

  • make stop (in case the hub is running)
  • make hub
