diff --git a/docs/12-self-hosting/01-getting-started.mdx b/docs/12-self-hosting/01-getting-started.mdx
new file mode 100644
index 00000000..c722e9f0
--- /dev/null
+++ b/docs/12-self-hosting/01-getting-started.mdx
@@ -0,0 +1,83 @@
+---
+toc_max_heading_level: 4
+---
+
+# Self-Hosting
+
+Projects often encounter a constraint or requirement which makes free-tier hosted CI/CD instances
+insufficient for their needs. In these cases hosting your own CI/CD runner can be a viable alternative
+to premium-tier services or subscriptions. Self-hosting may also provide access to resources that
+are simply not available on many CI/CD services such as GPUs, faster drives, and newer CPU models.
+
+There are many ways to self-host CI/CD runners, and which one is best for you will depend on your own
+situation and constraints. For the purpose of this guide we will make the following assumptions:
+
+- User already has their own hardware 💻
+- Budget is $0 💸
+- FOSS tools should be prioritized where possible 🛠️
+- We define `Self-Hosting` in this context to refer to a user taking responsibility for the
+  operating-system level configuration and life-cycle-management of a given compute resource (metal,
+  on-prem, cloud VM, VPS etc.) 📜
+
+## Requirements
+
+This guide is tested for compatibility on devices which meet the following requirements:
+
+- x86-64 (amd64) processor
+- Ubuntu 22.04 LTS Server or Debian 12 Bookworm
+- root access
+- Network connectivity
+- Nvidia GPU (required for GPU Acceleration)
+
+## Finding a Host
+
+The "Host" is the computer which will execute the runner program. This can be a desktop computer,
+laptop, Virtual Machine, or VPS from a cloud provider. If your host is a cloud instance, the OS
+and admin user should already exist. Move on to the provisioning section for further instructions.
+If your host is a local machine, perform a clean installation of the operating system using the
+following guides:
+
+Ubuntu:
+
+- Download the Ubuntu 22.04 LTS server installer
+  [HERE](https://ftp.snt.utwente.nl/pub/os/linux/ubuntu-releases/22.04.3/ubuntu-22.04.3-live-server-amd64.iso)
+- [Install Ubuntu 22.04 LTS on a local machine](https://ostechnix.com/install-ubuntu-server/)
+
+Debian:
+
+- Download the Debian 12 installer
+  [HERE](https://cdimage.debian.org/debian-cd/current/amd64/iso-dvd/debian-12.1.0-amd64-DVD-1.iso)
+- [Install Debian on a local system](https://www.linuxtechi.com/how-to-install-debian-11-bullseye/)
+
+## Provisioning the Host
+
+Once you can sign into your host, follow the appropriate guide to complete the provisioning process.
+Note that the Nvidia driver and container-toolkit installation are only required for GPU
+acceleration. Skip those steps if you do not require that functionality.
+
+- [Ubuntu 22.04 Setup](./ubuntu-setup)
+
+- [Debian 12 Setup](./debian-setup)
+
+## Splitting the Host
+
+You may find yourself with the requirement for multiple runners, runners of differing
+configurations, or different operating systems but only have access to a single Host. In this case,
+you can split your Host into multiple runners using virtualization. If this applies to you, complete
+the provisioning step and then follow the [Virtual Machines Guide](./virtual-machines).
+
+## Installing the Runner Software
+
+- [GitHub Actions](./github-actions)
+
+- [GitLab Pipelines](./gitlab-pipelines)
+
+- CircleCI (ToDo)
+
+- Argo Workflows (ToDo)
+
+- Jenkins (ToDo)
+
+- Ansible/Semaphore/AWX (ToDo)
+
+## Update Workflows
diff --git a/docs/12-self-hosting/02-debian-setup.mdx b/docs/12-self-hosting/02-debian-setup.mdx
new file mode 100644
index 00000000..3502c31b
--- /dev/null
+++ b/docs/12-self-hosting/02-debian-setup.mdx
@@ -0,0 +1,309 @@
+---
+toc_max_heading_level: 4
+---
+
+# Debian Machine Setup
+
+Steps for manual configuration and provisioning of Debian 12 server systems. These steps will also
+upgrade a Debian 11 system to Debian 12. This guide assumes and recommends that the user is starting
+from a fresh installation. If you are unfamiliar with the installation process for Debian, see the links
+below before progressing.
+
+- [How to Install Debian](https://www.linuxtechi.com/how-to-install-debian-11-bullseye/)
+
+- [Debian 12 ISO Image](https://cdimage.debian.org/cdimage/weekly-builds/amd64/iso-dvd/debian-testing-amd64-DVD-1.iso)
+
+## Base Packages
+
+- Add the required apt sources and upgrade your system to the latest version.
+
+  ```bash
+  # Run as root
+  cat << EOF > /etc/apt/sources.list
+  deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
+  deb-src http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
+
+  deb http://deb.debian.org/debian-security/ bookworm-security main contrib non-free
+  deb-src http://deb.debian.org/debian-security/ bookworm-security main contrib non-free
+
+  deb http://deb.debian.org/debian bookworm-updates main contrib non-free
+  deb-src http://deb.debian.org/debian bookworm-updates main contrib non-free
+  EOF
+  ```
+
+- Apply system updates and upgrades
+
+  Update the package list and upgrade system components prior to installing other software. Reboot
+  after the process completes.
+
+  ```bash
+  # run as root
+  apt-get update && \
+    apt-get upgrade -y && \
+    apt-get full-upgrade -y
+
+  reboot
+  ```
+
+- Install base utilities
+
+  Install a small set of base packages that are dependencies for later steps in the guide and
+  are helpful in general.
+
+  ```bash
+  # Run as root
+  apt-get update && \
+    apt-get install -y wireguard \
+    openresolv \
+    ssh-import-id \
+    sudo \
+    curl \
+    tmux \
+    netplan.io \
+    apt-transport-https \
+    ca-certificates \
+    software-properties-common \
+    htop \
+    git-extras \
+    rsyslog \
+    fail2ban \
+    vim \
+    gpg \
+    open-iscsi \
+    nfs-common \
+    ncdu \
+    iotop && \
+    sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq && \
+    sudo chmod +x /usr/bin/yq && \
+    sudo systemctl enable fail2ban && \
+    sudo systemctl start fail2ban
+  ```
+
+- Install Prometheus metrics exporter (Optional)
+
+  ```bash
+  # Run as Root
+  wget -O /opt/node_exporter-1.6.1.linux-amd64.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz && \
+    tar -xvf /opt/node_exporter-1.6.1.linux-amd64.tar.gz -C /opt && \
+    rm /opt/node_exporter-1.6.1.linux-amd64.tar.gz && \
+    ln -s node_exporter-1.6.1.linux-amd64 /opt/node_exporter
+
+  wget https://raw.githubusercontent.com/small-hack/smol-metal/main/node-exporter.service && \
+    sudo mv node-exporter.service /etc/systemd/system/node-exporter.service && \
+    systemctl daemon-reload && \
+    systemctl enable node-exporter && \
+    systemctl restart node-exporter
+  ```
+
+## Set up the admin user
+
+- Create the user
+
+  ```bash
+  # Run as root
+  export NEW_USER=""
+  useradd -s /bin/bash -d /home/$NEW_USER/ -m -G sudo $NEW_USER
+  ```
+
+- Grant passwordless sudo permission
+
+  ```bash
+  # Run as root
+  echo "$NEW_USER ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
+  ```
+
+- Import an ssh key
+
+  ```bash
+  # Run as root
+  export GITHUB_USER="your-github-username"
+  sudo -u $NEW_USER ssh-import-id-gh $GITHUB_USER
+  ```
+
+- Add the user to relevant groups
+
+  ```bash
+  # Run as root
+  usermod -a -G kvm $NEW_USER
+  usermod -a -G docker $NEW_USER # the docker group only exists once Docker is installed (next section)
+  ```
+
+- Create a password for the user
+
+  ```bash
+  # Run as root
+  passwd $NEW_USER
+  ```
+
+## Install Docker
+
+- Download the docker gpg key
+
+  ```bash
+  curl -fsSL https://download.docker.com/linux/debian/gpg | \
+    sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
+  ```
+
+- Add the apt package source
+
+  ```bash
+  echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
+  ```
+
+- Update the apt package list and install Docker
+
+  ```bash
+  sudo apt-get update && sudo apt-get install -y docker-ce
+  ```
+
+## Install Docker Compose
+
+The default apt package lists provide a very outdated version of docker compose. Below are the steps
+required to install a current version.
+
+- Find the latest version by visiting https://github.com/docker/compose/releases
+
+  ```bash
+  export COMPOSE_VERSION="2.17.3"
+  ```
+
+- Create a directory for the binary
+
+  ```bash
+  mkdir -p ~/.docker/cli-plugins/
+  ```
+
+- Download the binary
+
+  ```bash
+  curl -SL https://github.com/docker/compose/releases/download/v$COMPOSE_VERSION/docker-compose-linux-x86_64 -o ~/.docker/cli-plugins/docker-compose
+  ```
+
+- Make it executable
+
+  ```bash
+  chmod +x ~/.docker/cli-plugins/docker-compose
+  ```
+
+## NVIDIA GPU Drivers
+
+Debian's built-in driver installation process is more reliable than Ubuntu's, but readers are still
+advised to get their driver installer directly from NVIDIA. Instructions for apt installation are
+included for completeness.
+
+- Install required packages
+
+  ```bash
+  # Run as root
+  apt-get install -y gcc \
+    firmware-misc-nonfree \
+    linux-headers-amd64 \
+    linux-headers-$(uname -r)
+  ```
+
+- Find your driver version and download the installer
+
+  - Use Nvidia's web tool located here: https://www.nvidia.com/download/index.aspx.
+
+  - Alternatively, you can download a specific driver version using curl
+
+    ```bash
+    # run as root
+    export DRIVER_VERSION=""
+
+    # GeForce Cards
+    curl --progress-bar -fL -O "https://us.download.nvidia.com/XFree86/Linux-x86_64/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
+
+    # Datacenter Cards
+    curl --progress-bar -fL -O "https://us.download.nvidia.com/tesla/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
+    ```
+
+  - Run the Installer
+
+    ```bash
+    # The installer must be run from the system console; it cannot be run from within an X session.
+
+    bash "NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
+    ```
+
+- Install from Apt
+
+  ```bash
+  sudo apt-get install nvidia-driver
+  ```
+
+## Nvidia Container Toolkit
+
+The nvidia container toolkit allows containers to access the GPU resources of the underlying host
+machine. It requires that the GPU drivers are already installed on the host. See the official docs
+here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
+
+- Set your system distribution name to debian11 as a workaround until nvidia adds official debian12
+  support
+
+  ```bash
+  distribution=debian11
+  ```
+
+- Download the gpg key and add the repo to your apt sources
+
+  ```bash
+  curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
+    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
+    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+  ```
+
+- Update apt packages and install the container toolkit
+
+  ```bash
+  sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
+  ```
+
+- Set `nvidia` as the default container runtime
+
+  ```bash
+  sudo nvidia-ctk runtime configure --runtime=docker --set-as-default
+  ```
+
+- Restart the docker service
+
+  ```bash
+  sudo systemctl restart docker
+  ```
+
+- Test that it is working
+
+  ```bash
+  sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
+  ```
+
+  > Successful output:
+  >
+  > ```console
+  > sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
+  > Sat Sep  9 19:18:45 2023
+  > +---------------------------------------------------------------------------------------+
+  > | NVIDIA-SMI 535.54.06              Driver Version: 535.54.06    CUDA Version: 11.6     |
+  > |-----------------------------------------+----------------------+----------------------+
+  > | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
+  > | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
+  > |                                         |                      |               MIG M. |
+  > |=========================================+======================+======================|
+  > |   0  Tesla M40 24GB                 On  | 00000000:04:00.0 Off |                    0 |
+  > | N/A   36C    P8              28W / 250W |     23MiB / 23040MiB |      0%      Default |
+  > |                                         |                      |                  N/A |
+  > +-----------------------------------------+----------------------+----------------------+
+  >
+  > +---------------------------------------------------------------------------------------+
+  > | Processes:                                                                            |
+  > |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
+  > |        ID   ID                                                             Usage      |
+  > |=======================================================================================|
+  > |  No running processes found                                                           |
+  > +---------------------------------------------------------------------------------------+
+  > ```
+
+  ```
+
+  ```
diff --git a/docs/12-self-hosting/02-ubuntu-setup.mdx b/docs/12-self-hosting/02-ubuntu-setup.mdx
new file mode 100644
index 00000000..7f8ed430
--- /dev/null
+++ b/docs/12-self-hosting/02-ubuntu-setup.mdx
@@ -0,0 +1,278 @@
+---
+toc_max_heading_level: 4
+---
+
+# Ubuntu Machine Setup
+
+Steps for manual configuration and provisioning of Ubuntu 22.04 server systems. This guide assumes
+and recommends that the user is starting from a fresh installation. If you are unfamiliar with the
+installation process for Ubuntu, see the link below before progressing.
+
+- [How to Install Ubuntu 22.04 LTS Server Edition](https://ostechnix.com/install-ubuntu-server/)
+
+## Base Setup
+
+- Apply system updates and upgrades
+
+  Update the package list and upgrade system components prior to installing other software. Often
+  this process results in updates that will require a system reboot.
+
+  ```bash
+  sudo apt-get update && \
+    sudo apt-get upgrade -y
+
+  sudo reboot
+  ```
+
+- Install base utilities
+
+  Install a small set of base packages that are dependencies for later steps in the guide and
+  are helpful in general.
+
+  ```bash
+  sudo apt-get update && \
+    sudo apt-get install -y wireguard \
+    openresolv \
+    ssh-import-id \
+    sudo \
+    curl \
+    tmux \
+    netplan.io \
+    apt-transport-https \
+    ca-certificates \
+    software-properties-common \
+    htop \
+    git-extras \
+    rsyslog \
+    fail2ban \
+    vim \
+    gpg \
+    open-iscsi \
+    nfs-common \
+    ncdu \
+    iotop && \
+    sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq && \
+    sudo chmod +x /usr/bin/yq && \
+    sudo systemctl enable fail2ban && \
+    sudo systemctl start fail2ban
+  ```
+
+- Install Docker
+
+  - Download the docker gpg key
+
+    ```bash
+    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
+      sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
+    ```
+
+  - Add the apt package source
+
+    ```bash
+    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
+    ```
+
+  - Update the apt package list and install Docker
+
+    ```bash
+    sudo apt-get update && sudo apt-get install -y docker-ce
+    ```
+
+- Set up the admin user
+
+  - Create the user
+
+    ```bash
+    export NEW_USER=""
+    sudo useradd -s /bin/bash -d /home/$NEW_USER/ -m -G sudo $NEW_USER
+    ```
+
+  - Grant passwordless sudo permission
+
+    ```bash
+    echo "$NEW_USER ALL=(ALL) NOPASSWD: ALL" | sudo tee -a /etc/sudoers > /dev/null
+    ```
+
+  - Import an ssh key
+
+    ```bash
+    sudo -u $NEW_USER ssh-import-id-gh "your-github-username"
+    ```
+
+  - Add the user to relevant groups
+
+    ```bash
+    sudo usermod -a -G kvm $NEW_USER
+    sudo usermod -a -G docker $NEW_USER
+    ```
+
+  - Create a password for the user
+
+    ```bash
+    sudo passwd $NEW_USER
+    ```
+
+## Docker Compose
+
+The default Ubuntu package lists provide a very outdated version of docker compose. Below are the
+steps required to install a current version.
+
+- Find the latest version by visiting https://github.com/docker/compose/releases
+
+  ```bash
+  export COMPOSE_VERSION="2.17.3"
+  ```
+
+- Create a directory for the binary
+
+  ```bash
+  mkdir -p ~/.docker/cli-plugins/
+  ```
+
+- Download the binary
+
+  ```bash
+  curl -SL https://github.com/docker/compose/releases/download/v$COMPOSE_VERSION/docker-compose-linux-x86_64 -o ~/.docker/cli-plugins/docker-compose
+  ```
+
+- Make it executable
+
+  ```bash
+  chmod +x ~/.docker/cli-plugins/docker-compose
+  ```
+
+## NVIDIA GPU Drivers
+
+Ubuntu's built-in driver installation tool is unreliable. Instructions for its use are
+included, but readers are advised to get their driver installer directly from NVIDIA.
+
+### Install from Nvidia
+
+- Download and install driver dependencies
+
+  ```bash
+  sudo apt-get install -y ubuntu-drivers-common \
+    linux-headers-generic \
+    gcc \
+    kmod \
+    make \
+    pkg-config \
+    libvulkan1
+  ```
+
+- Find your driver version and download the installer
+
+  - Use Nvidia's web tool located here: https://www.nvidia.com/download/index.aspx.
+
+  - Alternatively, you can download a specific driver version using curl
+
+    ```bash
+    export DRIVER_VERSION=""
+
+    # GeForce Cards
+    curl --progress-bar -fL -O "https://us.download.nvidia.com/XFree86/Linux-x86_64/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
+
+    # Datacenter Cards
+    curl --progress-bar -fL -O "https://us.download.nvidia.com/tesla/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
+    ```
+
+- Run the Installer
+
+  - The installer must be run from the system console; it cannot be run from within an
+    X session.
+
+    ```bash
+    sudo bash "NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
+    ```
+
+### Install from apt
+
+Automatic install is currently broken for multiple card-types, see
+[#1993019](https://bugs.launchpad.net/ubuntu/+source/ubuntu-drivers-common/+bug/1993019).
+
+- Automatic install (Broken)
+
+  ```bash
+  sudo ubuntu-drivers autoinstall
+  ```
+
+- Install specific driver version
+
+  ```bash
+  sudo ubuntu-drivers install nvidia:525
+  ```
+
+## Nvidia Container Toolkit
+
+The nvidia container toolkit allows containers to access the GPU resources of the underlying host
+machine. It requires that the GPU drivers are already installed on the host. See the official docs
+here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
+
+- Get your system distribution name
+
+  ```bash
+  distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
+  ```
+
+- Download the gpg key and add the repo to your apt sources
+
+  ```bash
+  curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
+    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
+    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+  ```
+
+- Update apt packages and install the container toolkit
+
+  ```bash
+  sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
+  ```
+
+- Set `nvidia` as the default container runtime
+
+  ```bash
+  sudo nvidia-ctk runtime configure --runtime=docker --set-as-default
+  ```
+
+- Restart the docker service
+
+  ```bash
+  sudo systemctl restart docker
+  ```
+
+- Test that it is working
+
+  ```bash
+  sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
+  ```
+
+  > Successful output:
+  >
+  > ```console
+  > sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
+  > Sat Sep  9 19:18:45 2023
+  > +---------------------------------------------------------------------------------------+
+  > | NVIDIA-SMI 535.54.06              Driver Version: 535.54.06    CUDA Version: 11.6     |
+  > |-----------------------------------------+----------------------+----------------------+
+  > | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
+  > | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
+  > |                                         |                      |               MIG M. |
+  > |=========================================+======================+======================|
+  > |   0  Tesla M40 24GB                 On  | 00000000:04:00.0 Off |                    0 |
+  > | N/A   36C    P8              28W / 250W |     23MiB / 23040MiB |      0%      Default |
+  > |                                         |                      |                  N/A |
+  > +-----------------------------------------+----------------------+----------------------+
+  >
+  > +---------------------------------------------------------------------------------------+
+  > | Processes:                                                                            |
+  > |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
+  > |        ID   ID                                                             Usage      |
+  > |=======================================================================================|
+  > |  No running processes found                                                           |
+  > +---------------------------------------------------------------------------------------+
+  > ```
+
+  ```
+
+  ```
diff --git a/docs/12-self-hosting/03-github-actions.mdx b/docs/12-self-hosting/03-github-actions.mdx
new file mode 100644
index 00000000..e4e2e423
--- /dev/null
+++ b/docs/12-self-hosting/03-github-actions.mdx
@@ -0,0 +1,9 @@
+---
+toc_max_heading_level: 4
+---
+
+# GitHub Actions
+
+## Manual
+
+## Automated
diff --git a/docs/12-self-hosting/03-gitlab-pipelines.mdx b/docs/12-self-hosting/03-gitlab-pipelines.mdx
new file mode 100644
index 00000000..51570e4f
--- /dev/null
+++ b/docs/12-self-hosting/03-gitlab-pipelines.mdx
@@ -0,0 +1,9 @@
+---
+toc_max_heading_level: 4
+---
+
+# GitLab Pipelines
+
+## Manual
+
+## Automated
diff --git a/docs/12-self-hosting/03-virtual-machines.mdx b/docs/12-self-hosting/03-virtual-machines.mdx
new file mode 100644
index 00000000..7eb471ed
--- /dev/null
+++ b/docs/12-self-hosting/03-virtual-machines.mdx
@@ -0,0 +1,9 @@
+---
+toc_max_heading_level: 4
+---
+
+# Creating a Virtual Machine
+
+## Multipass
+
+## QEMU/KVM
diff --git a/docs/12-self-hosting/_category_.yaml b/docs/12-self-hosting/_category_.yaml
new file mode 100644
index 00000000..6a7cd1fc
--- /dev/null
+++ b/docs/12-self-hosting/_category_.yaml
@@ -0,0 +1,5 @@
+---
+position: 6.0
+label: Self-Hosting
+collapsible: true
+collapsed: true