This is another variant of jupyterhub-deploy, which originally comes from https://github.com/jupyterhub/jupyterhub-deploy-docker. The overall architecture has been inspired by the original jupyterhub-deploy as well as a very informative article by Jessica Hamrick (@jhamrick). As in the original version, it uses the `make build` command to create the docker image and `docker-compose up` to run the docker container. This version has been successfully deployed and run on 3 Ubuntu VMs (1 for the server and 2 for the nodes).
A Jupyterhub server that can spawn individual Jupyter Notebook containers in a cluster, providing a framework for Cab-Lab users to play around with the data cube.
- 3 Ubuntu VMs with docker and docker-compose installed
- Create a GitHub application here to get the Client ID, Client Secret, and Authorization callback URL, then fill the `.env` file with this information.
- TLS certificates (self-signed or letsencrypt)
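For a test deployment, a self-signed pair can be generated with openssl (for production, letsencrypt certificates should be used instead). The `CN` value below is a placeholder; substitute the server's actual hostname:

```shell
# Generate a self-signed certificate pair (testing only).
# Replace "localhost" with your server's hostname.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout jupyterhub.key -out jupyterhub.crt \
  -subj "/CN=localhost"
```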
- Modify the `.env` file

  ```shell
  cd jupyterhub-deploy
  cp .env.example .env
  # edit .env
  ```
- Modify the `userlist` file to grant normal or admin access to GitHub account(s)

  ```shell
  cd jupyterhub-deploy
  touch userlist
  # edit userlist
  ```

  Example:

  ```
  user1 admin
  user2
  user3
  ```
- Copy the certificates to the `secrets` folder

  ```shell
  cd jupyterhub-deploy
  mkdir secrets
  cp <any directory>/jupyterhub.crt secrets/
  cp <any directory>/jupyterhub.key secrets/
  ```
- Create a cookie secret, which is an encryption key used to encrypt the browser cookies used for authentication

  ```shell
  cd jupyterhub-deploy
  openssl rand -hex 32 > cookie_secret
  ```
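As a sanity check of the generated secret (regenerated here for illustration): 32 random bytes hex-encode to 64 characters, so the file should hold a 64-character hex string, i.e. 65 bytes including the trailing newline:

```shell
# 32 random bytes -> 64 hex characters; wc also counts the trailing newline.
openssl rand -hex 32 > cookie_secret
wc -c < cookie_secret                      # → 65
grep -Ec '^[0-9a-f]{64}$' cookie_secret    # → 1
```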
- VM1 acts as the Jupyterhub server. Therefore, 2 components need to be set up: the docker swarm manager and the consul container used for service discovery. More information about those two can be found here.

  - Run the consul container

    ```shell
    docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
    ```

  - Run the swarm manager

    ```shell
    docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise [VM1 host]:4000 consul://[VM1 host]:8500
    ```
- VM2 acts as node1 as well as a second manager.

  - Run the swarm manager

    ```shell
    docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise [VM2 host]:4000 consul://[VM1 host]:8500
    ```

  - Run the swarm node

    ```shell
    docker run -d swarm join --advertise=[VM2 host]:2375 consul://[VM1 host]:8500
    ```
- VM3 acts as node2.

  - Run the swarm node

    ```shell
    docker run -d swarm join --advertise=[VM3 host]:2375 consul://[VM1 host]:8500
    ```
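Once both nodes have joined, the cluster state can be checked against the primary swarm manager (hostname placeholder as above):

```shell
# Ask the swarm manager for cluster status; the "Nodes:" line should
# report 2, with VM2 and VM3 listed below it.
docker -H tcp://[VM1 host]:4000 info
```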
- Restart the docker daemon in each VM with additional arguments to allow it to be part of the swarm cluster

  - VM1:

    ```shell
    nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM1 host]:2375 --cluster-store consul://[VM1 host]:8500 &
    ```

  - VM2:

    ```shell
    nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM2 host]:2375 --cluster-store consul://[VM1 host]:8500 &
    ```

  - VM3:

    ```shell
    nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM3 host]:2375 --cluster-store consul://[VM1 host]:8500 &
    ```
- Create an overlay docker network so that containers can communicate with each other. In VM1:

  ```shell
  docker -H tcp://0.0.0.0:4000 network create -d overlay swarmnet
  ```

  The `swarmnet` network should now be available in all 3 VMs. Check using this command:

  ```shell
  docker network ls
  ```
- Set up an NFS server in VM1 (for more information go to this page)

  ```shell
  vim /etc/exports
  ```

  and add the following entries:

  ```
  /[any path]/jupyterhub-shared *(rw,sync,no_root_squash)
  /[any path]/cablab-shared *(rw,sync,no_root_squash)
  ```

  Then re-export:

  ```shell
  exportfs -r
  ```
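After re-exporting, the active export list can be verified on the server before touching the nodes:

```shell
# List the directories currently being exported; both shared
# paths should appear with their (rw,sync,no_root_squash) options.
sudo exportfs -v
# Or query the server the way an NFS client would:
showmount -e localhost
```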
- Mount jupyterhub-shared and cablab-shared in each node VM

  ```shell
  mount [VM1 host]:/[any path]/jupyterhub-shared /var/lib/docker/volumes
  mount [VM1 host]:/[any path]/cablab-shared /[any local path]/cablab-shared
  ```
- Create the cablab/singleuser docker image

  ```shell
  cd /[any local path]/cablab-shared
  docker build -t cablab/singleuser .
  ```
- Make sure that the `.env` file contains the correct information (in case of any name customisations), then build and start Jupyterhub

  ```shell
  cd jupyterhub-deploy
  make build
  docker-compose up -d
  ```
- Open a browser and go to `https://[VM1 host]`
These instructions assume 3 VMs are available for the set-up. VM1 acts as the Jupyterhub server, the main docker swarm manager, and the consul host. VM2 acts as the first node and VM3 as the second node.
These pre-configuration steps are not specific to CentOS; they apply to any deployment in an environment where the storage driver is devicemapper. In such an environment the docker daemon runs in loop-lvm mode by default. This mode uses sparse files to build the thin pool used by image and container snapshots, and it is not very efficient for extensive IO operations within the containers. Docker states that this mode is not suitable for production use and recommends direct-lvm mode instead. The steps below configure direct-lvm mode in a CentOS VM; they have been tested on CentOS 7.2 with kernel 3.10.
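To check whether a VM is affected, inspect what the running daemon reports for its storage driver; in the Docker versions of this era, loop-lvm shows loopback-backed data files:

```shell
# If the output contains "Data loop file:" entries under the
# devicemapper section, the daemon is running in loop-lvm mode
# and should be reconfigured to direct-lvm.
docker info | grep -A 10 'Storage Driver'
```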
- Create a docker Volume Group

  ```shell
  sudo yum install -y lvm2*
  sudo pvcreate /dev/vdb
  sudo vgcreate docker /dev/vdb
  sudo lvcreate --wipesignatures y -n thinpool docker -l 95%VG
  sudo lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
  sudo lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta
  ```
- To change the thinpool profile, modify /etc/lvm/profile/docker-thinpool.profile to

  ```
  activation {
      thin_pool_autoextend_threshold=80
      thin_pool_autoextend_percent=20
  }
  ```
- Activate the new profile

  ```shell
  sudo lvchange --metadataprofile docker-thinpool docker/thinpool
  ```
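The thin pool and its monitoring state can then be inspected with lvs; a monitored pool confirms that the autoextend settings from the profile are active:

```shell
# Show logical volumes with the segment monitoring column; the
# docker/thinpool row should read "monitored" once the profile applies.
sudo lvs -o+seg_monitor
```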
- Install Docker version 1.11

  ```shell
  sudo yum update
  sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
  [dockerrepo]
  name=Docker Repository
  baseurl=https://yum.dockerproject.org/repo/main/centos/7/
  enabled=1
  gpgcheck=1
  gpgkey=https://yum.dockerproject.org/gpg
  EOF
  sudo yum install docker-engine-1.11.2
  ```
- Start the docker daemon

  ```shell
  nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM1 host]:2375 --cluster-store consul://[VM1 host]:8500 -s devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true &
  ```
- Install Docker Compose

  ```shell
  sudo -i
  curl -L https://github.com/docker/compose/releases/download/1.8.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
  chmod +x /usr/local/bin/docker-compose
  ```
- Set up the NFS server. Assumptions: /data is the datacube directory, /container-data is the docker container directory, and ~/cablab-shared is the directory for sample notebooks.

  ```shell
  cd ~
  sudo mkdir cablab-shared
  sudo mkdir /data
  # after attaching the datacube volume to this VM
  sudo mount -o defaults /dev/data/datacube /data
  sudo vim /etc/exports
  ```

  Add the following entries to /etc/exports:

  ```
  [USER_HOME]/cablab-shared *(rw,sync,no_root_squash)
  /data *(rw,sync,no_root_squash)
  /container-data *(rw,sync,no_root_squash)
  ```
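The exports file only takes effect after re-exporting; presumably the same `exportfs -r` step as in the Ubuntu section applies here as well:

```shell
# Re-read /etc/exports and verify that all three directories are served.
sudo exportfs -r
showmount -e localhost
```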
- Start the docker daemon

  ```shell
  nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM1 host]:2375 --cluster-store consul://[VM1 host]:8500 -s devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true &
  ```

- Mount the centralised docker volume directory

  ```shell
  sudo mount [VM1 host]:/container-data /var/lib/docker/volumes
  ```

- Start consul

  ```shell
  docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
  ```

- Start the docker swarm manager

  ```shell
  docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise [VM1 host]:4000 consul://[VM1 host]:8500
  ```
- Start the docker daemon

  ```shell
  nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM2 host]:2375 --cluster-store consul://[VM1 host]:8500 -s devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true &
  ```

- Mount the centralised directories

  ```shell
  sudo mount [VM1 host]:/container-data /var/lib/docker/volumes
  sudo mount [VM1 host]:[USER_HOME]/cablab-shared [USER_HOME]/cablab-shared
  sudo mount [VM1 host]:/data /data
  ```
- Start the docker daemon

  ```shell
  nohup sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise [VM3 host]:2375 --cluster-store consul://[VM1 host]:8500 -s devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true &
  ```

- Mount the centralised directories

  ```shell
  sudo mount [VM1 host]:/container-data /var/lib/docker/volumes
  sudo mount [VM1 host]:[USER_HOME]/cablab-shared [USER_HOME]/cablab-shared
  sudo mount [VM1 host]:/data /data
  ```
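On each node, the NFS mounts can be verified before starting any containers:

```shell
# All three NFS mounts should be listed with [VM1 host] as their source
# (depending on the NFS version, the type may show as nfs4).
mount -t nfs
df -h /var/lib/docker/volumes /data
```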
- Run `certbot renew` as stated here
- Replace the existing certs in /secrets with the newly generated ones (the certx.pem and privkeyx.pem): copy certx.pem as jupyterhub.crt and privkeyx.pem as jupyterhub.key
- Recreate and restart the Jupyterhub container

  ```shell
  docker rm -f jupyterhub && docker rmi -f jupyterhub && docker-compose up -d
  ```