From 1a068b42554d087ceca281fa64d448e719040af5 Mon Sep 17 00:00:00 2001
From: Udo Schlegel
Date: Fri, 25 Oct 2019 23:06:46 +0900
Subject: [PATCH] Update, Handout

updated readme and pytorch, added handout
---
 Dockerfile | 37 ++++++++++++++++++++++++---
 README.md  | 75 +++++++++++++++++++++++++++++++++++++++---------------
 2 files changed, 88 insertions(+), 24 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index 6ecab5f..231b610 100755
--- a/Dockerfile
+++ b/Dockerfile
@@ -36,12 +36,15 @@ RUN apt-get update && apt-get install -y \
 RUN R -e "install.packages(c('crayon', 'pbdZMQ', 'devtools', 'IRdisplay'), repos='http://cran.us.r-project.org')"
 RUN R -e "devtools::install_github(paste0('IRkernel/', c('repr', 'IRdisplay', 'IRkernel')))"
 
+RUN pip install --upgrade pip
+
 # Install virtualenv to let users install more libs
 RUN pip install -U virtualenv
 
+RUN pip install -U cython
+
 # Install KERAS + SCIKIT + Data Science Libs
 RUN pip install -U \
-    cython \
     scipy \
     numpy \
     keras \
@@ -59,7 +62,10 @@ RUN pip install -U \
     catboost \
     opencv-python \
     tqdm \
-    tslearn
+    tslearn \
+    bert-serving-server \
+    bert-serving-client \
+    handout
 
 # Install OpenCV + HDF5
 RUN apt-get update && apt-get install -y \
@@ -80,8 +86,7 @@ RUN apt-get update && apt-get install -y \
     rm -rf /var/lib/apt/lists/*
 
 # Install PYTORCH
-RUN pip install https://download.pytorch.org/whl/cu100/torch-1.0.1.post2-cp35-cp35m-linux_x86_64.whl && \
-    pip install torchvision
+RUN pip install torch===1.3.0 torchvision===0.4.1 -f https://download.pytorch.org/whl/torch_stable.html
 
 # TensorBoard
 EXPOSE 6006
@@ -123,9 +128,33 @@ RUN jupyter contrib nbextension install --user && \
 
 USER $NB_USER
 
+# add alias to include system site packages into virtualenvs
+RUN echo "alias virtualenv='virtualenv --system-site-packages'" >> ~/.bashrc
+
 # Install and enable nbextensions
 RUN jupyter contrib nbextension install --user && \
     jupyter nbextensions_configurator enable --user && \
     jupyter tensorboard enable --user && \
     jupyter nbextension enable --py nbzip && \
     jupyter nbextension enable --py rise
+
+
+# Install anaconda
+
+## TODO fix this
+
+#USER root
+
+#RUN curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
+#RUN bash Anaconda3-2019.03-Linux-x86_64.sh -b -p "/home/$NB_USER/anaconda3"
+
+#RUN chown -R $NB_USER:dbvis /home/$NB_USER/anaconda3
+
+#USER $NB_USER
+
+#RUN echo "export PATH=~/anaconda3/bin:$PATH" >> ~/.bashrc
+#RUN /bin/bash -c "source ~/.bashrc"
+
+#RUN export PATH=~/anaconda3/bin:$PATH
+
+#RUN /bin/bash -c "conda install -c anaconda cython"
diff --git a/README.md b/README.md
index 3831ef3..dd0757a 100755
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Start
 
-Hi User,
+Dear user,
 
 this ReadMe should prepare you to work with the DBVIS GPU Server.
 It is conceived as a guide to using every possibility of the server.
@@ -13,55 +13,90 @@ There are two major options to use the server:
 
 ## Jupyter Notebooks
 
-Jupyter Notebooks are built as a running Python or R terminal emulation.
-It enables an easier access to build small programs faster and more efficient.
+Jupyter Notebooks provide an interactive Python or R environment in the browser.
+They make it easier to build and test small programs quickly and efficiently.
 You can create a new one by pressing the New button and select a new Python or R kernel.
-It is advised to use them as you can test your Python or R code more easily with them and if your development stage is over you can export them as Python or R scripts.
+It is advised to use them, as you can test your Python or R code more easily, and once your development stage is over, you can export them as Python or R scripts.
 
 ## Terminal Access
 
 Another option is the built-in terminal and editor of Jupyterhub.
-You can open nearly every text file with the editor provided by Jupyterhub. This works like a normal desktop environment by either clicking on the file or by creating a new one.
+You can open nearly every text file with the editor provided by Jupyterhub. This works like a typical desktop environment: either click on an existing file or create a new one.
 Opening a terminal works with New button and then the terminal button.
 Afterward, you can navigate to your script and run it with the command line.
+By default, the terminal starts with a plain shell; typing `bash` switches to bash.
 
 # Advises
 
-## Tensorflow 2.0.0 alpha
+## Virtual Environment
 
-If you are using Tensorflow or a framework, which runs Tensorflow in the background, it is advised to use:
-```
-import tensorflow as tf
-from tensorflow.keras import backend as K
+The server is equipped with many Python frameworks and libraries. If you need additional libraries, create a virtual environment, install them there, and start your Jupyter notebooks from that environment. Use the following snippet to create a new virtual environment:
+
+1. Open a terminal:
+    ```
+    ## CREATE A NEW VIRTUAL ENVIRONMENT ##
+    bash # start a bash shell
+    cd # navigate to the home directory
+    name='venv' # specify the name of your virtual environment
+    virtualenv --system-site-packages $name # create the virtual environment and include all already installed packages
+    source $name/bin/activate # activate the virtual environment
+    pip install --upgrade pip # upgrade pip
+    ipython kernel install --user --name=$name # install the new kernel in user space
+    sed -i -e 's|/usr/bin/python3|'${HOME}'/'$name'/bin/python|g' $HOME'/.local/share/jupyter/kernels/'$name'/kernel.json' # point the new kernel to the Python binary of the virtual environment
+    ```
+2. With the following snippet you can add new packages to the virtual environment:
+    ```
+    ## ADD PACKAGES TO A VIRTUAL ENVIRONMENT ##
+    bash
+    cd
+    name='venv'
+    source $name/bin/activate
+    pip install <package> # install the packages you need
+    ```
+
+3. Now you can create new Jupyter notebooks using the virtual environment from the *New* tab in the Jupyter home directory (a quick check of the kernel is sketched below).
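+
+To quickly verify from inside a notebook that the new kernel really runs in your virtual environment, you can print the interpreter path. This is only a minimal check; `venv` is the example name used above:
+
+```
+import sys
+print(sys.executable)  # should point to $HOME/venv/bin/python after the kernel.json adjustment above
+```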
+
+## GPU Status
+
+You can see the available GPUs as well as their available VRAM by typing `nvidia-smi` in a terminal.
+
+## GPU Usage
+
+Please make sure to use **only one** GPU at a time. After checking for an available GPU, you can select it inside your Jupyter notebooks with the following snippet:
 
-config = tf.compat.v1.ConfigProto()
-config.gpu_options.allow_growth = True
-K.set_session(tf.compat.v1.Session(config=config))
 ```
-To limit the amount of GPU memory Tensorflow uses.
-Else Tensorflow will use all available GPU memory, which blocks other users and leads to your process being killed.
+import os
+os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # order GPUs by PCI bus ID (the same ordering as nvidia-smi)
+os.environ["CUDA_VISIBLE_DEVICES"]="0" # the ID of the GPU to use (check nvidia-smi)
+```
+
+**When you are done using the GPU, please explicitly stop your Jupyter notebooks, since otherwise the occupied VRAM might not be freed.**
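+
+For example, with PyTorch (which is also installed on the server) you can verify that only the selected GPU is visible. This is a minimal sketch; set the environment variables before the first CUDA call:
+
+```
+import os
+os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # same ordering as nvidia-smi
+os.environ["CUDA_VISIBLE_DEVICES"] = "0"        # a free GPU ID taken from nvidia-smi
+
+import torch
+print(torch.cuda.is_available())      # True if the selected GPU can be used
+print(torch.cuda.device_count())      # 1, because only one GPU is exposed
+print(torch.cuda.get_device_name(0))  # name of the selected GPU
+```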
 
 ## Tensorflow
 
-If you are using Tensorflow or a framework, which runs Tensorflow in the background, it is advised to use:
+If you are using Tensorflow or a framework that runs Tensorflow in the background (e.g., Keras), it is advised to use:
 ```
 import tensorflow as tf
 config = tf.ConfigProto()
 config.gpu_options.allow_growth = True
 session = tf.Session(config=config, ...)
 ```
-or
+or, with Keras explicitly:
 ```
 from keras import backend as K
 config = K.tf.ConfigProto()
 config.gpu_options.allow_growth = True
 K.set_session(K.tf.Session(config=config))
 ```
+
 To limit the amount of GPU memory Tensorflow uses.
 Else Tensorflow will use all available GPU memory, which blocks other users and leads to your process being killed.
 
 # Don'ts
 
-- Do not use more space than you need. This means if your data is too large, use the database server to store it. There are enough options to use Python and a database server efficiently.
-- Do not block all of the available space of the different GPUs. Either block one GPU completely or try to use the allow growth option of tensorflow.
+- Do not use more space than you need.
+This means if your data is too large, use the database server to store it.
+There are enough options to use Python and a database server efficiently.
+- Do not block all of the memory of the different GPUs.
+Either occupy one GPU entirely or use the allow-growth option of Tensorflow.
\ No newline at end of file