From f0da0a8201c04983c20235d1f180fcf8eab9a05e Mon Sep 17 00:00:00 2001
From: Ruiyi Wang
Date: Thu, 26 Oct 2023 15:15:00 -0400
Subject: [PATCH] Add tutorial for ssh tunnel

---
 llm_deploy/README.md | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/llm_deploy/README.md b/llm_deploy/README.md
index bae71816..8e2a1b99 100644
--- a/llm_deploy/README.md
+++ b/llm_deploy/README.md
@@ -7,7 +7,6 @@ Go to the vllm dir and pip install -e .
 To notice https://github.com/vllm-project/vllm/issues/1283, need to modify the config file to "== 2.0.1" and the pytorch version if facing with CUDA version error.
 
-
 ## Deploy finetuned model on babel via vLLM
 ### Login with SSH key
 1. Add public ed25519 key to server
 
@@ -98,7 +97,26 @@ curl http://localhost:8000/v1/completions \
 ```
 
 ### Access deployed babel server on a local machine
-TODO
+1. Construct an SSH tunnel between the babel login node and the babel compute node hosting the model
+```bash
+ssh -N -L 7662:localhost:8000 username@babel-x-xx
+```
+The above command creates a localhost:7662 server on the babel login node which connects to localhost:8000 on the compute node.
+
+2. Construct an SSH tunnel between your local machine and the babel login node
+```bash
+ssh -N -L 8001:localhost:7662 username@<babel-login-node>
+```
+The above command creates a localhost:8001 server on your local machine which connects to localhost:7662 on the babel login node.
+
+3. Call the hosted model from your local machine
+```bash
+curl http://localhost:8001/v1/models
+```
+If the above command runs successfully, you should be able to use the REST API from your local machine.
+
+4. (optional) If building the SSH tunnel fails, add `-v` to the ssh command to see what went wrong.
+
 
 
 ### Userful resource links for babel
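Steps 1 and 2 above can also be collapsed into a single command using OpenSSH's `-J` (ProxyJump) option, which forwards a local port straight to the compute node through the login node. A minimal sketch, assuming OpenSSH 7.3 or newer, with `<babel-login-node>` and `babel-x-xx` as placeholders for your actual hosts:

```bash
# Forward local port 8001 directly to port 8000 on the compute node,
# jumping through the login node, so no intermediate port is needed.
# <babel-login-node> and babel-x-xx are placeholders, not real hostnames.
ssh -N -J username@<babel-login-node> -L 8001:localhost:8000 username@babel-x-xx
```

With this variant the intermediate port 7662 on the login node is no longer needed.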
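If you bring the tunnel up regularly, the same settings can live in `~/.ssh/config` so a single short command starts it. The entry below is an illustrative sketch; the alias `babel-tunnel` and all host names are assumptions to adapt to your own setup:

```bash
# Append an example tunnel alias to ~/.ssh/config.
# All names below are placeholders; substitute your real user and hosts.
cat >> ~/.ssh/config <<'EOF'
Host babel-tunnel
    HostName babel-x-xx
    User username
    ProxyJump username@<babel-login-node>
    LocalForward 8001 localhost:8000
EOF

# Start the tunnel with the alias:
ssh -N babel-tunnel
```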
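Once `curl http://localhost:8001/v1/models` succeeds, the REST calls shown earlier in this README work through the tunnel as well; only the port changes from 8000 to 8001. For example, a completion request (the payload is illustrative; replace `your-model-name` with the model id that `/v1/models` returns):

```bash
# Same style of completion call as on the compute node, routed
# through the tunnel. "your-model-name" is a placeholder.
curl http://localhost:8001/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "your-model-name",
        "prompt": "San Francisco is a",
        "max_tokens": 16,
        "temperature": 0
    }'
```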