From dfa8a7414787699e33e821ba857bcc4eea0d28cf Mon Sep 17 00:00:00 2001
From: gaardhus
Date: Mon, 25 Nov 2024 15:01:26 +0100
Subject: [PATCH] update placeholders

---
 docs/server/guides/llms.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/server/guides/llms.md b/docs/server/guides/llms.md
index 4725e55..275d2c5 100644
--- a/docs/server/guides/llms.md
+++ b/docs/server/guides/llms.md
@@ -14,20 +14,20 @@ uv add vllm setuptools
 ## Inference server
 
 On the server, in an active Slurm session, run the following command to start
-the inference server:
+the inference server with the specified model from huggingface:
 
 ```bash
 vllm serve "allenai/OLMo-7B-0724-Instruct-hf" \ #(1)!
     --host=10.84.10.216 \ #(2)!
     --port=8880 \ #(3)!
-    --download-dir=/projects/ainterviewer-AUDIT/data/.cache/huggingface \ #(4)!
+    --download-dir=/projects/<project>/data/.cache/huggingface \ #(4)!
     --dtype=half #(5)!
 ```
 
 1. The model name from huggingface
 2. The ip address of the slurm gpu server
 3. The port of the slurm gpu server
-4. Local cache dir for models
+4. Local cache dir for models; remember to substitute `<project>` with a specific project, e.g. `ainterviewer-AUDIT`
 5. For some models, this is needed since the GPUs on the server are a bit old
 
 !!! tip
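Once a `vllm serve` process like the one patched above is running, it exposes an OpenAI-compatible HTTP API. The sketch below shows how a client request to that API could be constructed, using the host, port, and model name from the patch; the helper name `build_completion_request` is illustrative, not part of the documented setup, and the actual send is left commented out since it needs a reachable server.

```python
import json
from urllib import request

# Host, port, and model taken from the `vllm serve` command in the patch above.
BASE_URL = "http://10.84.10.216:8880"
MODEL = "allenai/OLMo-7B-0724-Instruct-hf"


def build_completion_request(prompt: str, max_tokens: int = 64) -> request.Request:
    """Build (but do not send) a POST to vLLM's OpenAI-compatible /v1/completions."""
    payload = json.dumps({
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_completion_request("Hello, OLMo!")
# Sending requires the Slurm session and server to be up, e.g.:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
print(req.full_url)
```

This only depends on the standard library; in practice the `openai` client package pointed at the same base URL works as well.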