Deploying a New Model to NDIF

1. Clone NDIF and switch to `dev` branch

git clone https://github.com/ndif-team/ndif.git && cd ndif && git checkout dev

2. Install Miniconda

Download and install Miniconda by following the instructions on the official Miniconda page for Linux.

3. Create and activate the service environment

conda env create -f ../ndif/services/ray_head/environment.yml -n prod
conda activate prod

4. Install NNSight

To ensure that the correct version of NNSight is present, do the following:

pip uninstall -y nnsight
git clone https://github.com/ndif-team/nnsight.git
cd nnsight && git checkout 0.3 && cd ..
pip install -e nnsight

5. Install Ray Serve

Every node in the Ray cluster must run the same version of Ray. Here is what we have been using (the quotes keep shells such as zsh from interpreting the brackets):

pip install "ray[serve]==2.34"
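
Since a version mismatch usually shows up only when a worker tries to join the cluster, it can help to check the installed version on each node up front. The sketch below is one way to do that with Python's standard library. `EXPECTED_RAY` and the helper names are illustrative and not part of NDIF or Ray.

```python
from importlib import metadata

# Version pinned in this guide; change it if your cluster uses another one.
EXPECTED_RAY = "2.34"


def versions_match(installed, expected):
    """Compare only as many dotted components as `expected` specifies."""
    wanted = expected.split(".")
    return installed.split(".")[: len(wanted)] == wanted


def installed_ray_version():
    """Return the installed Ray version, or None if Ray is not installed."""
    try:
        return metadata.version("ray")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_ray_version()
    if found is None:
        print("ray is not installed in this environment")
    elif versions_match(found, EXPECTED_RAY):
        print(f"ray {found} matches the expected {EXPECTED_RAY}")
    else:
        print(f"ray {found} differs from the expected {EXPECTED_RAY}")
```

Run it inside the prod environment on every node, including the head node, and compare the printed versions.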

6. Create a HF Cache

Choose a location for your Hugging Face cache and create the directory if you don't already have one. HF_HOME in the next step must point to a directory, not a file:

mkdir -p .hf_config

7. Create `env.sh` Script

Create a script named env.sh in your working directory with the following content (make sure to modify your environment variables appropriately):

#!/bin/bash

huggingface-cli login --token hf-token

export PYTHONPATH=/path/to/ndif/services/ray_worker
export HF_HOME=/path/to/.hf_config
export RAY_ADDRESS=head-node-ip:6379
export NCCL_IB_DISABLE=1

Replace hf-token with your actual Hugging Face token.
Replace /path/to/ndif with the location of your NDIF clone.
Replace /path/to/.hf_config with the actual path to the Hugging Face cache you previously made.
Replace head-node-ip with the IP address of your Ray head node.
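
A forgotten placeholder in env.sh tends to fail late, when Ray or Hugging Face first reads the variable. The following is a minimal sketch of a sanity check you could run after sourcing the file. The `check_env` helper is hypothetical, and the host:port pattern simply mirrors the template above (Ray also accepts other address forms).

```python
import os
import re

# Variables exported by the env.sh template in this guide.
REQUIRED = ("PYTHONPATH", "HF_HOME", "RAY_ADDRESS")


def check_env(env):
    """Return a list of human-readable problems found in `env` (a mapping)."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    hf_home = env.get("HF_HOME")
    if hf_home and not os.path.isdir(hf_home):
        problems.append(f"HF_HOME is not a directory: {hf_home}")

    address = env.get("RAY_ADDRESS")
    # The template uses host:port, for example head-node-ip:6379.
    if address and not re.fullmatch(r"[^\s:]+:\d+", address):
        problems.append(f"RAY_ADDRESS is not in host:port form: {address}")

    return problems


if __name__ == "__main__":
    issues = check_env(os.environ)
    print("\n".join(issues) if issues else "environment looks complete")
```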

8. Source `env.sh`

Source the environment variables from env.sh:

source env.sh

9. Download the model weights

The easiest way to do this is to create a Python script which uses NNSight to load a model:

import nnsight

model = nnsight.LanguageModel('{model-checkpoint}', dispatch=True)

with model.trace('ayy') as tracer:
    out = tracer.output.save()

Save the code above as download.py, replace {model-checkpoint} with the actual Hugging Face checkpoint, and run python3 download.py. You can stop the script once the model weights have been downloaded.
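
If loading the model through NNSight is inconvenient, for example on a machine without enough memory to instantiate it, the weights can also be fetched with `huggingface_hub` directly. This is a substitute technique, not what the step above does. It is a minimal sketch assuming `huggingface_hub` is installed (it comes with `transformers`), that you are already logged in, and that the model is later loaded from the same default cache under `$HF_HOME/hub`. The names `hub_cache_dir` and `download` are illustrative.

```python
import os
import sys


def hub_cache_dir(env=os.environ):
    """Default download location: $HF_HOME/hub (HF_HUB_CACHE can override it)."""
    home = env.get("HF_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache", "huggingface"
    )
    return os.path.join(home, "hub")


def download(repo_id):
    """Fetch every file of the repository into the cache and return its path."""
    # Imported lazily so hub_cache_dir() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


# Usage: python3 prefetch.py <checkpoint>   (prefetch.py is a hypothetical name)
if __name__ == "__main__" and len(sys.argv) == 2:
    print("cache:", hub_cache_dir())
    print("snapshot:", download(sys.argv[1]))
```

Note that `snapshot_download` fetches all files in the repository, which can be more than the model loader would download on its own if the repository ships several weight formats.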

10. Create `start.sh` script

Create a script named start.sh in your working directory with the following content:

#!/bin/bash

HOSTNAME=$(hostname)

source env.sh

resources=$(python -m src.ray.resources --name "$HOSTNAME")

ray start --resources "$resources" --address "$RAY_ADDRESS" --block
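
The `--resources` flag of `ray start` expects a JSON dictionary mapping resource names to quantities. If the node fails to start, it may be worth checking that the string passed to it has that shape. The sketch below does only that. It does not reproduce what `src.ray.resources` prints, and `parse_resources` is a hypothetical helper.

```python
import json
import sys


def parse_resources(text):
    """Parse a --resources string and check it maps names to non-negative numbers."""
    data = json.loads(text)  # raises a ValueError subclass on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, amount in data.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            raise ValueError(f"quantity for {name!r} is not a number")
        if amount < 0:
            raise ValueError(f"quantity for {name!r} is negative")
    return data


# Usage: python3 check_resources.py '<json string>'   (hypothetical file name)
if __name__ == "__main__" and len(sys.argv) == 2:
    print(parse_resources(sys.argv[1]))
```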

11. Run the script

This will start the model deployment. Using tmux ensures that the deployment continues running in the background, even if your terminal session disconnects.

tmux
conda activate prod
bash start.sh
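
To confirm that the node actually joined the cluster, you can list the nodes Ray reports as alive from a second terminal (with env.sh sourced and the prod environment active). This is a sketch under the assumption that `RAY_ADDRESS` points to a reachable head node. `alive_hosts` is an illustrative helper, while `ray.init` and `ray.nodes` are Ray's own calls.

```python
def alive_hosts(nodes):
    """Map hostname -> resources for every node Ray reports as alive."""
    return {
        node["NodeManagerHostname"]: node.get("Resources", {})
        for node in nodes
        if node.get("Alive")
    }


def main():
    try:
        import ray  # imported lazily so alive_hosts() is usable without Ray
    except ImportError:
        print("ray is not installed in this environment")
        return
    try:
        # "auto" connects to an existing cluster instead of starting a new one.
        ray.init(address="auto")
    except ConnectionError as err:
        print(f"could not reach a running cluster: {err}")
        return
    for host, resources in sorted(alive_hosts(ray.nodes()).items()):
        print(host, resources)


if __name__ == "__main__":
    main()
```

The hostname of the new node should appear in the output together with the custom resources passed by start.sh.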
