Add instructions
cthiriet committed Apr 19, 2024
1 parent 3a404fa commit 0287678
Showing 1 changed file with 30 additions and 3 deletions.
33 changes: 30 additions & 3 deletions README.md
@@ -2,19 +2,46 @@

This repo is a fork of the [vLLM](https://github.com/vllm-project/vllm) repo.

## Usage

Pull the `latest` image from ECR:

```bash
bash docker/pull.sh vllm:latest
```

Run the container (here with the Command-R model):

```bash
docker run --runtime nvidia --gpus all \
-v ~/.cache/huggingface:/root/.cache/huggingface \
-p 8000:8000 \
--ipc=host \
-e SERVED_MODEL_NAME=command-r \
-e TRUST_REMOTE_CODE=false \
-e MODEL=CohereForAI/c4ai-command-r-v01 \
vllm \
--tensor-parallel-size 4 \
--host 0.0.0.0
```
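
Once the container is up, a quick request can confirm that the model is being served. The sketch below assumes the image's entrypoint starts vLLM's OpenAI-compatible API server, which this README does not state explicitly, and that it is reachable on the port published above. The helper names `build_payload` and `query` are hypothetical and exist only for this example.

```bash
# Assumption: the container exposes vLLM's OpenAI-compatible HTTP API.
BASE_URL="${BASE_URL:-http://localhost:8000}"   # matches -p 8000:8000 above
MODEL_NAME="${MODEL_NAME:-command-r}"           # matches SERVED_MODEL_NAME above

# Build the JSON body for a completion request.
# $1 = prompt (not JSON-escaped, so avoid quotes and backslashes), $2 = max tokens
build_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %d}' "$MODEL_NAME" "$1" "$2"
}

# Send the request to the completions endpoint (the container must be running).
# $1 = prompt, $2 = max tokens (defaults to 32)
query() {
  curl -s "$BASE_URL/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "${2:-32}")"
}
```

For example, `query "Hello, my name is"` should return a JSON response if the server started correctly. With a model of this size, the first request may only succeed after the weights have finished loading.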

## Development

### Build the image

```bash
sh docker/build.sh
```

### Deploy the image to ECR

Once your changes are ready, you can deploy the image to ECR:

```bash
sh docker/deploy.sh
```

### Upgrade version

You can upgrade the version of vLLM by rebasing on the official repo:
