Add instructions for running vLLM backend #8

Merged
merged 52 commits into from
Oct 18, 2023

dyastremsky
Contributor

Draft documentation to allow users to quickly use the vLLM backend to run their models.

@dyastremsky self-assigned this Oct 10, 2023
@dyastremsky changed the title from "Draft README and samples" to "Draft README and samples for vLLM backend" Oct 10, 2023
@dyastremsky changed the title from "Draft README and samples for vLLM backend" to "Add instructions for running vLLM backend" Oct 10, 2023
Comment on lines +89 to +92
```
mkdir -p /opt/tritonserver/backends/vllm
wget -P /opt/tritonserver/backends/vllm https://raw.githubusercontent.com/triton-inference-server/vllm_backend/main/src/model.py
```
Contributor

@rmccorm4 rmccorm4 Oct 17, 2023

Not an action item here, just some food for thought that could be nice for both users and developers. If we standardize on a git repository structure for Python-based backends, we could do something like:

git clone https://github.com/triton-inference-server/vllm_backend.git /opt/tritonserver/backends
  1. Single command
  2. Developers could iterate on the backend directly in the git repo and just reload Triton without copying files or builds around (developer experience)
  3. More support for multi-file implementations. The wget is nice, but won't scale past a single file. Ex: Imagine model.py implements TritonPythonModel but imports implementation.py, which has all the gory details for certain features.

Just some random Tuesday ideas in my head. Core would just be updated to also look for src/model.py or whatever standard we set instead of just model.py.
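
The lookup change floated above could be sketched roughly as follows. This is a hypothetical illustration only: `resolve_backend_model` is an invented helper, not part of Triton core, and the fallback order (top-level `model.py` first, then `src/model.py`) is an assumption. The demo runs against a temporary directory rather than `/opt/tritonserver/backends`.

```shell
#!/bin/sh
# Hypothetical sketch: resolve a Python-based backend's entry point,
# preferring the current layout (model.py) and falling back to the
# proposed repository layout (src/model.py).
resolve_backend_model() {
    backend_dir="$1"
    for candidate in "$backend_dir/model.py" "$backend_dir/src/model.py"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Neither layout was found.
    return 1
}

# Demo in a throwaway directory that mimics a cloned backend repository.
backends_root="$(mktemp -d)"
mkdir -p "$backends_root/vllm/src"
touch "$backends_root/vllm/src/model.py"

resolved="$(resolve_backend_model "$backends_root/vllm")"
echo "resolved: $resolved"
```

Checking the top-level `model.py` first keeps the existing wget-based install working unchanged, while a cloned repository would be picked up through the fallback.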

Collaborator

This will not work with git clone as-is, since the required model.py is in a subdirectory of vllm_backend, and the clone would pull in the tests as well.

We can discuss the best solution at some point.

Collaborator

For the ease of development, I think your earlier idea of symlinks makes more sense.
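
A minimal sketch of the symlink approach, assuming a local checkout of the repository. The paths are temporary stand-ins (created with `mktemp`) for a real checkout and for `/opt/tritonserver/backends/vllm`; only the `src/model.py` location comes from the discussion above.

```shell
#!/bin/sh
set -eu

# Stand-ins for a local checkout and the Triton backends directory.
# In a real setup these would be your clone of vllm_backend and
# /opt/tritonserver/backends (assumption for illustration).
checkout="$(mktemp -d)/vllm_backend"
backends="$(mktemp -d)/backends"

mkdir -p "$checkout/src" "$backends/vllm"
echo "# placeholder for the backend implementation" > "$checkout/src/model.py"

# Link the backend entry point to the file in the checkout, so edits made
# in the repository are what the server sees after a reload, with no copy step.
ln -sf "$checkout/src/model.py" "$backends/vllm/model.py"

# Show where the link points.
ls -l "$backends/vllm/model.py"
```

Linking only `model.py` avoids placing the rest of the repository (tests included) under the backends directory.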

Contributor

> This will not work with git clone as-is, since the required model.py is in a subdirectory of vllm_backend, and the clone would pull in the tests as well.

I know it won't work as-is and would require minor changes. Not necessarily asking for this feature at this time, just food for thought.

Contributor

We have a separate goal of improving the Python backend developer experience (more for things like debugging, ipdb, etc.) somewhere in the pipeline, so this came to mind as a tangential idea.

Collaborator

I see. By any chance, do you know which ticket tracks this? If you don't remember, no worries.

dyastremsky and others added 3 commits October 17, 2023 13:07
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
@tanmayv25
Contributor

LGTM besides a minor suggestion. Great work, @dyastremsky!

Co-authored-by: Tanmay Verma <tanmay2592@gmail.com>
dyastremsky and others added 4 commits October 18, 2023 07:54
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
@oandreeva-nv
Collaborator

Amazing work on this, @dyastremsky!

tanmayv25 previously approved these changes Oct 18, 2023
@dyastremsky dyastremsky merged commit 912896b into main Oct 18, 2023
3 checks passed
dyastremsky added a commit that referenced this pull request Oct 18, 2023
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Tanmay Verma <tanmay2592@gmail.com>
@dyastremsky dyastremsky deleted the dyas-README branch October 18, 2023 23:23
5 participants