Feature add openai api for vllm integration (#3287)
* Forward additional url segments as url_paths in request header to model
* Fix vllm test and clean preproc
* First attempt to enable OpenAI api for models served via vllm
* fix streaming in openai api
* Add OpenAIServingCompletion usage example
* Add lora modules to vllm engine
* Finish openai completion integration; removed req openai client; updated lora example to llama 3.1
* fix lint
* Update mistral + llama3 vllm example
* Remove openai client from url path test
* Add openai chat api to vllm example
* Added v1/models endpoint for vllm example
* Remove accidental breakpoint()
* Add comment to new url_path
Showing 16 changed files with 633 additions and 149 deletions.
@@ -0,0 +1,11 @@
{
  "model": "llama3-8b",
  "messages":[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"}
  ],
  "temperature":0.0,
  "max_tokens": 50
}
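A request body like the one above can be posted to an OpenAI-style chat endpoint. The sketch below is a minimal illustration using only the Python standard library. The host, port, and URL path in `CHAT_URL` are assumptions rather than values taken from this diff, and `build_chat_request` and `send` are hypothetical helper names; adjust them to the actual deployment.

```python
import json
import urllib.request

# Assumed endpoint: host, port, registered model name and trailing path are placeholders.
CHAT_URL = "http://localhost:8080/predictions/llama3-8b/1.0/v1/chat/completions"


def build_chat_request(url, model, messages, temperature=0.0, max_tokens=50):
    """Build (but do not send) a POST request carrying an OpenAI-style chat body."""
    body = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON reply; needs a running server."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same conversation as the example file in this commit.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"},
]

request = build_chat_request(CHAT_URL, "llama3-8b", messages)
print(request.get_method(), request.full_url)
```

Calling `send(request)` against a running server would return the decoded JSON reply; the call is left out here so the sketch runs without a server.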
@@ -1,9 +1,7 @@
{
  "prompt": "A robot may not injure a human being",
  "max_new_tokens": 50,
  "temperature": 0.8,
  "logprobs": 1,
  "prompt_logprobs": 1,
  "max_tokens": 128,
  "adapter": "adapter_1"
  "model": "llama3-8b"
}
@@ -1,9 +1,7 @@
{
  "model": "adapter_1",
  "prompt": "A robot may not injure a human being",
  "max_new_tokens": 50,
  "temperature": 0.8,
  "temperature": 0.0,
  "logprobs": 1,
  "prompt_logprobs": 1,
  "max_tokens": 128,
  "adapter": "adapter_1"
  "max_tokens": 128
}
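In this hunk the adapter name appears in the `model` field, which matches how OpenAI-style completion requests select what to run. A minimal sketch of building such a body for either the base model or a LoRA adapter follows. The helper name `completion_payload` and the chosen set of sampling keys are illustrative assumptions; `llama3-8b` and `adapter_1` are the names used in this commit's example files, and whether they resolve depends on how the server is configured.

```python
import json


def completion_payload(prompt, model, temperature=0.0, logprobs=1, max_tokens=128):
    """Build an OpenAI-style completion body.

    `model` may name the base model or a LoRA adapter (hypothetical helper,
    not part of this commit).
    """
    return {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "logprobs": logprobs,
        "max_tokens": max_tokens,
    }


PROMPT = "A robot may not injure a human being"

# Base model request and adapter request differ only in the "model" value.
base_body = completion_payload(PROMPT, model="llama3-8b", temperature=0.8)
lora_body = completion_payload(PROMPT, model="adapter_1")

print(json.dumps(lora_body, indent=2))
```

Keeping the adapter choice in `model` means the same client code can switch between the base model and an adapter without a separate request field.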
@@ -12,3 +12,5 @@ handler:
max_model_len: 250
max_num_seqs: 16
tensor_parallel_size: 4
served_model_name:
- "mistral"
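The `served_model_name` entry is the name clients would put in the `model` field of a request. Since the commit message also mentions a `v1/models` endpoint, one way to guard against a mismatch is to check the requested name against that listing before sending prompts. The sketch below assumes the listing follows the OpenAI "list models" shape (`{"object": "list", "data": [{"id": ...}]}`); `sample_response` is a hand-written placeholder, not output captured from a server, and both function names are hypothetical.

```python
def served_model_ids(models_response):
    """Return the model ids from an OpenAI-style model listing."""
    return [entry["id"] for entry in models_response.get("data", [])]


def check_model(requested, models_response):
    """Return `requested` if it is listed, otherwise raise ValueError."""
    ids = served_model_ids(models_response)
    if requested not in ids:
        raise ValueError(f"model {requested!r} is not served; available: {ids}")
    return requested


# Hand-written placeholder in the assumed listing shape.
sample_response = {"object": "list", "data": [{"id": "mistral", "object": "model"}]}

print(check_model("mistral", sample_response))
```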
@@ -1,8 +1,7 @@
{
  "model": "mistral",
  "prompt": "A robot may not injure a human being",
  "max_new_tokens": 50,
  "temperature": 0.8,
  "logprobs": 1,
  "prompt_logprobs": 1,
  "max_tokens": 128
}