lmi cpu container with vLLM #2009

Open
wants to merge 3 commits into master
Conversation

lanking520 (Contributor) commented on Jun 1, 2024

Description

Supports CPU container builds for vLLM-based LLM inference. Tested with Llama 3 8B; it worked, but was extremely slow.

engine=Python
option.rolling_batch=vllm
option.model_id=NousResearch/Hermes-2-Pro-Llama-3-8B
option.tensor_parallel_degree=1
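
For reference, a minimal sketch of how a CPU image built from this PR might be launched with the configuration above; the image tag and mount point are assumptions, not names defined by this PR:

# Launch the CPU container with the model directory mounted.
# "lmi-cpu" is a hypothetical tag; the mount point is an assumption.
docker run -it --rm -p 8080:8080 \
  -v /path/to/model/dir:/opt/ml/model \
  deepjavalibrary/djl-serving:lmi-cpu

where /path/to/model/dir contains the serving.properties shown above.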

@lanking520 requested review from zachgk, frankfliu, and a team as code owners on Jun 1, 2024
@lanking520 changed the title from "[WIP] lmi cpu container with vLLM" to "lmi cpu container with vLLM" on Jun 3, 2024
VLLM_TARGET_DEVICE=cpu python3 setup.py bdist_wheel


FROM base AS lmi-cpu
A contributor commented on this line:

I thought there could only be one FROM in each Dockerfile. I may be wrong, but I just want to check.
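
For context, Docker multi-stage builds allow multiple FROM statements in a single Dockerfile: each FROM starts a new stage, later stages can copy artifacts from earlier ones with COPY --from, and a specific stage can be selected at build time with --target. A minimal sketch with illustrative stage names and paths (not the exact Dockerfile in this PR):

# Base stage shared by the build and runtime stages.
FROM ubuntu:22.04 AS base
RUN apt-get update && apt-get install -y python3 python3-pip git build-essential

# Build stage: compile a CPU-only vLLM wheel.
FROM base AS build
RUN git clone https://github.com/vllm-project/vllm.git /vllm
WORKDIR /vllm
# VLLM_TARGET_DEVICE=cpu selects vLLM's CPU backend for the wheel build
# (CPU-specific build requirements are omitted here for brevity).
RUN VLLM_TARGET_DEVICE=cpu python3 setup.py bdist_wheel

# Runtime stage: install only the built wheel.
FROM base AS lmi-cpu
COPY --from=build /vllm/dist/*.whl /tmp/
RUN pip3 install /tmp/*.whl

Building with docker build --target lmi-cpu . produces only the lmi-cpu stage, so several FROM lines in one file are expected in this layout.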
