Mask based BGMV implementation #223

Merged: 9 commits, Sep 5, 2024

@hlahkar commented Aug 30, 2024

Refactors the BGMV implementation from gather-based to mask-based to optimize performance and reduce device memory usage.
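
The description does not spell out the two approaches, so the following is only a minimal sketch of what "gather-based" versus "mask-based" could mean for a batched LoRA matrix-vector product. It assumes each adapter has an A matrix (hidden x rank) and a B matrix (rank x out), and that each token carries one adapter index. Plain Python lists stand in for device tensors, and every name below (`matvec`, `bgmv_gather`, `bgmv_mask`, `lora_a`, `lora_b`) is hypothetical rather than taken from `vllm/hpu/ops.py`.

```python
# Illustrative sketch only: lists stand in for tensors, names are hypothetical.

def matvec(x, w):
    """Multiply row vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x)))
            for j in range(len(w[0]))]

def bgmv_gather(xs, lora_a, lora_b, indices):
    """Gather-based: select each token's adapter weights, then multiply."""
    out = []
    for x, idx in zip(xs, indices):
        a, b = lora_a[idx], lora_b[idx]  # per-token gather of A and B
        out.append(matvec(matvec(x, a), b))
    return out

def bgmv_mask(xs, lora_a, lora_b, indices):
    """Mask-based: multiply by all adapters at once, zero unselected ranks."""
    num_loras = len(lora_a)
    hidden = len(lora_a[0])
    rank = len(lora_a[0][0])
    # Concatenate every adapter's A along the rank axis:
    # shape hidden x (num_loras * rank).
    a_all = [[lora_a[l][i][r] for l in range(num_loras) for r in range(rank)]
             for i in range(hidden)]
    # Stack every adapter's B along the rank axis:
    # shape (num_loras * rank) x out.
    b_all = [row for l in range(num_loras) for row in lora_b[l]]
    out = []
    for x, idx in zip(xs, indices):
        # 1.0 on the rank slots of the token's adapter, 0.0 elsewhere.
        mask = [1.0 if l == idx else 0.0
                for l in range(num_loras) for _ in range(rank)]
        h = matvec(x, a_all)                   # project through all adapters
        h = [v * m for v, m in zip(h, mask)]   # keep only the selected one
        out.append(matvec(h, b_all))
    return out

# Two adapters, hidden size 2, rank 1, output size 2.
lora_a = [[[1], [2]], [[3], [4]]]
lora_b = [[[1, 0]], [[0, 1]]]
xs = [[1, 1], [1, 2]]
indices = [0, 1]

gathered = bgmv_gather(xs, lora_a, lora_b, indices)
masked = bgmv_mask(xs, lora_a, lora_b, indices)
```

In this sketch both functions produce the same result for each token. The masked variant trades extra multiply-adds for two dense products over shared weights, with no per-token weight selection. Whether that matches the trade-off made in this PR is an assumption.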

Review threads (all resolved):
vllm/hpu/ops.py (outdated)
vllm/hpu/ops.py
vllm/worker/habana_model_runner.py (outdated)
vllm/worker/habana_model_runner.py (outdated)
@hlahkar force-pushed the dev/hlahkar/bgmv_poc branch 2 times, most recently from de3f36c to a8f1d7d, on September 2, 2024 10:43
@vivekgoe merged commit 05acb89 into habana_main on Sep 5, 2024
13 checks passed