update vllm_hpu_extension commit to 24039a3 #490

ccrhx4 · 2024-11-13T02:04:10Z

Fixes part of #443.

Updates the vllm_hpu_extension commit ID to HabanaAI/vllm-hpu-extension@24039a3.

requirements-hpu.txt

madamczykhabana

A revert of HabanaAI/vllm-hpu-extension#26 is pending: HabanaAI/vllm-hpu-extension#31

michalkuligowski · 2024-11-14T10:14:07Z

This needs to be rejected due to: HabanaAI/vllm-hpu-extension#31

ccrhx4 · 2024-11-15T01:37:04Z

Inference accuracy is at risk of being compromised when softmax_mode == "fast".

From the Habana documentation (https://docs.habana.ai/en/latest/PyTorch/Model_Optimization_PyTorch/Optimization_in_PyTorch_Models.html#using-fused-sdpa):

Using fast Softmax may affect inference accuracy.
Only BF16 data type is supported with fast Softmax.
Fast Softmax is not supported when running training in recompute mode with is_causal = False.

Even in optimum-habana, softmax_mode defaults to "None": https://github.com/huggingface/optimum-habana/blob/f488ab66329b2c3a46063292b2822c1cab068756/optimum/habana/transformers/generation/configuration_utils.py#L38
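To make the constraints quoted above concrete, here is a minimal sketch of an accuracy-first default. `choose_softmax_mode` and its parameters are hypothetical names used only for illustration; they are not part of vllm-hpu-extension, optimum-habana, or the Habana API. The sketch assumes the mode is passed around as the strings "fast" and "None", as in the discussion above.

```python
# Hypothetical helper for illustration only; not an API of any Habana library.
def choose_softmax_mode(dtype: str, allow_fast: bool = False) -> str:
    """Return "fast" only on explicit opt-in with BF16 inputs, otherwise "None".

    Encodes two points from the quoted documentation:
    - fast Softmax may affect inference accuracy, so it is opt-in here;
    - only the BF16 data type is supported with fast Softmax.
    """
    if allow_fast and dtype == "bfloat16":
        return "fast"
    return "None"


if __name__ == "__main__":
    print(choose_softmax_mode("bfloat16"))                   # default: accuracy first
    print(choose_softmax_mode("bfloat16", allow_fast=True))  # explicit opt-in
    print(choose_softmax_mode("float32", allow_fast=True))   # unsupported dtype
```

With a default like this, users who want the performance gain have to ask for it explicitly, and everyone else gets the softmax behavior the model was trained with.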

Please reconsider this issue. I understand that performance is important, but the model needs to behave as it was trained to.

update vllm_hpu_extension commit: do not use softmax :fastmode for FusedSDPA

63ca241

ccrhx4 mentioned this pull request Nov 13, 2024

do not use softmax fast mode in FusedSDPA HabanaAI/vllm-hpu-extension#26

Merged

michalkuligowski requested changes Nov 13, 2024

requirements-hpu.txt (outdated)

fix the commit id of vllm_hpu_extension

aa62c0c

ccrhx4 requested a review from michalkuligowski November 14, 2024 00:59

michalkuligowski approved these changes Nov 14, 2024

madamczykhabana requested changes Nov 14, 2024

ccrhx4 requested a review from madamczykhabana November 15, 2024 01:39
