LoRA mask based on LoRA index #348

Merged (2 commits) on Oct 3, 2024

Conversation

@hlahkar commented on Sep 30, 2024:

Changes the filling of the LoRA mask from lora_id to lora_index. This is needed to ensure that filling the mask does not fail when a LoRA id is greater than max_loras.
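
For context, a minimal sketch (hypothetical names and shapes, not the actual vLLM code) of why filling the mask by the global lora_id can fail while the per-batch lora_index cannot:

```python
import torch

max_loras, max_lora_rank, batch_size = 2, 8, 4
# The mask only has max_loras slots (2 * 8 = 16 columns here).
lora_mask = torch.zeros(batch_size, max_loras * max_lora_rank)
ones = torch.ones(max_lora_rank)

lora_id = 5      # global adapter id; can exceed max_loras
lora_index = 1   # slot assigned to the adapter in this batch; always < max_loras

# Filling by lora_id would target columns 40:48 of a 16-column mask
# (an empty slice), and assigning `ones` there raises a shape mismatch.
start_pos = lora_index * max_lora_rank
lora_mask[0, start_pos:start_pos + max_lora_rank] = ones
```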

@michalkuligowski added the intel (Issues or PRs submitted by Intel) label on Sep 30, 2024
Review thread on vllm/worker/habana_model_runner.py:

```python
# Fill this sequence's slice of the mask, sized by the max LoRA rank,
# then move the mask to the HPU and reuse it as the logits mask.
end_pos = start_pos + self.lora_config.max_lora_rank
lora_mask[i, start_pos:end_pos] = ones
lora_mask = lora_mask.to('hpu')
lora_logits_mask = lora_mask
```


@michalkuligowski: Could you explain why the logits mask now points to the lora mask, when before it was left as None for the decode phase?
Is that because line 1906 can now reference an object that is None? If so, I would rather add a check there that lora_logits_mask is not None instead of changing the logic of what lora_logits_mask holds.

@hlahkar (Author) replied:

@michalkuligowski For the decode phase, lora_mask and logits_mask are the same; previously we were setting it in execute_model. We just moved it to create_lora_mask for cleaner code.
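
A minimal sketch of the refactor being described (the signature and shapes are hypothetical, not the actual habana_model_runner.py code), showing the logits mask aliasing the lora mask for decode so that execute_model no longer has to set it:

```python
import torch

def create_lora_mask(lora_indices, max_loras, max_lora_rank, is_prompt):
    # Hypothetical sketch: one mask row per sequence, one rank-sized
    # slot per active LoRA adapter.
    lora_mask = torch.zeros(len(lora_indices), max_loras * max_lora_rank)
    ones = torch.ones(max_lora_rank)
    for i, lora_index in enumerate(lora_indices):
        start_pos = lora_index * max_lora_rank
        lora_mask[i, start_pos:start_pos + max_lora_rank] = ones
    # In decode, each sequence produces exactly one token, so the token
    # mask and the logits mask coincide; prefill builds a separate
    # per-sequence logits mask (omitted here).
    lora_logits_mask = None if is_prompt else lora_mask
    return lora_mask, lora_logits_mask
```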

@hlahkar force-pushed the dev/hlahkar/lora_index_fix branch 7 times, most recently from cdee227 to c2572b9, on September 30, 2024 at 12:49
@vivekgoe left a comment:

Please also check that these changes do not break the long LoRA context test. We may need to port these changes to the long LoRA context branch to check that (maybe Sanju/Ruheena can do this check).

Review threads on vllm/worker/habana_model_runner.py: 4 total (2 resolved, 2 outdated and resolved).
@vivekgoe added the habana (Issues or PRs submitted by Habana Labs) label and removed the intel (Issues or PRs submitted by Intel) label on Oct 1, 2024
@vivekgoe left a comment:

Looks good to me.

@madamczykhabana merged commit da03d8b into habana_main on Oct 3, 2024; 19 checks passed.
@hlahkar deleted the dev/hlahkar/lora_index_fix branch on October 16, 2024.
Labels: habana (Issues or PRs submitted by Habana Labs)
Participants: 4