Why was the LoRA option removed from exllamav2 #1267
psych0v0yager started this conversation in Feature requests
Replies: 1 comment
-
I don't remember why this was changed, so this might have been a mistake. We are happy to support these changes!
-
I previously added the ability to swap LoRAs on exllamav2, but it is not present in the current version. What was the reason for this change, and are there improvements I could make to add LoRA support back to exllamav2?
EDIT: I noticed exllama refactored their LoRA system. exllamav2 now uses `generator.set_loras(lora)` to attach a LoRA, rather than the old `.generate_simple(prompt_, settings, max_new_tokens, loras=lora_)` call. I can add this feature back if Outlines is willing to support it. One caveat: LoRAs cannot be swapped while a dynamic job is in flight. However, if no jobs are running, it should be a fairly simple swap.
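The "no swap while a job is running" caveat could be enforced on the Outlines side with a small guard before delegating to the generator. This is a hypothetical sketch of that logic only — `active_jobs`, the wrapper class, and `LoRASwapError` are invented names for illustration, not exllamav2's or Outlines' actual API:

```python
class LoRASwapError(RuntimeError):
    """Raised when a LoRA swap is attempted while jobs are in flight."""


class GuardedGenerator:
    """Hypothetical wrapper: refuses LoRA swaps while dynamic jobs run.

    In a real integration, `set_loras` would forward to the underlying
    exllamav2 generator; here it just records the adapters.
    """

    def __init__(self):
        self.active_jobs = 0   # incremented/decremented by the job queue
        self.loras = None

    def set_loras(self, loras):
        if self.active_jobs > 0:
            raise LoRASwapError(
                "cannot swap LoRAs while dynamic jobs are running"
            )
        self.loras = loras


gen = GuardedGenerator()
gen.set_loras(["my-adapter"])   # fine: no jobs in flight
gen.active_jobs = 1
try:
    gen.set_loras(["other-adapter"])   # rejected mid-generation
except LoRASwapError:
    gen.active_jobs = 0                # wait for jobs to drain, then retry
    gen.set_loras(["other-adapter"])
```

The point of the guard is to turn a silent mid-job swap (which could corrupt in-flight generations) into an explicit error the caller can handle by draining the queue first.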
EDIT 2: Exllamav2 also has a new Q6 cache. I can add support for this as well.
Please let me know if you approve of these changes.