Lora rank & alpha #2037
BigDataMLexplorer asked this question in Q&A
-
When using rslora, the LoRA output will be scaled by a factor of lora_alpha / sqrt(r) instead of lora_alpha / r.
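For the numbers discussed below (rank 16, alpha 8), here is a quick sketch of the difference, assuming the usual LoRA scaling of alpha / r versus the rsLoRA scaling of alpha / sqrt(r):

```python
import math

r, alpha = 16, 8

standard_scale = alpha / r            # 8 / 16 = 0.5
rslora_scale = alpha / math.sqrt(r)   # 8 / 4  = 2.0

print(f"standard LoRA scaling: {standard_scale}")
print(f"rsLoRA scaling:        {rslora_scale}")
```

So enabling use_rslora while keeping alpha = 8 at rank 16 raises the effective scaling from 0.5 to 2.0; to reproduce the scaling you already tuned, alpha would have to drop to 2. In other words, an alpha found without rslora does not carry over unchanged.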
-
Hi, I'm training the Llama3 8B model. I ran many trials with LoRA rank = 16 and different alphas (32, 16 and 8). In my case the best result was with alpha 8. I did not use rslora in this testing.
Even assuming I have the best alpha value, can it still help to set use_rslora=True in this configuration? If I have alpha set to 8, what alpha will effectively be used with rslora? I didn't quite get it from the Hugging Face article.
In general, I read that a higher rank in LoRA should capture more nuances, because more parameters are trained.
That's why I also tried increasing the LoRA rank to 256 and leaving alpha at half of that (128), of course with a proper learning rate, otherwise the results would be very bad. I used use_rslora=True here. The result was 1 percentage point worse than rank 16, and it was also worse than rank 16 when I didn't use rslora.
Do you think I may have already reached the optimum or should I do something different when using rslora?
Thank you
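A minimal sketch of such a configuration with peft's LoraConfig; the target_modules, dropout, and task_type values below are placeholders, not your actual settings:

```python
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=8,        # effective scale with rsLoRA: 8 / sqrt(16) = 2.0
    use_rslora=True,     # scale adapters by alpha / sqrt(r) instead of alpha / r
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder list
    lora_dropout=0.05,   # placeholder value
    task_type="CAUSAL_LM",
)
```

Applying the same formula to the rank-256 run: with alpha 128, rslora gives an effective scale of 128 / sqrt(256) = 8, while plain LoRA gives 128 / 256 = 0.5, so those runs were trained at quite different effective adapter scales.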