Replies: 1 comment
This could certainly be interesting, thanks for bringing this up. Implementing this is not completely trivial, as we have to think about some issues like how to avoid copies, how to correctly save and load, and how to deal with models where the sizes of the weights would differ. Those are some of the challenges we ran into when implementing VeRA (see #1039), which also uses tied LoRA weights. In fact, at a very quick glance, the paper you cite could be considered a special case/variation of VeRA. It would probably make the most sense to figure out the VeRA PR first; then we can add this method either using a similar approach as VeRA, or as an optional argument for VeRA.
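To make the copy and save/load concerns a bit more concrete, here is a tiny standalone PyTorch sketch (an illustration only, not how PEFT or the VeRA PR handles it): if the same `nn.Parameter` objects are registered in several layers, the weights are tied without extra memory, but each layer still exposes them under its own state_dict key, so saving and loading has to preserve the tying rather than materialize independent copies.

```python
import torch
import torch.nn as nn


class LoraLayer(nn.Module):
    def __init__(self, lora_A: nn.Parameter, lora_B: nn.Parameter):
        super().__init__()
        # Assigning the *same* Parameter objects registers them in every layer,
        # so the weights are genuinely tied; deep-copying them per layer would
        # silently untie them.
        self.lora_A = lora_A
        self.lora_B = lora_B


shared_A = nn.Parameter(torch.randn(8, 32))
shared_B = nn.Parameter(torch.zeros(32, 8))
model = nn.ModuleList(LoraLayer(shared_A, shared_B) for _ in range(3))

# The same two tensors appear under six state_dict keys, which is what makes
# naive saving/loading tricky:
print(list(model.state_dict().keys()))
# ['0.lora_A', '0.lora_B', '1.lora_A', '1.lora_B', '2.lora_A', '2.lora_B']
print(len({p.data_ptr() for p in model.parameters()}))  # 2 -> really tied
```

Handling layers whose weight shapes differ (e.g. q_proj vs. k_proj under grouped-query attention) would presumably mean tying only within groups of modules that share a shape, which is one of the open questions mentioned above.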
NVIDIA proposed Tied-LoRA (https://arxiv.org/abs/2311.09578). The idea is to share the A, B matrices of the QKV LoRA across the layers, with the option to freeze some of them (and they add u, v vectors to process the data going into and out of A, B). Any thoughts on this approach, and maybe even on a generalization of it? E.g. splitting the layers into groups, so that, say, the first half of the layers gets one A, B pair and the second half gets its own parameters. If u, v are frozen and the number of groups equals the number of layers, we recover the current LoRA.
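For reference, here is a minimal self-contained PyTorch sketch of the tied setup described above: one shared A, B pair registered in every wrapped layer, plus per-layer scaling vectors u, v. The placement of u and v (after A and after B, mirroring VeRA) and all names here are assumptions for illustration, not the paper's reference implementation; with `train_uv=False` and a separate A, B per layer it collapses to plain LoRA, as noted above.

```python
import torch
import torch.nn as nn


class TiedLoRALinear(nn.Module):
    """Wraps a frozen base linear layer; the low-rank pair (A, B) is shared."""

    def __init__(self, base: nn.Linear, shared_A: nn.Parameter,
                 shared_B: nn.Parameter, train_uv: bool = True):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        # Shared across layers: the same Parameter objects are registered here.
        self.A = shared_A            # shape (r, in_features)
        self.B = shared_B            # shape (out_features, r)
        r = shared_A.shape[0]
        # Per-layer scaling vectors; freezing them (train_uv=False) and giving
        # each layer its own A, B would reduce this to plain LoRA.
        self.u = nn.Parameter(torch.ones(r), requires_grad=train_uv)
        self.v = nn.Parameter(torch.ones(base.out_features), requires_grad=train_uv)

    def forward(self, x):
        delta = (x @ self.A.T) * self.u      # (batch, r), scaled by u
        delta = (delta @ self.B.T) * self.v  # (batch, out), scaled by v
        return self.base(x) + delta


# Toy "model" of 4 linear layers that all tie the same A, B pair.
d, r = 16, 4
shared_A = nn.Parameter(torch.randn(r, d) * 0.01)
shared_B = nn.Parameter(torch.zeros(d, r))  # zero init, so delta starts at 0
layers = nn.ModuleList(
    TiedLoRALinear(nn.Linear(d, d), shared_A, shared_B) for _ in range(4)
)

x = torch.randn(2, d)
for layer in layers:
    x = layer(x)
print(x.shape)  # torch.Size([2, 16])
```

The layer-group generalization from the question would then just mean creating one (shared_A, shared_B) pair per group and passing the corresponding pair to each layer.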