Environment info

adapters version: latest main

Information

Model I am using (Bert, XLNet ...): any
Language I am using the model on (English, Chinese ...): any
Adapter setup I am using (if any): affects all adapter methods that rely on ForwardContext: ReFT, Prefix-Tuning, Prompt Tuning, Fusion, Parallel composition

To reproduce

When gradient checkpointing is enabled before adapter training, i.e.:

model.gradient_checkpointing_enable()

ForwardContext will not be set correctly during the forward/backward passes, so any functionality that depends on ForwardContext does not work together with gradient checkpointing. This breaks some adapter methods (ReFT, prompt tuning, and prefix tuning currently do not work with gradient checkpointing) but not others (LoRA, bottleneck), and also affects composition blocks such as Fusion and Parallel.

E.g., training will throw this error:

AttributeError: 'NoneType' object has no attribute 'output_adapter_gating_scores'

Also see #677.

To reproduce, try training ReFT using the QLoRA Llama notebook with gradient checkpointing enabled.