When I train AdapterFusion with the default configuration on a summarization task, the training loss suddenly increases after the first epoch and never converges.
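For context, here is a minimal sketch of the kind of AdapterFusion setup I mean, assuming the `adapters` library (formerly `adapter-transformers`); the base checkpoint and adapter names are placeholders, not my actual task adapters:

```python
from adapters import AutoAdapterModel
from adapters.composition import Fuse

# Hypothetical base model for a summarization task.
model = AutoAdapterModel.from_pretrained("facebook/bart-base")

# Add the adapters to be fused; "task_a" / "task_b" are placeholder names.
model.add_adapter("task_a")
model.add_adapter("task_b")

# Add a fusion layer over the adapters and activate it.
adapter_setup = Fuse("task_a", "task_b")
model.add_adapter_fusion(adapter_setup)
model.set_active_adapters(adapter_setup)

# Freeze everything except the fusion parameters (the default AdapterFusion recipe).
model.train_adapter_fusion(adapter_setup)
```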
Due to resource limitations, I could originally only train with a batch_size of 1. I then tried different hyperparameters and found that training behaves normally only when batch_size >= 16; I also tested batch_size = 8, and it is still not good enough. (I didn't actually increase the batch_size; I used gradient accumulation to reach the effective batch size, as sketched below.)
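A minimal sketch of that workaround, assuming the Hugging Face `TrainingArguments` and the `AdapterTrainer` from the `adapters` library (the output path, epoch count, and learning rate are assumptions, not the exact values I used):

```python
from adapters import AdapterTrainer
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./fusion_out",        # hypothetical output path
    per_device_train_batch_size=1,    # limited by GPU memory
    gradient_accumulation_steps=16,   # effective batch size = 1 * 16 = 16
    learning_rate=5e-5,               # assumed value for illustration
    num_train_epochs=10,              # assumed value for illustration
)

trainer = AdapterTrainer(
    model=model,                      # the fusion model from the sketch above
    args=training_args,
    train_dataset=train_dataset,      # assumed to be prepared elsewhere
)
trainer.train()
```

With gradient accumulation, the optimizer step is taken only after summing gradients over 16 micro-batches, so the update statistics match a batch_size of 16 even though each forward pass still processes a single example.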
I'm just recording the details here for anyone who runs into the same confusion when training AdapterFusion!