I'm trying to reproduce the T5.1.1 fine-tuning results on GLUE from this paper: https://arxiv.org/pdf/2110.08529.pdf. I read through the configuration and the Adafactor implementation in this repo to make sure I'm using consistent hyper-parameters while training with Hugging Face Transformers and PyTorch.
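For reference, here is roughly how I set up Adafactor on the PyTorch side. The hyper-parameter values are my reading of the T5 fine-tuning configs (constant learning rate, no relative-step scaling), so please treat them as assumptions rather than the exact recipe:

```python
# Sketch of my Adafactor setup in Hugging Face / PyTorch.
# Hyper-parameters below are my interpretation of the T5 fine-tuning configs.
from transformers import T5ForConditionalGeneration
from transformers.optimization import Adafactor

model = T5ForConditionalGeneration.from_pretrained("google/t5-v1_1-base")

optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,                 # constant learning rate (assumed from the T5 recipe)
    scale_parameter=False,   # disable relative-step scaling so lr stays fixed
    relative_step=False,
    warmup_init=False,
    clip_threshold=1.0,
    decay_rate=-0.8,
    weight_decay=0.0,
)
```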
However, I can't reproduce the reported performance. I preprocessed the GLUE data into inputs of the form `<task_name> sentence1: <sentence1> sentence2: <sentence2> <EOS>`, following the SeqIO GLUE preprocessor in the T5 repo.
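Concretely, my preprocessing looks roughly like the sketch below, shown here for MRPC. The field and label names are dataset-specific and the code is an illustration, not the actual SeqIO preprocessor:

```python
# Rough equivalent of my GLUE preprocessing, illustrated with MRPC.
# The tokenizer appends the EOS token, so it is not added here explicitly.
from datasets import load_dataset

def glue_to_text(example, task_name="mrpc"):
    inputs = (
        f"{task_name} sentence1: {example['sentence1']} "
        f"sentence2: {example['sentence2']}"
    )
    # Targets are the label strings, e.g. "not_equivalent" / "equivalent" for MRPC.
    label_names = ["not_equivalent", "equivalent"]
    return {"inputs": inputs, "targets": label_names[example["label"]]}

dataset = load_dataset("glue", "mrpc").map(glue_to_text)
```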
Another possible cause of the performance drop I can think of: when fine-tuning from a `.gin` config file, the optimizer state may be restored from the checkpoint before fine-tuning starts. If that is the case, it would be infeasible to reproduce with only the publicly available checkpoints. I tried running this repo and adding breakpoints to verify this, but couldn't figure it out. Could someone help me confirm whether the optimizer state is loaded when running the fine-tuning scripts from this repo?
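As a sanity check, something like the following could show whether a published checkpoint even ships Adafactor slot variables (the checkpoint path and the name patterns I grep for are placeholders/assumptions on my part):

```python
# Sketch: list variables in a public T5 checkpoint and look for optimizer slots.
# The checkpoint path is a placeholder; variable naming may differ.
import tensorflow as tf

ckpt_path = "/path/to/t5.1.1/base/model.ckpt-1000000"
reader = tf.train.load_checkpoint(ckpt_path)
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    if "adafactor" in name.lower() or "slot" in name.lower():
        print(name, shape)
```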