Add mT5 support #568
From what I can read, there is no structural difference between T5 and mT5.
Would the implementation then be a copy of the T5* mixins, or aliases to them?
Yes, mT5 can basically reuse the mixins of T5 and copy its model integration. With the switch to the new codebase (see #584), we're open to new model integrations again (see the updated guide). mT5 would definitely be a great addition!
Hello! I am very interested in this. Is someone actively working on it? If not, I would like to help get the implementation started.

Edit: I just followed the updated guide and did a very quick port. I followed the approach that the mBART implementation took (they reused the BART mixins; I reused the T5 mixins), so the changes were minimal. I will submit a pull request. I hope it works.
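For illustration, a minimal sketch of what such a port could look like, mirroring how mBART aliases the BART mixins. The module path and mixin class names below are assumptions based on the T5 integration layout in the new codebase, not the actual PR:

```python
# Hypothetical mT5 mixin module, analogous to the mBART-reuses-BART approach.
# The import path and class names are assumptions based on the T5 integration.
from adapters.models.t5.mixin_t5 import (
    T5BlockAdaptersMixin,
    T5ModelAdaptersMixin,
)

# mT5 shares T5's architecture, so the adapter mixins can simply be aliased
# rather than reimplemented:
MT5BlockAdaptersMixin = T5BlockAdaptersMixin
MT5ModelAdaptersMixin = T5ModelAdaptersMixin
```

Since mT5 differs from T5 only in pretraining data and vocabulary, aliasing keeps the two integrations in lockstep: any fix to the T5 mixins automatically applies to mT5.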
🌟 New adapter setup
Model description
mT5-based models are the multilingual equivalent of T5. They are therefore extremely useful for implementing few-shot PEFT state-of-the-art methods (as of summer 2023), such as LoRA or (IA)³, in multilingual contexts.
https://huggingface.co/docs/transformers/model_doc/mt5
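To make the use case concrete, here is a sketch of how a LoRA adapter could be attached to an mT5 checkpoint once the integration lands, using the new adapters API. The checkpoint name, adapter name, and config values are illustrative examples, not prescribed choices:

```python
from transformers import AutoModelForSeq2SeqLM
import adapters
from adapters import LoRAConfig

# Load a public mT5 checkpoint (example model; any mT5 size should work).
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

# Attach adapter support to the plain transformers model.
adapters.init(model)

# Add and activate a LoRA adapter; the rank and scaling values are illustrative.
model.add_adapter("mt5_lora", config=LoRAConfig(r=8, alpha=16))
model.train_adapter("mt5_lora")  # freezes base weights, trains only the adapter
```

The same pattern would apply to (IA)³ by swapping in the corresponding adapter config.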
Open source status
Note: I am willing to help with this.