Any config for DeBERTa series as decoders for TSDAE? #31688
Comments
Hi @bobox2997, thanks for opening this issue! What I would suggest is adding checkpoints, configs, and possibly updated modeling files directly on the Hub, and supporting as much as we can there. It will be easier to integrate than adding it directly into transformers. Here is a tutorial if that sounds good to you!
I'm not sure I understood that correctly... What checkpoint should I add? In the TSDAE implementation, the decoder portion is tied to the encoder and is not used at inference; I just need an "is_decoder" argument (and the related config support, of course) in the DeBERTa config, as exists for BERT, RoBERTa, and similar models... I'm sorry if these are naive or dumb questions, I'm still learning. Thank you so much for your time!
@bobox2997 Ah, OK, I thought there were checkpoints available trained with this method. In terms of changes to the transformers library, we're very unlikely to accept changes to the architecture or configuration files to add new features like this, especially for older, popular models and anything which doesn't have official checkpoints available. The great thing about open source is that you are free to build upon and adapt the available code (license permitting) for your own projects. It should be possible to add this as a new architecture on the Hub, keeping compatibility with the transformers library and allowing you to use the same API. If you or anyone else in the community would like to implement this, feel free to share your project here!
Feature request
It seems there is no config for DeBERTa v1/v2/v3 as a decoder (while there are such configs for BERT, RoBERTa, and similar models). This is needed in order to perform TSDAE unsupervised fine-tuning.
(TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning)
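To illustrate the gap, here is a minimal sketch (using standard Hub checkpoint names): BERT's config exposes `is_decoder`, and the BERT modeling code honors it by enabling causal masking and (with `add_cross_attention`) cross-attention. `DebertaV2Config` stores the same flag, because it inherits from `PretrainedConfig`, but the DeBERTa modeling code never checks it.

```python
from transformers import BertConfig, DebertaV2Config

# BERT supports acting as a decoder: the BERT modeling code reads these
# flags and switches on causal masking / cross-attention accordingly.
bert_cfg = BertConfig(is_decoder=True, add_cross_attention=True)
print(bert_cfg.is_decoder)  # True

# DebertaV2Config accepts is_decoder via the PretrainedConfig base class,
# so the attribute is stored, but the DeBERTa modeling code ignores it:
# no causal masking or cross-attention path exists in the model.
deberta_cfg = DebertaV2Config(is_decoder=True)
print(deberta_cfg.is_decoder)  # True, but it has no effect on the model
```

This is why tying a DeBERTa decoder to the encoder (as TSDAE does) currently fails: the flag is accepted but not wired into the architecture.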
Motivation
Here is a reference to the related sentence-transformers issue: UKPLab/sentence-transformers#2771
TSDAE has been demonstrated to be a powerful unsupervised approach, and DeBERTa has proven to be a really strong base model for further fine-tuning (also, v2 has an xxlarge 1.5B version, and v3 demonstrated strong performance and efficiency with its ELECTRA-style pretraining).
For context, here is the TSDAE paper: https://arxiv.org/abs/2104.06979
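For reference, a typical TSDAE run with the sentence-transformers library looks like the sketch below (the checkpoint name and hyperparameters are illustrative; training requires downloading the model). Swapping `model_name` for a DeBERTa checkpoint is what fails today, since `DenoisingAutoEncoderLoss` needs to instantiate the same architecture as a decoder tied to the encoder.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import DenoisingAutoEncoderDataset

# Works with BERT/RoBERTa-style checkpoints; a DeBERTa checkpoint here
# fails because the architecture has no decoder (is_decoder) support.
model_name = "bert-base-uncased"
word_embedding_model = models.Transformer(model_name)
pooling = models.Pooling(word_embedding_model.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding_model, pooling])

# TSDAE needs only plain, unlabeled sentences; the dataset adds the noise.
train_sentences = ["Sentence one.", "Sentence two.", "Sentence three."]
train_dataset = DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)

# tie_encoder_decoder=True ties the decoder weights to the encoder, as in
# the TSDAE paper; this is the step that requires decoder-capable configs.
train_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path=model_name, tie_encoder_decoder=True
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    weight_decay=0,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
)
```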
Your contribution
I'm not sure if I can contribute to the repository...
Anyway, I can certainly open-source multiple domain-adapted models, including models in a size range (1.5B) where there are not many choices when working with encoder-only models.