
Is it possible to use this library with sentence transformers? #548

Closed
ydennisy opened this issue May 20, 2023 · 5 comments
Labels
question Further information is requested

Comments

@ydennisy

Hi!

Amazing library and thanks for your work :)

I would like to use adapters with Sentence Transformers https://www.sbert.net/

I am able to load some of the pre-trained sbert models and use adapters to fine-tune for tasks such as classification. However, my downstream task is actually the production of quality embeddings for queries and text, so I would like to use the multiple negatives ranking loss or triplet loss from sbert.
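For context, the loading step looks roughly like this; a minimal sketch assuming the adapter-transformers fork (the newer standalone `adapters` package initializes models differently, via `adapters.init(model)`), with a hypothetical adapter name:

```python
# Minimal sketch, assuming the adapter-transformers fork of HF Transformers.
from transformers import AutoAdapterModel, AutoTokenizer

checkpoint = "sentence-transformers/all-MiniLM-L6-v2"  # a BERT-family sbert model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoAdapterModel.from_pretrained(checkpoint)

# "embedding_task" is a hypothetical adapter name; only the adapter weights
# are trained, the pre-trained backbone stays frozen.
model.add_adapter("embedding_task")
model.train_adapter("embedding_task")
```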

Any help would be greatly appreciated!!!

@ydennisy added the question label on May 20, 2023
@leoplusx

I'll second this question.

@ydennisy You say you're looking to generate quality embeddings.

I'd imagine there are two fine-tuning tasks involved:

  1. Domain adaptation using unlabelled data
  2. Training on labelled data triplets (query, positive, negative)

Also, have you thought about fine-tuning the underlying model instead? That is, instead of taking a packaged sentence-transformer, take the underlying model (e.g. BERT), fine-tune it using MLM (masked language modelling), and only then use the sentence-transformers library to turn it into a sentence-transformer, which you can then train on labelled data.

Haven't done this yet myself.
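For reference, the last step of that pipeline would look roughly like this; a sketch using the documented sentence-transformers modules API, where the checkpoint path is hypothetical:

```python
from sentence_transformers import SentenceTransformer, models

# "./bert-domain-adapted" is a hypothetical path to a BERT checkpoint
# that was already fine-tuned with MLM on in-domain text.
word_embedding_model = models.Transformer("./bert-domain-adapted")
pooling = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling])

# The resulting model can then be trained on labelled triplets with
# sentence_transformers.losses.TripletLoss or MultipleNegativesRankingLoss.
```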

@ydennisy
Author

Hey @leobaumgardt

Yeah, those are the steps I would like to take; that is why adapters seemed like a good fit, as there are adapter tasks for each of these steps, which can then be stacked.

Hmm, no, I have not explored that option yet; it does sound promising. However, I would like to avoid porting models from HF to ST if at all possible. I find the ST library very simple to use, but it is not overly flexible.

Two ideas I am exploring:

Let me know if you have the same issues; I would be happy to collaborate :)

@lenglaender
Member

lenglaender commented Jun 5, 2023

Hey @ydennisy,
adapter-transformers does not support sentence-transformers. But since sentence-transformers also builds on HuggingFace transformer modules, it works if the models you use are supported by us (have a look at our model overview). For this reason, you can load some of the pre-trained sentence-transformers models and add adapters to them.
The loss functions you mentioned are not implemented in HuggingFace Transformers and are therefore not available in adapter-transformers. If you are using AdapterTrainer, you could, for example, create a CustomTrainer (via class CustomTrainer(AdapterTrainer): ...) and add your custom loss by overriding the compute_loss method (described for the HF Trainer class here: https://huggingface.co/docs/transformers/main_classes/trainer).
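A minimal sketch of that CustomTrainer idea, implementing an in-batch multiple negatives ranking loss. The input keys and pooling are assumptions about your data collator, not adapter-transformers API, and the model is assumed to return `last_hidden_state` (no prediction head active):

```python
import torch
import torch.nn.functional as F
from transformers import AdapterTrainer  # adapter-transformers fork

def mean_pool(hidden_states, attention_mask):
    # Average token embeddings over non-padding positions (standard sbert pooling).
    mask = attention_mask.unsqueeze(-1).float()
    return (hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

class MNRLTrainer(AdapterTrainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        # Assumes the data collator yields tokenized (query, positive) pairs
        # under these hypothetical keys.
        q_out = model(input_ids=inputs["query_input_ids"],
                      attention_mask=inputs["query_attention_mask"])
        p_out = model(input_ids=inputs["pos_input_ids"],
                      attention_mask=inputs["pos_attention_mask"])
        q_emb = mean_pool(q_out.last_hidden_state, inputs["query_attention_mask"])
        p_emb = mean_pool(p_out.last_hidden_state, inputs["pos_attention_mask"])
        # Multiple negatives ranking loss: every other positive in the batch
        # serves as an in-batch negative; 20.0 is the usual sbert scale factor.
        scores = F.cosine_similarity(q_emb.unsqueeze(1), p_emb.unsqueeze(0), dim=-1) * 20.0
        labels = torch.arange(scores.size(0), device=scores.device)
        loss = F.cross_entropy(scores, labels)
        return (loss, {"scores": scores}) if return_outputs else loss
```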

Also, we would be thankful if you could share your solution once it works, so that others facing the same problem can see how it was solved.

@adapter-hub-bert
Member

This issue has been automatically marked as stale because it has been without activity for 90 days. This issue will be closed in 14 days unless you comment or remove the stale label.

@adapter-hub-bert
Member

This issue was closed because it was stale for 14 days without any activity.

@adapter-hub-bert closed this as not planned on Sep 19, 2023
@calpt removed the Stale label on Nov 20, 2023