How to pass instructions for Instruction-based embedding models #293
-
Hi there :) I am looking into deploying several embedding models using Infinity on Runpod Serverless (https://github.com/runpod-workers/worker-infinity-embedding?tab=readme-ov-file). However, many of the models I am testing are instruction-based, so I need to pass an instruction along with any text to embed. What is the correct way of doing this with the Infinity Runpod workers? For instance, InstructOR expects (instruction, text) tuples, while Salesforce's Mistral model expects the instruction to be part of the text itself (à la "Instruction: ... Text: ...").
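To make the difference concrete, here is a rough sketch of the two input conventions as I understand them (the instruction strings and model handles below are only illustrative):

```python
# INSTRUCTOR-style models take (instruction, text) pairs, e.g. as a list of
# [instruction, sentence] lists passed to the model's encode() call:
pairs = [
    ["Represent the document for retrieval:", "Infinity serves embedding models."],
]
# embeddings = instructor_model.encode(pairs)  # hypothetical model handle

# SFR-Embedding-Mistral-style models expect the instruction inlined in the text:
texts = [
    "Instruct: Given a query, retrieve relevant passages\nQuery: how to deploy infinity?",
]
# embeddings = client.embed(texts)  # hypothetical client call
```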
Replies: 2 comments
-
It's just a prompt template. Unless it is upstreamed in sentence-transformers and properly specified in config.json, I'd prefer not to accommodate prompt templates in infinity. Instructor expects a special tuple, but ultimately also formats it using a similar function. The Instructor models are quite outdated, and Mistral is too large; I would recommend using BERT/DeBERTa/... models!

For SFR-Embedding-Mistral (https://huggingface.co/Salesforce/SFR-Embedding-Mistral) the template is the following:

```python
def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery: {query}'
```

Please manage this piece of code client side.
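For example, a minimal client-side sketch, assuming your deployment exposes Infinity's OpenAI-compatible /embeddings route; the endpoint URL, task description, and model id are placeholders for your own setup:

```python
import requests

def get_detailed_instruct(task_description: str, query: str) -> str:
    return f"Instruct: {task_description}\nQuery: {query}"

# Placeholder task description and queries.
task = "Given a web search query, retrieve relevant passages that answer the query"
queries = ["how do I pass instructions to an embedding model?"]

payload = {
    "model": "Salesforce/SFR-Embedding-Mistral",                 # placeholder model id
    "input": [get_detailed_instruct(task, q) for q in queries],  # template applied client side
}

# Placeholder endpoint; substitute your own deployment's URL.
resp = requests.post("https://<your-endpoint>/embeddings", json=payload, timeout=60)
resp.raise_for_status()
embeddings = [item["embedding"] for item in resp.json()["data"]]
```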
-
Thanks, agreed that it might be best to handle this on the client side.