Embedding Adapter 💬 📐

Finetune embedding models in just 4 lines of code.

Quick Start ⚡

Installation

pip install embedding_adapter

Usage

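from embedding_adapter import EmbeddingAdapter

# query_embeddings, document_embeddings, and labels are training data you supply
adapter = EmbeddingAdapter()
adapter.fit(query_embeddings, document_embeddings, labels)
# apply the trained adapter to new embeddings
adapter.transform(new_embeddings)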

Once you've trained the adapter, you can use it to patch your pre-trained embedding model.

patch = adapter.patch()
adapted_embeddings = patch(original_embedding_fn("SAMPLE_TEXT"))
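
If you call the patched model in many places, it may be convenient to wrap the two steps into a single function. The sketch below is only an illustration: `make_adapted_embedding_fn`, `toy_embedding_fn`, and `toy_patch` are hypothetical names that are not part of this library, and the toy functions are stand-ins so the example runs without an embedding model. It assumes, as the snippet above suggests, that the patch accepts whatever the embedding function returns.

```python
from typing import Callable, List

Vector = List[float]


def make_adapted_embedding_fn(
    embedding_fn: Callable[[str], Vector],
    patch: Callable[[Vector], Vector],
) -> Callable[[str], Vector]:
    """Return a function that embeds text and then applies the patch."""

    def adapted(text: str) -> Vector:
        return patch(embedding_fn(text))

    return adapted


# Hypothetical stand-ins so the sketch runs without the library or a model.
def toy_embedding_fn(text: str) -> Vector:
    # Not a real embedding: just a 2-D vector derived from the text length.
    return [float(len(text)), 1.0]


def toy_patch(vector: Vector) -> Vector:
    # Not a trained adapter: simply doubles every component.
    return [2.0 * x for x in vector]


embed = make_adapted_embedding_fn(toy_embedding_fn, toy_patch)
```

In real use you would pass your own `original_embedding_fn` and the result of `adapter.patch()` instead of the toy functions.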

Use Cases/Why do I need to tune my embeddings ❓

Embeddings are predominantly utilized for Retrieval Augmented Generation (RAG) or semantic search applications. However, their effectiveness can significantly vary depending on the context. This is where the need for tuning comes into play.

Consider training an adapter for your pre-trained embedding model, such as OpenAI's text-embedding-3-small or the open-source gte-large. This customization enables your model to interpret tokens accurately within the specific context of your application. For example, the word "Pandas" 🐼 could refer to the animal or the widely used Python library for data manipulation. Without tuning, your model may not distinguish between these vastly different contexts.

Moreover, tuning your embeddings is crucial if you aim to use a smaller model, for example because of hardware constraints such as the absence of GPUs for inference. In such cases, an adapter can enhance retrieval performance, keeping inference efficient without compromising on accuracy.
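
To make the "Pandas" example concrete, here is a toy sketch of how a transformation applied to a query embedding can change which document ranks first. It is not this library's implementation: the 2-D vectors are made up, the matrix `toy_adapter` is hand-picked rather than learned, and transforming only the query is a choice made for this illustration. A trained adapter would learn its transformation from your labels.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def apply_linear(matrix: List[Vector], vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rank(query: Vector, documents: Dict[str, Vector]) -> List[str]:
    """Return document names sorted from most to least similar to the query."""
    return sorted(documents, key=lambda name: cosine(query, documents[name]), reverse=True)


# Made-up 2-D embeddings; real embeddings have hundreds of dimensions.
documents = {
    "panda (animal)": [1.0, 0.2],
    "pandas (library)": [0.8, 0.6],
}
raw_query = [1.0, 0.3]  # an ambiguous "Pandas" query

# Hand-picked matrix that stretches the second dimension (assumption, not learned).
toy_adapter = [
    [1.0, 0.0],
    [0.0, 3.0],
]
adapted_query = apply_linear(toy_adapter, raw_query)

ranking_before = rank(raw_query, documents)
ranking_after = rank(adapted_query, documents)
```

With these numbers, the raw query is closest to the animal document, while the adapted query is closest to the library document.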

Synthetic Label Generation 🧪

No user feedback to use as labels? 🤔 Create synthetic labels with the LabelGenerator utility.

from embedding_adapter.utils import LabelGenerator
generator = LabelGenerator()
generator.run()

Note: This requires an OpenAI API key saved as an OPENAI_API_KEY env var.
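
Since the generator depends on that environment variable, it can help to fail early with a clear message when the key is missing. The helper below is a hypothetical sketch using only the standard library; `require_openai_api_key` is not part of this package, and how the library itself reads the key is not shown here.

```python
import os
from typing import Mapping


def require_openai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OPENAI_API_KEY value, or raise if it is missing or blank."""
    key = env.get("OPENAI_API_KEY", "")
    if not key.strip():
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before generating labels."
        )
    return key
```

You could call `require_openai_api_key()` once before `generator.run()`.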

License 📄

This project is licensed under the MIT License.
