Connect FAISS with SQLite: keep vectors in FAISS and their data in SQLite. Support for full-text search (FTS), and maybe alternatives to FAISS, is planned.

Why use this? For RAG, libraries like LangChain are overcomplicated: going through the docs and changing things is hard, and on top of that they change frequently. I prefer to have much more control over my pipeline, and if you feel the same, this code might be a good starting point.
- Creating the Index: Wrap a flat index (`IndexFlatL2`) in `IndexIDMap2` so that vectors can be added and managed under your own IDs.
- Serializing the Index: Serialize the FAISS index to save it locally.
- Deserializing the Index: Load the FAISS index from the serialized file when needed.
- Creating the Database: Initialize a SQLite database with a table to store metadata.
- Inserting Metadata: Add records with IDs that correspond to FAISS vector IDs.
- Querying Metadata: Retrieve metadata based on vector IDs obtained from FAISS searches.
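
The SQLite steps above can be sketched with the standard-library `sqlite3` module alone. This is a minimal sketch, not the repo's actual code: the table name `documents`, its columns, and the helper names `create_db`, `insert_metadata`, and `fetch_metadata` are all hypothetical, chosen only for illustration.

```python
import sqlite3


def create_db(path=":memory:"):
    """Open (or create) the database and make sure the metadata table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id   INTEGER PRIMARY KEY,  -- same integer used as the FAISS vector ID
            text TEXT NOT NULL
        )
        """
    )
    conn.commit()
    return conn


def insert_metadata(conn, rows):
    """Insert (id, text) pairs; each id must match the ID given to FAISS."""
    conn.executemany("INSERT INTO documents (id, text) VALUES (?, ?)", rows)
    conn.commit()


def fetch_metadata(conn, ids):
    """Return (id, text) rows for the given IDs, in the same order as `ids`."""
    ids = [int(i) for i in ids]
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    found = dict(
        conn.execute(
            f"SELECT id, text FROM documents WHERE id IN ({placeholders})", ids
        )
    )
    # Re-order to match the input and skip IDs that have no row.
    return [(i, found[i]) for i in ids if i in found]


conn = create_db()  # in-memory database, assumed here for the demo only
insert_metadata(conn, [(10, "first chunk"), (20, "second chunk"), (30, "third chunk")])
rows = fetch_metadata(conn, [30, 10, 99])
print(rows)
```

Preserving the input order in `fetch_metadata` matters because FAISS returns IDs ranked by distance, while a plain `WHERE id IN (...)` query gives no ordering guarantee.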
Perform Search:
- Query FAISS to find nearest neighbors for a given vector.
- Use the resulting IDs to query SQLite and retrieve metadata.
To install FAISS, follow the instructions in its GitHub repo. SQLite needs no separate install: the `sqlite3` module is part of Python's standard library.
If you have an interesting project, you can connect with me on LinkedIn: https://www.linkedin.com/in/mayankladdha31/

Please star this repo if you found it useful.