- min/maxing chunk size
- larger chunks -> more context per chunk BUT longer LLM processing times & risk of including irrelevant info
- smaller chunks -> more precise matches & shorter processing times BUT less context per chunk
- high-level tasks like summarization require bigger chunks; low-level tasks like coding require smaller ones (sketch below)
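A minimal chunking sketch (plain Python, character-based with overlap; real splitters usually work on tokens or sentences, and the sizes here are illustrative):

```python
# Minimal sketch: fixed-size character chunking with overlap.
# chunk_size & overlap are the knobs to tune per task.
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = open("notes.txt").read()                     # hypothetical source document
summary_chunks = chunk_text(doc, chunk_size=2000)  # high-level task: bigger chunks
code_chunks = chunk_text(doc, chunk_size=300)      # low-level task: smaller chunks
```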
- pre-processing
- before data is fed into the database, it must be stripped of 'stop' words & special characters
- html tags, 'the', 'a', general rubbish
- improve quality of indexed data
- replace pronouns with names where possible (improves semantic search recall; see sketch below)
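A rough pre-processing sketch (the stop-word list is illustrative; pronoun replacement needs a coreference model and is out of scope here):

```python
import re

STOP_WORDS = {"the", "a", "an", "and", "of", "to"}  # illustrative; trim to taste

def preprocess(raw_html: str) -> str:
    text = re.sub(r"<[^>]+>", " ", raw_html)  # strip HTML tags
    text = re.sub(r"[^\w\s]", " ", text)      # strip special characters
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    return " ".join(words)

print(preprocess("<p>The cat sat on a mat!</p>"))  # -> "cat sat on mat"
```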
- metadata
- tagging each chunk (e.g. with timestamp & data source) allows retrieval-time optimizations later, like filtering by recency or source (sketch below)
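A sketch of attaching metadata at ingest time; `embed()` and `collection` are hypothetical stand-ins for your embedding model and vector DB handle:

```python
from datetime import datetime, timezone

# Attach metadata when the chunk goes in, so retrieval can filter on it later
# (e.g. "only chunks from source X after date Y").
record = {
    "text": "cat sat on mat",
    "embedding": embed("cat sat on mat"),  # embed() = whatever model you use (assumed)
    "metadata": {
        "source": "notes.txt",
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "topic": "pets",
    },
}
collection.add(record)  # collection = your vector DB handle (assumed)
```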
- buckets
- store different topics in different DBs, aka 'buckets', & route each query to the right one (sketch below)
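A routing sketch; the per-topic DB handles (`cooking_db` etc.) are hypothetical, and the naive keyword classifier would in practice be an LLM or trained classifier:

```python
# Route a query to a topic-specific DB ("bucket").
BUCKETS = {
    "cooking": cooking_db,  # hypothetical per-topic vector DB handles
    "coding": coding_db,
    "general": general_db,
}

def route(query: str):
    q = query.lower()
    if "recipe" in q:
        return BUCKETS["cooking"]
    if "python" in q or "bug" in q:
        return BUCKETS["coding"]
    return BUCKETS["general"]

query = "how do I fix this Python bug?"
results = route(query).search(query, k=5)  # .search() assumed on the DB handle
```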
- "Sentence-Window Retrival"
- Query Rewriting
- use an LLM to rephrase a user's layered, multi-part question into n standalone queries (sketch below)
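A sketch of the rewrite step; `llm()` is a hypothetical completion call, swap in whatever client you use:

```python
# Have the LLM split a layered question into standalone sub-queries.
REWRITE_PROMPT = """Split the following question into standalone search queries,
one per line:

{question}"""

def rewrite(question: str) -> list[str]:
    raw = llm(REWRITE_PROMPT.format(question=question))  # llm() is hypothetical
    return [line.strip() for line in raw.splitlines() if line.strip()]

queries = rewrite("Compare Redis and Postgres for caching, and which is cheaper?")
```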
- Multi-Query Retrieval
- run each rewritten query against the DB & take the de-duplicated union of the results (sketch below)
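A sketch of the union step; `search()` is a hypothetical vector search returning `(id, doc)` pairs:

```python
# Run every rewritten query against the DB and union the hits,
# de-duplicating by document id.
def multi_query_retrieve(queries: list[str], k: int = 5) -> list[str]:
    seen, docs = set(), []
    for q in queries:
        for doc_id, doc in search(q, k=k):  # search() is hypothetical
            if doc_id not in seen:
                seen.add(doc_id)
                docs.append(doc)
    return docs
```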
- Step-Back Prompting
- have the LLM first ask a more general "step-back" version of the question, retrieve on that, then answer the original (sketch below)
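A sketch of the flow; `llm()` and `search()` are the same hypothetical calls as in the sketches above:

```python
# Ask a broader "step-back" question first, retrieve on that,
# then answer the original question with the broader context.
STEP_BACK_PROMPT = "What is the more general question behind: {question}"

def step_back_answer(question: str) -> str:
    general_q = llm(STEP_BACK_PROMPT.format(question=question))
    context = search(general_q, k=5)  # hypothetical retrieval call
    return llm(f"Context: {context}\n\nQuestion: {question}")
```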
- Hybrid Search Exploration
- include alternate methods of searching
- keyword search, semantic search, vector search, etc.
- use a "sparse retriever" like BM25 or TF-IDF with a dense retriever (embedding)
- Re-Rank & Filter Documents before sending to LLM
- a high similarity score in the DB doesn't guarantee a good match for the query
- rerank with something like Cohere or a HuggingFace cross-encoder & filter out the documents you don't need (sketch below)
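A reranking sketch using a HuggingFace cross-encoder via sentence-transformers (Cohere's rerank API fills the same role); the model name and `keep` cutoff are illustrative:

```python
from sentence_transformers import CrossEncoder

# Cross-encoder scores each (query, doc) pair jointly: slower than a
# vector lookup but much better at judging actual relevance.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, docs: list[str], keep: int = 3) -> list[str]:
    scores = reranker.predict([(query, d) for d in docs])
    ranked = sorted(zip(scores, docs), reverse=True)
    return [d for _, d in ranked[:keep]]  # filter: drop the low scorers
```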
- Document Compressors
- compress retrieved documents down to just the query-relevant parts before sending them to the LLM (sketch below)
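A crude compressor sketch that keeps only sentences sharing keywords with the query; library compressors (e.g. LangChain's) typically use an LLM or embeddings for the same idea:

```python
# Keep only the sentences of a retrieved doc that overlap the query's words,
# shrinking what gets stuffed into the LLM's context window.
def compress(query: str, doc: str) -> str:
    q_words = set(query.lower().split())
    kept = [s for s in doc.split(".") if q_words & set(s.lower().split())]
    return ". ".join(s.strip() for s in kept)
```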