Semantic search seeks to improve search accuracy by understanding the content of the search query. In contrast to traditional search engines which only find documents based on lexical matches, semantic search can also find synonyms.
-
Why semantic search in LLaMa Index?
- LLM cannot support large tokens, so we need to use semantic search to find similar documents or paragraphs. This strategy can help us reduce the number of tokens to a reasonable size.
- Because it is a very powerful tool to find similar documents. It is based on the semantic similarity of the documents, not on the lexical similarity. This means that it can find documents that are not lexically similar but are semantically similar. For example, if you search for "car", it will find documents that contain "automobile" or "vehicle". This is very useful for finding documents that are not lexically similar but are semantically similar.
-
Why use HuggingFaceEmbeddings? The handy model overview table in the documentation indicates that the
sentence-transformers/all-mpnet-base-v2
checkpoint has the best performance for semantic search, so we’ll use that for our application. -
What's the reranking and why we need it?
More details on: