Updated on 14th August 2024
📝Article • Demo & Dataset on: 🤗Hugging Face
Large Language Models (LLMs) demonstrate impressive capabilities but sometimes generate plausible yet incorrect responses when they lack the relevant information, a phenomenon known as "hallucination": the model confidently provides answers that sound accurate but may be wrong or based on outdated knowledge.
The Retrieval-Augmented Generation (RAG) framework addresses this problem by integrating an information retrieval system into the LLM pipeline. Instead of relying solely on pre-trained knowledge, the model dynamically fetches information from external knowledge sources when generating responses. This retrieval step helps ensure that the information provided by the LLM is not only contextually relevant but also accurate and up-to-date.
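The RAG flow described above can be sketched in a few lines: retrieve the documents most relevant to the question, then ground the answer in them. The keyword-overlap retriever and the prompt-building `generate` step below are hypothetical stand-ins for a real vector search and an actual LLM call.

```python
# Minimal sketch of a RAG pipeline. Both functions are toy stand-ins:
# a real system would use embedding-based vector search for retrieval
# and an LLM API call for generation.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy retriever)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: build the grounded prompt that a real
    system would send to the model along with the user's question."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

docs = [
    "The Roman Empire was founded in 27 BC by Augustus.",
    "Photosynthesis converts light energy into chemical energy.",
    "Rome's empire reached its greatest extent under Trajan.",
]
context = retrieve("When was the Roman Empire founded?", docs)
prompt = generate("When was the Roman Empire founded?", context)
```

Because the retrieved context is injected into the prompt, the model's answer is anchored to the external documents rather than to whatever it memorized during training.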
This repository provides a collection of Jupyter notebooks that demonstrate how to build and experiment with RAG using different frameworks and tools.
Tool | LLMs | Description | Notebooks |
---|---|---|---|
Weaviate & LangChain | OpenAI | Build a question-answering system about the Roman Empire using Weaviate, LangChain, and OpenAI. | |
LangChain & LlamaIndex | OpenAI | Build basic and advanced document RAG workflows using LangChain, LlamaIndex, and OpenAI. | |
LangChain | Mixtral | Build a chatbot that retrieves a summary relevant to the question from the vector database and generates the answer. | |
LangChain | Llama-2 | Build a machine learning expert chatbot (using a Q&A dataset) that answers only machine learning questions, without hallucinating. | |
LangChain is a framework for building applications with LLMs. It provides abstractions and utilities for creating robust AI applications, such as chatbots, question-answering systems, and knowledge bases, and offers customization options for tailoring the retrieval procedure to specific requirements. For example, its multi-query approach generates several parallel queries covering different aspects of the original question and retrieves relevant documents for each from a vector store.
LlamaIndex is a framework for building LLM applications. It provides tools for ingesting, managing, and querying data, allowing you to create "chat with your data" experiences. LlamaIndex integrates with vector databases like Weaviate to enable retrieval-augmented generation (RAG) systems, in which the LLM is paired with an external store to access specific facts and contextually relevant information.
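The ingestion step that frameworks like LlamaIndex automate can be illustrated directly: a document is split into overlapping chunks so each piece can be embedded and indexed separately. The chunk size and overlap below are illustrative values, not defaults from any library.

```python
# Sketch of document chunking for ingestion. The overlap between consecutive
# chunks keeps sentences that straddle a boundary intact in at least one chunk.

def chunk_text(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of `chunk_size` words, each sharing `overlap`
    words with the previous chunk."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]

doc = ("Retrieval-augmented generation combines a retriever with a language "
       "model so answers can cite fresh, external knowledge instead of "
       "relying only on what the model memorized during training.")
chunks = chunk_text(doc)
```

Each chunk would then be embedded and stored in a vector database, becoming an independently retrievable unit at query time.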
Weaviate is a vector database that allows you to store and query data using semantic search. It provides a scalable and efficient way to manage large amounts of unstructured data, such as text, images, and audio. Weaviate uses machine learning models to encode data into high-dimensional vectors, enabling fast and accurate retrieval of relevant information based on semantic similarity.
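The core operation a vector database performs can be shown in miniature: documents and queries are embedded as vectors, and results are ranked by cosine similarity. The 3-dimensional "embeddings" below are toy values; real systems use vectors with hundreds of dimensions produced by a learned model.

```python
import math

# Sketch of semantic search by cosine similarity, the ranking principle
# behind vector databases like Weaviate. All vectors here are made up.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy index mapping document labels to pretend embedding vectors.
index = {
    "roman history": [0.9, 0.1, 0.0],
    "machine learning": [0.1, 0.9, 0.2],
    "cooking recipes": [0.0, 0.2, 0.9],
}

query_vector = [0.8, 0.2, 0.1]  # pretend embedding of "ancient Rome"
best = max(index, key=lambda doc: cosine_similarity(query_vector, index[doc]))
```

Because similarity is computed between embeddings rather than raw strings, a query can match a document that shares no keywords with it, which is what makes the retrieval "semantic."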