Welcome to Verba: The Golden RAGtriever, an open-source initiative designed to offer a streamlined, user-friendly interface for Retrieval-Augmented Generation (RAG) applications. In just a few easy steps, you can dive into your data and start interacting with it!
pip install goldenverba
Verba is more than just a tool—it's a personal assistant for querying and interacting with your data. Have questions about your documents? Need to cross-reference multiple data points? Want to gain insights from your existing knowledge base? Verba makes it all possible through the power of Weaviate and Large Language Models (LLMs)!
Built on top of Weaviate's state-of-the-art Generative Search technology, Verba fetches relevant context from your documents to answer queries. It leverages the computational strength of LLMs to offer comprehensive, contextually relevant answers. All of this is conveniently accessible through Verba's intuitive user interface.
Verba offers seamless data import functionality, supporting file types including .txt, .md, and .mdx. Before feeding your data into Weaviate, Verba handles chunking and vectorization to optimize it for search and retrieval.
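To make the chunking step more concrete, here is a minimal sketch of word-based chunking with overlap. It is an illustration only and not Verba's actual implementation: the function name `chunk_text` and the `chunk_size` and `overlap` parameters are assumptions made for this example.

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that context spanning a
    chunk boundary is not lost. Hypothetical helper, not Verba's own code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each resulting chunk would then be vectorized and stored in Weaviate. The overlap is a common design choice that trades some extra storage for better recall near chunk boundaries.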
🔧 Work in Progress: We are actively developing a data cleaning pipeline for custom datasets. Until then, please ensure your data is clean and well-structured before importing it into Weaviate.
Harness the power of Weaviate's generate module and hybrid search features when using Verba. These advanced search techniques sift through your documents to identify contextually relevant fragments, which are then used by Large Language Models to formulate comprehensive answers to your queries.
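As a rough illustration of what hybrid search does conceptually, the sketch below blends keyword scores and vector scores with a weighting factor `alpha` (1.0 means pure vector search, 0.0 means pure keyword search). This is a simplified stand-in written for this README, not Weaviate's implementation: the function names and the min-max normalization are assumptions.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max normalize scores into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every result as a top match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    keyword_scores: dict[str, float],
    vector_scores: dict[str, float],
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """Fuse two score sets; hypothetical helper for illustration only."""
    kw = normalize(keyword_scores)
    vec = normalize(vector_scores)
    fused = {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

The top-ranked fragments from a ranking like this are what would be handed to the LLM as context for answer generation.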
Verba utilizes Weaviate's Semantic Cache to embed both the generated results and queries, making future searches more efficient. When you ask a question, Verba will first check the Semantic Cache to see if a semantically identical query has already been processed, so that the cached result can be reused.
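The idea behind a semantic cache can be sketched in a few lines. The example below is a simplified in-memory illustration, assuming query embeddings are already available as lists of floats. The `SemanticCache` class, its `threshold` parameter, and the cosine-similarity check are assumptions for this sketch and do not describe how Verba or Weaviate implement it.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by query embeddings."""

    def __init__(self, threshold: float = 0.95) -> None:
        self.threshold = threshold
        self._entries: list[tuple[list[float], str]] = []

    def store(self, query_vector: list[float], answer: str) -> None:
        self._entries.append((list(query_vector), answer))

    def lookup(self, query_vector: list[float]) -> Optional[str]:
        """Return the best cached answer at or above the threshold, else None."""
        best_answer, best_score = None, self.threshold
        for vector, answer in self._entries:
            score = cosine_similarity(query_vector, vector)
            if score >= best_score:
                best_answer, best_score = answer, score
        return best_answer
```

On a cache hit the stored answer is returned without calling the LLM again. On a miss the query goes through the normal retrieval and generation path, and the new result can be stored for next time.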
This section outlines various methods to set up and deploy Verba, so you can choose the one that fits you best:
- Deploy with pip
- Build from Source
- Use Docker for Deployment
Prerequisites: If you're not using Docker, ensure that you have Python >=3.9.0 installed on your system.
🔑 API Key Requirement: Regardless of the deployment method, you'll need an OpenAI API key to enable data ingestion and querying features. You can specify this by either creating a .env file when cloning the project, or by storing the API key in your system environment variables.
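The two ways of supplying the key can be sketched as follows: look in the system environment first, then fall back to a .env file. This is a minimal standard-library sketch, not Verba's own loading code. The helper names are hypothetical, and the variable name `OPENAI_API_KEY` used in the usage comment is an assumption, since this README does not state the exact name Verba expects.

```python
import os
from pathlib import Path
from typing import Optional


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_setting(name: str, env_file: str = ".env") -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    from_env = os.environ.get(name)
    if from_env:
        return from_env
    path = Path(env_file)
    if path.is_file():
        return parse_env(path.read_text(encoding="utf-8")).get(name)
    return None


# Usage (the variable name is an assumption, see above):
# api_key = resolve_setting("OPENAI_API_KEY")
```

In practice a library such as python-dotenv handles quoting and escaping more thoroughly; the hand-rolled parser here only keeps the example self-contained.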
- Initialize a new Python Environment
python3 -m virtualenv venv
- Install Verba
pip install goldenverba
- Launch Verba
verba start
- Clone the Verba repo
git clone https://github.com/weaviate/Verba.git
- Initialize a new Python Environment
python3 -m virtualenv venv
- Install Verba
pip install -e .
- Launch Verba
verba start
If you're unfamiliar with Docker, you can learn more about it here.
- Clone the Verba repo
git clone https://github.com/weaviate/Verba.git
- Deploy using Docker
docker-compose up
Verba provides flexibility in connecting to Weaviate instances based on your needs. By default, Verba opts for Weaviate Embedded if it doesn't detect the VERBA_URL and VERBA_API_KEY environment variables. This local deployment is the most straightforward way to launch your Weaviate database for prototyping and testing.
However, you have other compelling options to consider:
🌩️ Weaviate Cloud Service (WCS)
If you prefer a cloud-based solution, Weaviate Cloud Service (WCS) offers a scalable, managed environment. Learn how to set up a cloud cluster by following the Weaviate Cluster Setup Guide.
🐳 Docker Deployment
Another robust local alternative is deploying Weaviate using Docker. For more details, consult the Weaviate Docker Guide.
🌿 Environment Variable Configuration
If you connect to a WCS cluster or a Docker-hosted Weaviate instance instead of Weaviate Embedded, you'll need to specify the following environment variables. These can either be added to a .env file in your project directory or set as global environment variables on your system:
VERBA_URL=http://your-weaviate-server:8080
VERBA_API_KEY=your-weaviate-database-key
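The fallback behaviour described above can be pictured with a small sketch. It is an illustration only and not Verba's actual startup code: `select_weaviate_target` is a hypothetical helper, and treating a missing value for either variable as "use Weaviate Embedded" is an assumption about how the two variables are checked.

```python
import os
from typing import Mapping, Optional


def select_weaviate_target(env: Optional[Mapping[str, str]] = None) -> dict:
    """Decide between a remote Weaviate instance and Weaviate Embedded.

    Hypothetical helper: falls back to embedded mode unless both
    VERBA_URL and VERBA_API_KEY are set (an assumption for this sketch).
    """
    env = os.environ if env is None else env
    url = env.get("VERBA_URL")
    api_key = env.get("VERBA_API_KEY")
    if url and api_key:
        return {"mode": "remote", "url": url, "api_key": api_key}
    return {"mode": "embedded"}


if __name__ == "__main__":
    # Prints the mode that would be chosen for the current environment.
    print(select_weaviate_target()["mode"])
```

Passing the environment in as a mapping keeps the decision easy to test without touching real environment variables.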
Verba offers straightforward commands to import your data for further interaction. Before you proceed, please be aware that importing data will incur costs based on your configured OpenAI access key.
Important Notes: Supported file types are currently limited to .txt, .md, and .mdx. Additional formats are in development. Basic CRUD operations and UI interactions are also in the pipeline.
verba start --model "gpt-3.5-turbo" # Initiates Verba application
verba import --path "Path to your dir or file" --model "gpt-3.5-turbo" --clear True # Imports data into Verba
verba clear # Deletes all data within Verba
verba clear_cache # Removes cached data in Verba
If you've cloned the repository, you can get a quick start with sample datasets in the ./data directory. Use verba import --path ./data to import these samples. You can also populate Verba with predefined suggestions using a JSON list via verba import --path suggestions.json. An example is provided in the ./data/minecraft directory.
Verba exclusively utilizes OpenAI models. Be advised that the usage costs for these models will be billed to the API access key you provide. Primarily, costs are incurred during data embedding and answer generation processes. The default vectorization engine for this project is Ada v2.
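To get a feel for embedding costs before importing a large dataset, a back-of-the-envelope estimate can help. The sketch below is an approximation only: it assumes roughly four characters per token, which is a rough heuristic for English text, and it takes the price as a parameter because this README does not state one. Check your provider's current pricing for real numbers. Both function names are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count; the characters-per-token ratio is a heuristic."""
    return math.ceil(len(text) / chars_per_token)


def estimate_embedding_cost(
    texts: list[str],
    price_per_1k_tokens: float,
    chars_per_token: float = 4.0,
) -> float:
    """Estimate the cost of embedding `texts` at a caller-supplied price."""
    total_tokens = sum(estimate_tokens(t, chars_per_token) for t in texts)
    return total_tokens / 1000 * price_per_1k_tokens
```

For accurate counts, a real tokenizer should replace the character heuristic. Answer generation adds further cost on top of embedding and is not covered by this estimate.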
Verba is built on three primary components:
- Weaviate Database: You have the option to host on Weaviate Cloud Service (WCS) or run it locally.
- FastAPI Endpoint: Acts as the communication bridge between the Large Language Model provider and the Weaviate database.
- React Frontend (Static served through FastAPI): Offers an interactive UI to display and interact with your data.
Development Note: If you're planning to modify the frontend, ensure you have Node.js version >=18.16.0 installed. For more details on setting up the frontend, check out the Frontend README.
Your contributions are always welcome! Feel free to contribute ideas and feedback, or to open issues and bug reports if you find any problems! Visit our Weaviate Community Forum if you need any help!