VECT.R8 (Vector Embeddings Creation, Transformation & Retrieval) 🚀

A Web UI where you can upload CSV/JSON files, create vector embeddings, and query them. Soon, you'll be able to convert unstructured data to JSON/CSV using an integrated LLM.

This project is under heavy, active development and may be unstable. The Embeddings and Query pages are a work in progress. ⚠️


Prerequisites

| Requirement | Description |
| --- | --- |
| Python 3.7+ | Python 3.7 or higher is required for the modern libraries and syntax the application uses. |
| Flask | Web framework that runs the backend server. |
| Flask-CORS | Enables CORS for frontend-backend communication. |
| transformers | Creates vector embeddings with pre-trained models. |
| torch | Essential for running embedding models. Used by transformers. |
| numpy | Handles arrays and mathematical operations. Used throughout the application. |
| pandas | Processes CSV and JSON files. Used throughout the application. |
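
As a quick sanity check, the requirements above can be verified from Python before starting the server. This is a minimal sketch and not part of the repository: the helper names `python_ok` and `missing_modules` are hypothetical, and the module list uses the usual import names for these packages (Flask-CORS is imported as `flask_cors`).

```python
import importlib.util
import sys

# Import names for the packages listed above (assumed; Flask-CORS -> flask_cors).
REQUIRED_MODULES = ["flask", "flask_cors", "transformers", "torch", "numpy", "pandas"]


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Missing modules:", missing_modules() or "none")
```

Any module reported as missing should be covered by `pip install -r requirements.txt` in the next section.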

Installation

| Step | Instructions |
| --- | --- |
| Clone the repository | `git clone https://github.com/itsPreto/VECTR8.git`, then `cd VECTR8` |
| Install the required packages | `pip install -r requirements.txt` |

Running the Application

| Step | Instructions |
| --- | --- |
| Start the Flask server | `python3 rag.py` |
| React frontend launches automatically | `rag.py` launches the React frontend in a separate subprocess. |
| Open your web browser | Navigate to http://127.0.0.1:4000 |

Uploading Files 📂

| Step | Instructions |
| --- | --- |
| Drag and Drop a File | Drag and drop a CSV or JSON file into the upload area, or click to select a file from your computer. |
| View Uploaded File Information | Once the file is uploaded, its information (such as name and size) is displayed. |

Command Line

To upload a file using curl:

curl -X POST -F 'file=@/path/to/your/file.csv' http://127.0.0.1:4000/upload_file
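
The same upload can be sketched in Python with only the standard library. This is an illustration under assumptions, not code from the repository: `build_upload_request` and `upload_file` are hypothetical helper names, the form field name `file` is taken from the curl command above, and the response is returned as raw text because its format is not documented here.

```python
import os
import urllib.request
import uuid

UPLOAD_URL = "http://127.0.0.1:4000/upload_file"


def build_upload_request(filename, content, url=UPLOAD_URL):
    """Build a multipart/form-data POST carrying one file in the 'file' field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def upload_file(path, url=UPLOAD_URL):
    """Read the file at `path` and send it to the running server."""
    with open(path, "rb") as handle:
        request = build_upload_request(os.path.basename(path), handle.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `upload_file("/path/to/your/file.csv")` should behave like the curl call.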

Previewing Data 🧐

| Step | Instructions |
| --- | --- |
| Select Embedding Keys | After uploading a file, select the keys (columns) you want to include in the embeddings. |
| Preview Document | View a preview of the document created from the selected keys. |
| Preview Embeddings | View the generated embeddings and token count for the selected document. |
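
To make "the document created from the selected keys" more concrete, here is one plausible way a record could be flattened into text before embedding. This is only a sketch: the `key: value` format, the `build_document` helper, and the sample record are all made up for illustration, and the format `rag.py` actually uses may differ.

```python
def build_document(record, selected_keys):
    """Join the selected fields of one record into a single text document."""
    parts = []
    for key in selected_keys:
        # Skip keys that are absent or empty in this record.
        if key in record and record[key] is not None:
            parts.append(f"{key}: {record[key]}")
    return "\n".join(parts)


# Hypothetical record, as it might come from one CSV row or JSON object.
record = {"title": "Dune", "author": "Frank Herbert", "year": 1965}
print(build_document(record, ["title", "author"]))
```

Only the selected keys contribute to the text, so unselected columns (here `year`) have no influence on the resulting embedding.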

Command Line

To preview a file's keys using curl:

curl -X POST -H "Content-Type: application/json" -d '{"file_path":"uploads/your-file.csv"}' http://127.0.0.1:4000/preview_file

To preview a document's embeddings using curl:

curl -X POST -H "Content-Type: application/json" -d '{"file_path":"uploads/your-file.csv", "selected_keys":["key1", "key2"]}' http://127.0.0.1:4000/preview_document
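
The two preview calls above translate to Python roughly as follows. This is a minimal standard-library sketch, not code from the repository: `build_json_post` and `send` are hypothetical helpers, the payloads are copied from the curl commands, and the response is returned as raw text because its structure is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:4000"


def build_json_post(endpoint, payload, base_url=BASE_URL):
    """Build a POST request with a JSON body, mirroring the curl calls above."""
    return urllib.request.Request(
        f"{base_url}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request to the running server and return the raw response text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


keys_request = build_json_post("preview_file", {"file_path": "uploads/your-file.csv"})
document_request = build_json_post(
    "preview_document",
    {"file_path": "uploads/your-file.csv", "selected_keys": ["key1", "key2"]},
)
# With the server running: print(send(keys_request))
```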

Creating Vector Embeddings 🧩

| Step | Instructions |
| --- | --- |
| Start Embedding Creation | Click the "Create Vector DB" button to start the embedding creation process. |
| View Progress | Monitor the progress of the embedding creation with a circular progress indicator. 📈 |

Command Line

To create a vector database using curl:

curl -X POST -H "Content-Type: application/json" -d '{"file_path":"uploads/your-file.csv", "selected_keys":["key1", "key2"]}' http://127.0.0.1:4000/create_vector_database
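
A progress indicator suggests that embeddings are created incrementally. One common way to do that is batch by batch, reporting a percentage after each batch. The sketch below assumes that approach and is not taken from `rag.py`: `embed_in_batches` is a hypothetical helper, and `toy_embed` is a stand-in for a real model call.

```python
def embed_in_batches(documents, embed, batch_size=32):
    """Yield (vectors_so_far, percent_done) after each batch is embedded."""
    vectors = []
    total = len(documents)
    for start in range(0, total, batch_size):
        batch = documents[start:start + batch_size]
        vectors.extend(embed(batch))
        percent = round(100 * len(vectors) / total)
        yield vectors, percent


def toy_embed(batch):
    """Stand-in for a real model: maps each text to a 2-number vector."""
    return [[float(len(text)), float(text.count(" "))] for text in batch]


docs = ["first document", "second", "a third one", "fourth", "fifth doc"]
for _, percent in embed_in_batches(docs, toy_embed, batch_size=2):
    print(f"{percent}% complete")
```

Batching also keeps memory use bounded, since only one batch of texts is passed to the model at a time.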

Querying the Vector Database 🔍

| Step | Instructions |
| --- | --- |
| Enter Query | Type your query into the input field. |
| Select Similarity Metric | Choose between cosine similarity and Euclidean distance. |
| Submit Query | Click the "Submit" button to query the vector database. |
| View Results | Inspect the results, which display the document, score, and a button to view detailed data. |

Command Line

To query the vector database using curl:

curl -X POST -H "Content-Type: application/json" -d '{"query_text":"Your query text here", "similarity_metric":"cosine"}' http://127.0.0.1:4000/query
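
The two similarity metrics rank results differently: cosine similarity compares direction only (higher is better), while Euclidean distance also accounts for magnitude (lower is better). The sketch below shows the standard definitions in plain Python. It illustrates the metrics themselves and is not the project's query code; the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query, vectors, metric="cosine"):
    """Return indices of `vectors` ordered from best to worst match."""
    if metric == "cosine":
        scores = [cosine_similarity(query, v) for v in vectors]
        return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    scores = [euclidean_distance(query, v) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i])
```

For a query `[1, 0]` and stored vectors `[0, 1]`, `[10, 0]`, and `[1, 1]`, cosine ranks `[10, 0]` first because it points the same way, while Euclidean distance ranks it last because it is far away.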

Managing the Vector Database 🛠️

| Step | Instructions |
| --- | --- |
| Backup Database | Click the "Backup Database" button to create a backup of the current vector database. |
| Delete Database | Click the "Delete Database" button to delete the current vector database. |
| View Database Statistics | View statistics such as total documents and average vector length. |

Command Line

To check if the vector database exists using curl:

curl -X GET http://127.0.0.1:4000/check_vector_db

To view database statistics using curl:

curl -X GET http://127.0.0.1:4000/db_stats

To back up the database using curl:

curl -X POST http://127.0.0.1:4000/backup_db

To delete the database using curl:

curl -X POST http://127.0.0.1:4000/delete_db
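
The four management calls above can be wrapped in a small Python client. This is a sketch using only the standard library, not code from the repository: `build_management_request` and `call` are hypothetical helpers, and the endpoint-to-method mapping is copied from the curl commands. Note that `delete_db` is destructive, so backing up first is a sensible habit.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:4000"

# Endpoint name -> HTTP method, taken from the curl commands above.
MANAGEMENT_ENDPOINTS = {
    "check_vector_db": "GET",
    "db_stats": "GET",
    "backup_db": "POST",
    "delete_db": "POST",
}


def build_management_request(endpoint, base_url=BASE_URL):
    """Build the request for one management endpoint, rejecting unknown names."""
    if endpoint not in MANAGEMENT_ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    method = MANAGEMENT_ENDPOINTS[endpoint]
    return urllib.request.Request(f"{base_url}/{endpoint}", method=method)


def call(endpoint, base_url=BASE_URL):
    """Send the request to the running server and return the raw response text."""
    request = build_management_request(endpoint, base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `call("db_stats")` returns the same response body as the corresponding curl command.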

UI Walkthrough 🎨

Feature Description
Uploading Files
  • Drag and drop a file into the upload area or click to select a file.
  • File information will be displayed after a successful upload.
Previewing Data
  • Select the keys you want to include in the embeddings.
  • View a preview of the document and generated embeddings.
Creating Vector Embeddings
  • Click the "Create Vector DB" button to start the embedding creation.
  • Monitor the progress with the circular progress indicator.
Querying the Vector Database
  • Enter your query text and select a similarity metric.
  • Click "Submit" to query the database and view the results.
Managing the Vector Database
  • Back up the database by clicking "Backup Database".
  • Delete the database by clicking "Delete Database".
  • View database statistics such as total documents and average vector length.