This Streamlit App uses Retrieval-Augmented Generation (RAG) combined with the Gemini Large Language Model (LLM) and MongoDB, a database that allows for vector storage and search. The application enables users to upload PDF files 📂, ask questions related to the content of these files ❓, and receive AI-generated answers based on the uploaded content 📚.
- Overview
- Table of Contents
- System Architecture
- How It Works and Demo
- Prepare
- Project Structure
- Deployment Steps
- Host Streamlit App
- Future Development Directions
- Contact
The diagram below illustrates the data flow through the system:
- INFORMATION EXTRACTION: I use LangChain to split the data into smaller chunks with `chunk_size=512` and `chunk_overlap=64` (these parameters can be adjusted). Then, I store the content of each chunk in the `content` column of a table and save it in a `collection` in MongoDB.
- VECTORIZATION: Here, I use the Gemini API so the application can be hosted for free on Streamlit. If you have the resources, you can use models from Hugging Face or others.
- RELEVANT DOCUMENTS RETRIEVAL: After embedding the chunks from the `content` column, I store them in the corresponding `embedding` column and create a vector search index on that column. Through vector search, I compare the similarity between the `user_query` and the data chunks from the PDF.
- LLM QUERYING: The prompt is enriched by combining `user_query + relevant_documents + history_conversation`. You can customize the number of relevant documents returned and adjust the length of the previous conversation history included in the prompt. Then, I feed this into Gemini's LLM, though you can use other models.
- STREAMLIT: The application's interface is built with Streamlit.
- Note 💡: This approach also works for data sources that are already in table format, without needing to process PDF files; see `src/load_parquet.py` and customize the columns you want to embed.
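To make the extraction and vectorization steps concrete, here is a minimal sketch of the pipeline described above: LangChain splits the text, the Gemini API embeds each chunk, and both `content` and `embedding` are written to a MongoDB collection. The function name `ingest_text`, the `models/embedding-001` model, and the pre-connected `collection` handle are assumptions for illustration, not the project's actual code.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter  # langchain_text_splitters in newer versions
import google.generativeai as genai

def ingest_text(raw_text, collection):
    """Split raw text into chunks, embed each chunk with Gemini, and store
    content + embedding together in the given MongoDB collection."""
    splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)
    chunks = splitter.split_text(raw_text)

    documents = []
    for chunk in chunks:
        # Gemini embedding call; the exact embedding model may differ in the real code
        response = genai.embed_content(
            model="models/embedding-001",
            content=chunk,
            task_type="retrieval_document",
        )
        documents.append({"content": chunk, "embedding": response["embedding"]})

    if documents:
        collection.insert_many(documents)
```

At query time, the same embedding model is applied to the user's question so that the stored vectors and the query vector live in the same space and can be compared by the vector search index.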
- You can use my application here: LLM-RAG
- Note 💡: You must delete the uploaded file before asking questions
The Streamlit LLM-RAG application interface is as follows:
- Upload PDF Document 📂: Upload the PDF file containing the data you want to enrich the model with.
- Choose a page 🔍: You can select from several pre-installed models:
- AI-Therapist: A psychological counseling chatbot trained on the mental-health-dataset.
- Vision Mamba: A chatbot that provides information related to Mamba and Vision Mamba.
- Chat with your Custom Data 💡: This is where you can submit your questions and receive answers based on the information you’ve added.
The main directories of the project are organized as follows:
```
llm_rag/
|--- .devcontainer/
|    |--- devcontainer.json          # Configuration file for the development environment
|--- data/                           # Data for the Chatbot to learn
|--- image/                          # Project image directory
|--- src/
|    |--- app.py                     # Code for the Chat with Your Custom Data application
|    |--- load_parquet.py            # Code for processing .parquet data into the database and embedding
|    |--- load_pdf.py                # Code for processing PDF data embedding and uploading to the database
|    |--- streamlit_app_mamba.py     # Code for the Q&A about Mamba application
|    |--- streamlit_app_therapist.py # Code for the Chat with AI-Therapist application
|--- .env.example                    # Sample environment variable file
|--- README.md                       # This file
|--- requirements.txt                # Libraries required for the project
```
- Python 3.9 or later
- Streamlit
- MongoDB
- Sentence Transformer (If not using the Gemini API)
- Google Generative AI
- LangChain
To deploy the project on your computer, follow these steps:
- Visit MongoDB Atlas
- Create an account, create a project, create a database, and create a collection to store your data
- Create a column in the collection that will contain the `vector embedding`
- Create an `index` for that column (an example index definition is shown after this list)
- Obtain the MongoDB URI for the database you just created (Instructions)
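For reference, a typical Atlas Vector Search index definition on the `embedding` column might look like the JSON below. This is a sketch, not the project's exact index: the number of dimensions (768 here, matching Gemini's `embedding-001` model) and the similarity metric should match whatever embedding model you actually use.

```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 768,
      "similarity": "cosine"
    }
  ]
}
```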
Create a `.env` file in your project with the following content:

```
GOOGLE_API_KEY = <Your Gemini API Key>
MONGODB_URI = <Your MongoDB URI>
EMBEDDING_MODEL = <Path to the Hugging Face embedding model> # If not using the Gemini Embedding Model
DB_NAME = <Your Database Name>
DB_COLLECTION = <Your Database Collection Name>
```
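Below is a minimal sketch of how these variables can be read and used in Python, assuming `python-dotenv`, `pymongo`, and `google-generativeai` are installed; the actual project code may wire them up differently.

```python
import os

from dotenv import load_dotenv
from pymongo import MongoClient
import google.generativeai as genai

# Read the .env file into the process environment
load_dotenv()

# Configure the Gemini client with the API key
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Connect to the MongoDB collection that stores the chunks and their embeddings
client = MongoClient(os.getenv("MONGODB_URI"))
collection = client[os.getenv("DB_NAME")][os.getenv("DB_COLLECTION")]
print(collection.estimated_document_count(), "documents in the collection")
```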
- Open the `terminal` and ensure you are in the project directory
- Set up your virtual environment using `venv` or `conda`:

```bash
# Using venv
python -m venv env_llm_rag
source env_llm_rag/bin/activate

# Using conda
conda create --name env_llm_rag
conda activate env_llm_rag
```

- Install the required libraries:

```bash
pip install -r requirements.txt
```
There are two types of data corresponding to two files:
- If your data is raw, in PDF format, use the code `src/load_pdf.py`
- If your data is already in table format, use the code `src/load_parquet.py` and customize the columns you want to embed (see the sketch after this list)
- If you want to upload data from the UI, you can skip this step
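To illustrate the table-format path, here is a rough sketch of what the column customization in a `load_parquet.py`-style script can look like. It assumes pandas, the Gemini embedding API, and an already-connected pymongo `collection`; the function name `load_parquet_to_db` and the `text_columns` parameter are illustrative, not the project's real interface.

```python
import pandas as pd
import google.generativeai as genai

def load_parquet_to_db(path, text_columns, collection):
    """Embed the chosen columns of a parquet file and store them in MongoDB.
    `text_columns` controls which columns are joined together and embedded."""
    df = pd.read_parquet(path)

    documents = []
    for _, row in df.iterrows():
        # Join the selected columns into a single text chunk per row
        content = " ".join(str(row[col]) for col in text_columns)
        response = genai.embed_content(
            model="models/embedding-001",  # assumed embedding model
            content=content,
            task_type="retrieval_document",
        )
        documents.append({"content": content, "embedding": response["embedding"]})

    if documents:
        collection.insert_many(documents)
```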
To run the file using Streamlit:
```bash
streamlit run <file_path>.py
```

- Refer to the code `src/streamlit_app_mamba.py` if your data processing is complete
- Refer to the code `src/app.py` if you want to process PDF files uploaded from the UI

The Streamlit application will be deployed at `http://localhost:8501` after running the above command.
In the `src/app.py` code, you need to adjust the `vector_search` function to match the `index` you created in the database and any related parameters.
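As a point of reference for that adjustment, a `vector_search`-style function built on Atlas's `$vectorSearch` aggregation stage, plus the prompt enrichment described earlier, might look like the sketch below. It is not the project's actual implementation: the index name `vector_index`, the embedding model, the Gemini model name, and the `answer` helper are placeholders to adapt to your own setup.

```python
import google.generativeai as genai

def vector_search(user_query, collection, index_name="vector_index", k=4):
    """Embed the query and return the k most similar chunks via Atlas Vector Search."""
    query_embedding = genai.embed_content(
        model="models/embedding-001",      # assumed embedding model
        content=user_query,
        task_type="retrieval_query",
    )["embedding"]

    pipeline = [
        {
            "$vectorSearch": {
                "index": index_name,       # must match the index created in Atlas
                "path": "embedding",       # the column that holds the vectors
                "queryVector": query_embedding,
                "numCandidates": 100,
                "limit": k,                # number of relevant documents returned
            }
        },
        {"$project": {"_id": 0, "content": 1}},
    ]
    return [doc["content"] for doc in collection.aggregate(pipeline)]

def answer(user_query, collection, history=""):
    """Combine user_query + relevant_documents + history_conversation and query Gemini."""
    relevant_documents = "\n".join(vector_search(user_query, collection))
    prompt = (
        f"Conversation so far:\n{history}\n\n"
        f"Context from the uploaded document:\n{relevant_documents}\n\n"
        f"Question: {user_query}"
    )
    model = genai.GenerativeModel("gemini-pro")  # swap in whichever Gemini model you use
    return model.generate_content(prompt).text
```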
Hosting a Streamlit app for free:
- Make sure the repository includes a `requirements.txt` file and a `.py` file.
- Create a Streamlit account and link it to your GitHub account.
- Click on `Create App`.
- Fill in the corresponding fields:
- Select Advanced Settings and add your environment variables here:
Deploy, and you have successfully hosted your Streamlit App. You can use the app via a link like `<your-domain>.streamlit.app`.
Since this is a free plan, the resources provided by Streamlit are limited, so it is advisable to use an embedding model served through an API key (such as the Gemini API) rather than loading a local model.
The project plans to add a sign language recognition feature using AI and Computer Vision to capture real-time video from users. The system will recognize sign language, translate it into complete sentences, and input it into the chatbot system without the need for the user to type.
A demo of the sign language recognition feature can be found in the `sign_language_translation` folder.
The project plans to add a speech recognition feature that will translate spoken words into complete sentences and input them into the chatbot system.
Currently, this feature has not yet been developed.