This project is a local Retrieval-Augmented Generation (RAG) pipeline built from scratch, without frameworks such as LangChain. The pipeline is connected to a local LLM and is deployed as a chatbot via Gradio. The source material is "Human Nutrition: 2020 Edition".
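The retrieval step at the core of such a pipeline can be sketched with plain NumPy: embed the query, score it against precomputed chunk embeddings by cosine similarity, and keep the top matches. The function name and toy vectors below are illustrative, not the project's actual code:

```python
import numpy as np

def retrieve(query_emb: np.ndarray, chunk_embs: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k chunks most similar to the query (cosine similarity)."""
    # Normalise both sides so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    c = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    scores = c @ q
    # Highest-scoring chunk indices first.
    return np.argsort(scores)[::-1][:k]

# Toy example: 4 chunk embeddings in 2 dimensions.
chunks = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
query = np.array([1.0, 0.1])
print(retrieve(query, chunks, k=2))  # → [0 2] (chunk 0 most similar, then chunk 2)
```

In the real pipeline the embeddings would come from all-mpnet-base-v2 (768 dimensions) rather than these toy vectors.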
- Embedding Model: all-mpnet-base-v2
- LLM Model: Gemma instruction-tuned (the specific variant is selected automatically based on hardware capabilities)
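Hardware-based variant selection can be sketched as a simple decision on available GPU memory, choosing between the 2B and 7B instruction-tuned checkpoints and whether to 4-bit quantise (via bitsandbytes). The thresholds below are illustrative assumptions, not necessarily the project's exact cut-offs:

```python
def pick_gemma_variant(gpu_memory_gb: float) -> tuple[str, bool]:
    """Return (model id, use 4-bit quantisation) for the available GPU memory.

    Thresholds are illustrative assumptions, not the project's exact cut-offs.
    """
    if gpu_memory_gb >= 19:
        return "google/gemma-7b-it", False   # 7B in full/half precision
    if gpu_memory_gb >= 8.1:
        return "google/gemma-7b-it", True    # 7B, 4-bit quantised
    if gpu_memory_gb >= 5.1:
        return "google/gemma-2b-it", False   # 2B in full/half precision
    return "google/gemma-2b-it", True        # 2B, 4-bit quantised

print(pick_gemma_variant(12.0))  # → ('google/gemma-7b-it', True)
```

In practice the available memory would be read from `torch.cuda` before loading the model with transformers/accelerate.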
- PyMuPDF==1.23.26
- matplotlib==3.8.3
- numpy==1.26.4
- pandas==2.2.1
- Requests==2.31.0
- sentence_transformers==2.5.1
- spacy
- tqdm==4.66.2
- transformers==4.38.2
- accelerate
- bitsandbytes
- jupyter
- wheel
- gradio
- huggingface-hub
Type these commands in a terminal/cmd/conda prompt:
conda env create -f environment.yml
pip install -r requirements.txt
python main.py
or python app.py
Here app.py contains the Gradio deployment for this project, and main.py runs the project through user input in the terminal.
- Upload run_on_colab.ipynb to Google Colab
- Clone this repo to Google Drive
- Open run_on_colab.ipynb
- Adjust the path in Google Colab accordingly
- Run the cell blocks
This project requires a CUDA-compatible GPU to run
- Enable the chatbot to also use the query history when responding
- Improve text preprocessing to get better RAG performance
- Integrate a re-ranker model to get better RAG results
- Improve the prompt
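A baseline prompt-assembly step, of the kind the last item above aims to improve, can be sketched as below; the template wording is a generic assumption, not the project's actual prompt:

```python
def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble a RAG prompt: retrieved passages followed by the user query.

    The template is a generic sketch, not the project's exact wording.
    """
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the query using only the following context from the textbook.\n"
        f"Context:\n{context}\n"
        f"Query: {query}\n"
        "Answer:"
    )

print(build_prompt("What are macronutrients?",
                   ["Carbohydrates, proteins and fats provide energy."]))
```

Improvements here could include few-shot examples, Gemma's chat turn markers, and instructions for handling questions the context cannot answer.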
- Many thanks to Daniel Bourke for the video guidance on this project
- Many thanks to the University of Hawai‘i at Mānoa Food Science and Human Nutrition Program for the open-source textbook "Human Nutrition: 2020 Edition", which was used as the source material for this project