This project is a chatbot powered by OpenAI's GPT-3.5. It allows users to upload PDF files and ask questions about their content. The chatbot uses vector storage and retrieval to find relevant information within the uploaded documents and generate accurate responses.
Demo video: demo2.mp4
- LangChain for text processing, document loading, and building the question-answering chain.
- GPT-3.5 as the large language model.
- OpenAI Embeddings for text embeddings.
- Chroma as the embedding database.
- Streamlit for the user interface.
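The components above fit together as a standard retrieval-augmented QA pipeline. The sketch below illustrates the general flow under assumed chunking parameters and recent LangChain module paths; it is not the repository's exact code (see helpers.py for that):

```python
# Illustrative retrieval-augmented QA pipeline: load a PDF, split it into chunks,
# embed the chunks into a Chroma vector store, and answer questions with GPT-3.5.
# "example.pdf" and the chunking parameters are placeholders, not values from this repo.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

docs = PyPDFLoader("example.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings(api_key="YOUR_API_KEY"))
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", api_key="YOUR_API_KEY")

qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa_chain.invoke({"query": "What is this document about?"})["result"])
```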
- Clone the repository:
git clone https://github.com/mosheragomaa/Document_QA_LangChain_GPT.git
cd Document_QA_LangChain_GPT
- Install the required dependencies:
pip install -r requirements.txt
Note: To run this project, you will need to create an OpenAI API key and add it to the code files as follows:
- Open streamlit.py and replace the api_key value with your API key in the following code:
if "llm_model" not in st.session_state: st.session_state["llm_model"] = ChatOpenAI(model="gpt-3.5-turbo-0125", api_key= "YOUR_API_KEY")
- Open helpers.py and replace the api_key value with your API key as follows:
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", api_key="YOUR_API_KEY")
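Alternatively, if you prefer not to hardcode the key, ChatOpenAI falls back to the OPENAI_API_KEY environment variable when api_key is not passed, so you can export the key in your shell and drop the argument:

```python
# Assumes OPENAI_API_KEY is already set in the environment (e.g. export OPENAI_API_KEY=sk-...).
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")  # API key is read from OPENAI_API_KEY
```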
- Run the Streamlit app:
streamlit run streamlit.py
- Upload one or more PDF files using the file uploader.
- Once the PDFs are uploaded, you can start asking questions about their content in the chat interface. The chatbot will generate responses based on the relevant information found in the uploaded PDFs (see the sketch below for a rough picture of this flow).
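For orientation, the chat loop in streamlit.py follows the usual Streamlit upload-then-chat pattern. The sketch below is only an approximation: build_qa_chain is a hypothetical stand-in for the helper functions in helpers.py (for example, a chain built as in the pipeline sketch above):

```python
# Sketch of the upload-then-chat flow; `build_qa_chain` is a hypothetical helper,
# not necessarily the function name used in helpers.py.
import streamlit as st

uploaded_files = st.file_uploader("Upload PDFs", type="pdf", accept_multiple_files=True)

if uploaded_files and "qa_chain" not in st.session_state:
    st.session_state["qa_chain"] = build_qa_chain(uploaded_files)  # hypothetical helper

if question := st.chat_input("Ask a question about your PDFs"):
    st.chat_message("user").write(question)
    answer = st.session_state["qa_chain"].invoke({"query": question})["result"]
    st.chat_message("assistant").write(answer)
```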
- streamlit.py: The main Streamlit app file that handles user interaction and file uploads.
- helpers.py: Contains helper functions for loading PDFs, splitting text, creating vector stores, and building the question-answering chain.
- requirements.txt: List of required Python packages.
Contributors are welcome to add:
- Chat history feature.
- Summarization feature.
- A feature to surface the source documents used for each answer.