You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A streamlit app that enables users to interact with the uploaded PDF. You can ask questions or doubts regarding the PDF and our Chatbot would answer them with a friendly response.
Tech stack
🐍Python
🛑🔥Streamlit
🦜️🔗Langchain
🔰Weaviate
❇️OpenAI
🆚Git & Github
🤗Hugging Face (used for testing purpose)
🥭MongoDB (used for testing purpose)
Demo App
Working
Let's breakdown the working of the app into chunks to make it easier to understand:
Upload the PDF
Extract the text from the PDF file
Generate embeddings of the text
Store the embeddings in the vectorstore
Retrieve the closest match
Display the results in a Chatbot (Interface)
Upload the PDF
It has to be a file with .pdf extension and it must be within 15 MB for time being.
Then this file will be used for further processing.
Extract the text from the PDF file
- We need to extract the text from the PDF for which we use [PyPDF2](https://pypdf2.readthedocs.io/en/3.0.0/) library and does its part really well and quick.
Generate embeddings of the text
- We are then using generated text and to split the text into small chunks and create documents and are fed as input into the OpenAI Embedding library.
Store the embeddings in the vectorstore
- We are storing the embeddings into the Weaviate vectorstore where we have a certain schema to maintain modularity and all the embeddings are stored there.
Retrieve the closest match
- We then run the Weaviate hybrid search on the schema, using Langchain and OpenAI that will return the closest match
Display the results in a Chatbot (Interface)
- Finally we display the results as a chat like interface provided by Streamlit