doc-query is a sophisticated AI-powered document query system designed to facilitate seamless interaction with PDF documents. Users can upload PDF files and ask questions related to the content of those documents. Leveraging state-of-the-art technologies, the system provides real-time responses to user queries, ensuring an intuitive and efficient user experience.
- Upload Document: Users start by uploading a PDF document to the system. The document is stored securely in an Amazon S3 bucket.
- Indexing with OpenAI API: Upon upload, the system generates an index vector for the document using the OpenAI API. This index vector is a representation of the document's content and is crucial for accurate query responses.
- Storage in Pinecone Vector Database: The index vector is stored in the Pinecone vector database, ensuring efficient retrieval and comparison during query processing.
- User Query: Once the document is indexed, users can pose questions based on the content of the PDF. These queries are processed in real-time.
- Query Processing: To answer the user's query, the system first identifies similarities between the query and the indexed documents stored in the Pinecone database. This narrowed-down set of documents serves as the context for the subsequent step.
- AI Response Generation: The system utilizes a GPT model, taking into account the identified context (similar documents), previous conversations, and the original user query. The model generates a response tailored to the user's query and context.
- Storage and Presentation: The generated response from the AI model is stored in the database for future reference and is promptly displayed to the user. This ensures a seamless and efficient interaction flow.
- Real-time AI chatbot functionality for querying PDF documents.
- Utilization of advanced AI technologies for accurate and context-aware responses.
- Modern user interface for an enhanced user experience.
- Secure document storage using Amazon S3.
- Efficient indexing and retrieval using Pinecone vector database.
- Upload: Upload your PDF document through the provided interface.
- Query: Ask questions related to the uploaded document.
- Receive Response: Get real-time responses generated by the AI model based on the document's content and context.
We welcome any feedback or suggestions for improving the doc-query system. Feel free to open an issue on GitHub with your thoughts and ideas.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.