AI Assisted PDF Reader

This is a small Python utility for reading, summarizing, and asking questions about PDF documents using the OpenAI APIs.

Here is a short video demonstrating loading, batch_summarizing, vectorizing, and asking questions about a PDF document.

Usage Guide

To use this tool you must have an OpenAI API key, because OpenAI models are used for both text generation and embeddings.

The environment variable OPENAI_API_KEY must be set to your API key for this tool to work.
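A minimal sketch of how a script could check for the key before doing any work, so a missing variable fails early with a clear message. The helper name `require_api_key` is hypothetical and is not taken from doc_assist.py.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of doc_assist.py.
    """
    # Allow a custom mapping to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it before running the tool."
        )
    return key
```

Calling `require_api_key()` with no arguments reads from the real process environment.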

Clone the repository locally.
(Optional, but recommended) Set up a Python virtual environment: virtualenv venv && source venv/bin/activate
Install the Python requirements: pip install -r requirements.txt
Place the PDF you want to read in the same directory. (It can be anywhere on your filesystem, but a local path is easier to type.)
Run the tool: python doc_assist.py

After these steps you will see a prompt, doc assist:, that you can use to interact with your documents.

Tool Commands

load

doc assist: load <path_to_pdf_file>

This loads the PDF file into memory and is the required first step.

read

doc assist: read <page_num>

This reads the text of a given page of the PDF document, and is useful to quickly inspect pages on the fly.

summarize

doc assist: summarize <page_num>

Generates a short summary for a specific page.

batch_summarize

doc assist: batch_summarize <start_page> <end_page>

start_page and end_page are optional. If they are not specified, the entire document is used.

This uses a MapReduce approach: it first generates a summary of each page, then iteratively combines those page summaries into a holistic summary of the selected range.
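The general map-then-reduce pattern can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: `map_reduce_summarize`, `group_size`, and the toy `first_sentence` summarizer are all hypothetical names, and a real run would pass a function that calls a language model instead.

```python
from typing import Callable, List


def map_reduce_summarize(
    pages: List[str],
    summarize: Callable[[str], str],
    group_size: int = 4,
) -> str:
    """Summarize each page, then merge summaries in groups until one remains."""
    if not pages:
        return ""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")

    # Map step: one summary per page.
    summaries = [summarize(page) for page in pages]

    # Reduce step: merge neighbouring summaries until a single one is left.
    while len(summaries) > 1:
        summaries = [
            summarize("\n".join(summaries[i:i + group_size]))
            for i in range(0, len(summaries), group_size)
        ]
    return summaries[0]


def first_sentence(text: str) -> str:
    """Toy stand-in for a model call: keep the text up to the first period."""
    return text.split(".")[0].strip() + "."
```

Merging in small groups keeps each intermediate input short, which matters when the summarizer has a limited context window.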

vectorize

doc assist: vectorize

This creates a ChromaDB collection for the PDF document, creates embeddings for the document pages, and persists the content for later use.

ask

doc assist: ask <query>

Ask questions about the content of the PDF document. vectorize must be run first.
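The reason vectorize must come first is that question answering of this kind usually starts by finding the pages whose embeddings are closest to the query. The sketch below shows that retrieval step in plain Python. It is an assumption-laden illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model, the ranking is done in memory rather than in ChromaDB, and none of these function names come from doc_assist.py.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(token, 0.0) for token, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_pages(query: str, pages: List[str], k: int = 2) -> List[Tuple[int, float]]:
    """Return (page_index, score) for the k pages most similar to the query."""
    query_vec = toy_embed(query)
    scored = [(i, cosine(query_vec, toy_embed(p))) for i, p in enumerate(pages)]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```

In a full pipeline, the text of the top-ranked pages would typically be passed to the language model as context alongside the question.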

GallagherSam/pdf-doc-assist
