Image-text OCR

This project uses vision-language models to recognize text in PDF files. Currently, only GOT_OCR2.0 is supported. The goal is to extract the text from page images and convert it into a formatted text file (LaTeX, HTML, or Markdown).

Requirements

Python 3.10+
PyTorch 2.4+
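
The two minimums above can be checked before starting anything heavy. The following is a minimal sketch, not part of this repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and it assumes that comparing only the major and minor version numbers is enough.

```python
import re
import sys


def version_tuple(version):
    """Parse the leading 'major.minor' of a version string such as '2.4.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum):
    """Return True if the version string is at least the (major, minor) minimum."""
    return version_tuple(version) >= minimum


def check_environment():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < (3, 10):
        problems.append("Python 3.10 or newer is required")
    try:
        import torch  # imported lazily so the check itself runs without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, (2, 4)):
            problems.append("PyTorch 2.4 or newer is required")
    return problems
```

Calling `check_environment()` inside the virtual environment returns an empty list when both requirements are met.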

Usage

Create and activate a Python virtual environment: python -m venv venv, then source venv/bin/activate (on Windows: venv\Scripts\activate)
Install the required packages: pip install -r requirements.txt
Start the server: python server.py
Process a file: python main.py docs/test1.pdf

The converted files will be saved in the docs directory.
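
To process several PDFs in one go, the documented `python main.py <file>` command can be wrapped in a small script. This is only a sketch under assumptions: `build_commands` and `run_all` are hypothetical helpers that are not part of this repository, the script is assumed to run from the repository root, and the server is assumed to be running already.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_dir, script="main.py"):
    """Return one `python main.py <pdf>` command per PDF found in pdf_dir."""
    pdfs = sorted(Path(pdf_dir).glob("*.pdf"))
    # sys.executable keeps the calls inside the active virtual environment.
    return [[sys.executable, script, str(pdf)] for pdf in pdfs]


def run_all(pdf_dir):
    """Run the commands one after another and stop on the first failure."""
    for cmd in build_commands(pdf_dir):
        subprocess.run(cmd, check=True)
```

For example, `run_all("docs")` would process every PDF in the docs directory in alphabetical order.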

scholarsportal/image-text-ocr
