This repository has been archived by the owner on Dec 31, 2023. It is now read-only.

This is a basic API server for OCRmyPDF (using FastAPI framework).

exotic-matter-sas/api-ocrmypdf

OCRmyPDF - API (using FastAPI)

API server for OCRmyPDF

OCRmyPDF is a command-line tool that applies optical character recognition (OCR) to PDF documents. This project provides a basic API for running OCRmyPDF as a server.

It was built mainly for use with Paper Matter, an easy-to-use document management system.

Fair warning: this project is a work in progress.

Usage (Docker)

https://hub.docker.com/r/exoticmatter/api-ocrmypdf

docker run -p "8000:8000" exoticmatter/api-ocrmypdf

Then open http://127.0.0.1:8000/docs in your browser.
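The interactive docs page is the place to find the actual endpoints. As a rough sketch, the same information can be read from a script. This assumes the server keeps FastAPI's default schema location, /openapi.json, which this README does not confirm. The function names are illustrative and not part of this project.

```python
import json
from typing import Dict, List
from urllib.request import urlopen


def fetch_openapi(base_url: str = "http://127.0.0.1:8000") -> Dict:
    """Download the OpenAPI schema.

    Assumption: the app uses FastAPI's default schema path, /openapi.json.
    """
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


def list_operations(spec: Dict) -> List[str]:
    """Return 'METHOD /path' strings for every operation in the schema."""
    http_methods = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        # A path item can also hold non-operation keys such as "parameters".
        for method in sorted(item):
            if method in http_methods:
                operations.append(f"{method.upper()} {path}")
    return operations
```

With the container running, `print("\n".join(list_operations(fetch_openapi())))` should print one line per endpoint.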

Settings

Some settings can be configured via environment variables. The full list is in settings.py; the environment variable for a setting is the setting's name in upper case.
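The naming rule can be illustrated with a small sketch. It is not the project's own settings loader, and the helper names are hypothetical. The lower-case setting names in the usage note are inferred from the variables listed in the Windows section of this README.

```python
import os
from typing import Mapping, Optional


def env_name(setting: str) -> str:
    """Map a setting name to its environment variable name (upper case)."""
    return setting.upper()


def read_setting(
    setting: str,
    default: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Look up a setting in the environment, falling back to a default."""
    return environ.get(env_name(setting), default)
```

For example, `env_name("workdir")` gives `"WORKDIR"`, and `read_setting("enable_wsl_compat", default="0")` returns the exported value if there is one, otherwise `"0"`.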

Development

  • Python 3.7+
  • OCRmyPDF 12+ (refer to the OCRmyPDF docs for installation instructions)

pip install -r requirements.txt
pip install -r requirements_dev.txt
uvicorn main:app --reload

On Windows

Development on Windows (excluding OCRmyPDF itself) is possible if you use WSL with a compatible Linux distro such as Ubuntu (available on the Windows Store) and install OCRmyPDF inside WSL. You will also need to configure a few api-ocrmypdf settings for this to work.

Environment variables to set before starting the server

  • BASE_COMMAND_OCR = wsl [path to ocrmypdf bin inside WSL]
  • WORKDIR = [absolute directory to the work directory]
  • ENABLE_WSL_COMPAT = 1
Example
BASE_COMMAND_OCR=wsl /home/mywinuser/.local/bin/ocrmypdf
WORKDIR=C:\dev\api-ocrmypdf\workdir
ENABLE_WSL_COMPAT=1
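One way to apply these variables without touching the global Windows environment is a small launcher script. This is a sketch, not something the project ships. The values are modelled on the example above and must be adjusted to your machine. `build_env` and `launch` are hypothetical helper names, and the script assumes uvicorn is installed in the current Python environment.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping

# Example values only; replace the paths with your own.
WSL_SETTINGS = {
    "BASE_COMMAND_OCR": "wsl /home/mywinuser/.local/bin/ocrmypdf",
    "WORKDIR": r"C:\dev\api-ocrmypdf\workdir",
    "ENABLE_WSL_COMPAT": "1",
}


def build_env(
    base: Mapping[str, str],
    overrides: Mapping[str, str] = WSL_SETTINGS,
) -> Dict[str, str]:
    """Return a copy of the base environment with the WSL settings applied."""
    env = dict(base)
    env.update(overrides)
    return env


def launch() -> int:
    """Start the development server with the WSL settings in its environment."""
    command = [sys.executable, "-m", "uvicorn", "main:app", "--reload"]
    return subprocess.run(command, env=build_env(os.environ)).returncode
```

Calling `launch()` from the repository root starts the server; the variables apply only to that process and its children.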

Tests

python -m pytest

License

MIT. Be aware that OCRmyPDF is covered by its own licenses because of its multiple dependencies.
