Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Updated Dec 20, 2024 - HTML
A Repo For Document AI
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Detectron2 for Document Layout Analysis
[Late Submission] Solution for Kuzushiji recognition (Kaggle competition)
Visual Domain Knowledge-based Multimodal Zoning Textual Region Localization in Noisy Historical Document Images
Extracting structured text from GI Bill index cards for JDoc 2023 paper
Analyze document image complexity based on segmentation results
Matrix Representation reformats images as RDF using natural ⨯ natural coordinates as a Media-Signature-Record / Structured-Data-Description. It is a positive, productive, and pragmatic introduction to semantic-web programming.