PDF Liberation
- 2 followers
- Washington, DC; San Francisco; New York; Chicago
- http://pdfliberation.github.io/
Popular repositories
- whatwordwhere Public Forked from jsfenfen/whatwordwhere
Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the HOCR standard.
- pdf_table_extraction Public
Experimenting with pdf2text and Python pdf-table-extract
- Jersey-City-Budget-PDF-Liberation Public
This project will liberate data from PDF files found on http://www.cityofjerseycity.com/pub-info.aspx?id=2430 and will create .csv and .json files to be uploaded to https://data.openjerseycity.org/…
- financial_disclosure_scraping Public
(DC team) Experimenting with available options for extracting info from PFDs
Repositories
- whatwordwhere Public Forked from jsfenfen/whatwordwhere
Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the HOCR standard.
- OCRToolkit Public Forked from opensecrets/OCRToolkit
Tools for working with Optical Character Recognition output
- Jersey-City-Budget-PDF-Liberation Public
This project will liberate data from PDF files found on http://www.cityofjerseycity.com/pub-info.aspx?id=2430 and will create .csv and .json files to be uploaded to https://data.openjerseycity.org/dataset/jersey-city-2013-budget-adopted-spending
- NYCEDCprosedatascraper Public
This uses regular expressions (in PHP, but any language would work) to get data from the NYC EDC newsletters