    Repositories list

    • knowledge

      Public
      A place to collect and share knowledge about liberating data from PDFs
      Shell
      The Unlicense
      75311 Updated Jan 30, 2022
    • Python tool for converting hOCR files to geographic file formats
      Python
      BSD 3-Clause "New" or "Revised" License
      0430 Updated Aug 14, 2014
    • USAID-DEC

      Public
      Data from the United States Agency for International Development (USAID) Development Experience Clearinghouse (DEC).
      6100 Updated Apr 7, 2014
    • package to convert pdftotext bbox xhtml output to geojson
      MIT License
      0130 Updated Feb 23, 2014
    • Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the hOCR standard.
      Python
      162200 Updated Feb 23, 2014
    • Tools for working with Optical Character Recognition output
      Python
      BSD 3-Clause "New" or "Revised" License
      6400 Updated Feb 17, 2014
    • Amnesty International Torture data
      Java
      0300 Updated Feb 9, 2014
    • This project will liberate data from PDF files found on http://www.cityofjerseycity.com/pub-info.aspx?id=2430 and will create .csv and .json files to be uploaded to https://data.openjerseycity.org/dataset/jersey-city-2013-budget-adopted-spending
      Python
      1600 Updated Jan 25, 2014
    • Homepage for this organization
      CSS
      Other
      0200 Updated Jan 24, 2014
    • This uses regular expressions (in PHP, but any language would work) to get data from the NYC EDC newsletters
      PHP
      MIT License
      0100 Updated Jan 22, 2014
    • (DC team) experimenting with available options for extracting info from PDFs
      Python
      2400 Updated Jan 20, 2014
    • housedisc

      Public
      Java
      3001 Updated Jan 20, 2014
    • PDF Liberation Hackathon - http://pdfliberation.wordpress.com/
      Python
      Apache License 2.0
      2100 Updated Jan 20, 2014
    • R
      3200 Updated Jan 20, 2014
    • Resources related to PDF Liberation hackathon
      111210 Updated Jan 19, 2014
    • 0000 Updated Jan 19, 2014
    • experimenting with pdf2text and python pdf-table-extract
      JavaScript
      MIT License
      31100 Updated Jan 19, 2014
    • Crime Statistics for the State of Utah
      The Unlicense
      2000 Updated Jan 19, 2014
    • displaying various PDF liberation tools at the PDF Liberation Hackathon
      GNU General Public License v3.0
      1000 Updated Jan 18, 2014
    • assembly

      Public
      A forum of sorts. Where we gather to discuss Issues.
      0130 Updated Jan 17, 2014