Python script and documentation adapted from yaml-generator-for-hathitrust, created by ruthtillman.
This repository includes a Python script which uses an input CSV file to generate a YAML metadata file or files for uploading material to the HathiTrust digital library.
The script and documentation were adapted from the GitHub repository referenced above to update and customize them for use at the University of Washington Libraries Preservation Services unit.
The overall workflow is as follows; it is described in detail in the how-to file provided in this repository:
- Install Python 3.x on the computer that will be used to generate YAML files, and download the csv-to-yml.py script.
- Create a copy of the data-entry spreadsheet template and enter information about the digitized item, the digital capture process, etc.
- Save the completed spreadsheet containing information about one or more digitized items as a CSV file.
- Run the csv-to-yml.py script.
- When prompted, enter the filepath to the saved CSV file.
- When prompted, enter the filepath where generated YAML files should be saved.
- Check the generated YAML files, then package them along with page image files, OCR files, etc. for upload to HathiTrust.
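
The CSV-to-YAML step in the workflow above can be sketched roughly as follows. This is a minimal illustration, not the code in csv-to-yml.py: the column names (identifier, capture_date, scanner_make), the function names, and the one-file-per-row naming are assumptions made for the example, and the real spreadsheet template and script may differ. It uses only the Python standard library and writes each value as a double-quoted YAML string.

```python
import csv
import json
from pathlib import Path


def row_to_yaml(row):
    """Render one CSV row (a dict) as simple `key: "value"` YAML lines."""
    lines = []
    for key, value in row.items():
        # Hypothetical convention: the "identifier" column names the output
        # file rather than appearing in it; blank cells are skipped.
        if not key or key == "identifier" or not value:
            continue
        # A JSON string literal is also a valid double-quoted YAML scalar.
        lines.append(f"{key}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


def csv_to_yaml_files(csv_file, output_dir):
    """Write one .yml file per CSV row and return the paths written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in csv.DictReader(csv_file):
        target = output_dir / f"{row['identifier']}.yml"
        target.write_text(row_to_yaml(row), encoding="utf-8")
        written.append(target)
    return written


def prompt_and_run():
    """Ask for the two filepaths, as the workflow describes, then convert."""
    csv_path = input("Filepath to the saved CSV file: ").strip()
    out_dir = input("Filepath where YAML files should be saved: ").strip()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for path in csv_to_yaml_files(handle, out_dir):
            print(f"Wrote {path}")
```

Calling prompt_and_run() walks through the two prompts and writes the files. Keeping the conversion in csv_to_yaml_files, separate from the prompts, makes it possible to check the output on a small sample CSV before packaging anything for upload.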