Automated workflow for harvesting, transforming, and indexing metadata using metha, OpenRefine, and Solr. Part of the FID Romanistik software stack.
See the upstream Git repository HOS-MetadataTransformations for use case, features, and reuse.
Tested with Ubuntu 16.04 LTS and Ubuntu 18.04 LTS.
Install git:
sudo apt install git
Clone this Git repository:
git clone https://github.com/subhh/FID-Romanistik-MetadataTransformations.git
cd FID-Romanistik-MetadataTransformations
Install openjdk-8-jre-headless, curl, jq, metha 1.29, OpenRefine 2.8, openrefine-client 0.3.4 and Solr 7.3.1:
sudo ./install.sh
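To confirm that the command-line prerequisites are reachable after installation, a minimal sketch like the following may help. It assumes that java, curl and jq end up on the PATH, which this README does not state explicitly; `have_tool` is a hypothetical helper and not part of this repository.

```shell
#!/bin/sh
# Sketch only: check that the basic command-line tools are on the PATH.
# have_tool is a hypothetical helper, not part of this repository.
have_tool() {
    # command -v succeeds only if the shell can resolve the name
    command -v "$1" >/dev/null 2>&1
}

# Assumption: these three tools are installed system-wide by install.sh.
for tool in java curl jq; do
    if have_tool "$tool"; then
        echo "found   $tool"
    else
        echo "missing $tool"
    fi
done
```

Any line starting with "missing" suggests re-running the installation step before continuing.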
Configure Solr schema:
./init-solr-schema.sh
After the first run, data will be available at:
- Solr admin: http://localhost:8983/solr/#/fid
- Solr browse: http://localhost:8983/solr/fid/browse
- OpenRefine: http://localhost:3333
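A quick way to see whether the services listed above are answering is to request each URL with curl. The following is a sketch under the assumption that both services run locally on the default ports shown; `check_url` is a hypothetical helper and not part of this repository.

```shell
#!/bin/sh
# Sketch only: report whether a URL answers with a successful HTTP status.
# check_url is a hypothetical helper, not part of this repository.
check_url() {
    # -s: silent, -f: treat HTTP errors as failure,
    # -o /dev/null: discard the body, --max-time: give up after 5 seconds
    if curl -sf -o /dev/null --max-time 5 "$1"; then
        echo "up   $1"
    else
        echo "down $1"
    fi
}

check_url "http://localhost:8983/solr/fid/browse"
check_url "http://localhost:3333"
```

A "down" result before the first run is expected if the services have not been started yet.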
Run the workflow with the data source "dialnet-tesis" and load data into the local Solr (-s) and the local OpenRefine service (-d):
bin/dialnet-tesis.sh -s http://localhost:8983/solr/fid -d http://localhost:3333
Run the workflow with all data sources in parallel and load data into the local Solr (-s) and the local OpenRefine service (-d):
./run.sh -s http://localhost:8983/solr/fid -d http://localhost:3333
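After a run, the number of indexed records can be checked through Solr's standard select handler. This is a sketch, not part of the workflow scripts: `count_url` and `solr_count` are hypothetical helpers, and the core URL is assumed to be the local one used above.

```shell
#!/bin/sh
# Sketch only: count the documents in a Solr core.
# count_url and solr_count are hypothetical helpers, not part of this repository.

# Build a query URL that matches all documents but returns no rows,
# so the response carries only the total count.
count_url() {
    printf '%s/select?q=*:*&rows=0&wt=json\n' "$1"
}

# Fetch the response and extract the total with jq.
solr_count() {
    curl -s "$(count_url "$1")" | jq '.response.numFound'
}
```

Calling `solr_count http://localhost:8983/solr/fid` should then print the total number of documents in the local core.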
Run the workflow with all data sources and load data into an external Solr core:
./run.sh -s "http://..."