This repository has been archived by the owner on Feb 2, 2022. It is now read-only.
Augmenter Importer specification
Rémy Greinhofer edited this page Apr 19, 2019 · 5 revisions
The goal of the augmenters is to automatically augment the raw data sets with extra information.
An example is the geocoding augmenter which adds coordinates to a fatality entry.
An augmenter reads a data set, uses the content to perform operations or to query services, and returns the augmented data.
- Augmenters must have the option to be piped together:
cat fatalities-2019-raw.json | scrapd-augmenter-geocoding-geocensus | scrapd-augmenter-geocoding-tamu
- Augmenters must have the option to update the data in place:
python scrapd-augmenter-geocoding-geocensus.py -i fatalities-all-augmented.json
- Augmenters must be able to read from `stdin`.
- Augmenters must be written in Python or Go.
- Augmenters written in Python should not have any external dependencies other than those ScrAPD already uses.
- Augmenters must have unit and integration tests.
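The requirements above can be illustrated with a minimal Python sketch that uses only the standard library. It reads the data set from `stdin` by default, or updates a file in place when `-i` is given, as in the command shown earlier. The `augment` function and the `augmented` field are hypothetical placeholders, not part of ScrAPD; a real augmenter would query a service there instead.

```python
import argparse
import json
import sys


def augment(entries):
    """Return new entries carrying a hypothetical 'augmented' flag.

    Placeholder logic only: a real augmenter would add data such as
    coordinates obtained from a geocoding service.
    """
    return [{**entry, "augmented": True} for entry in entries]


def run(argv, stdin=sys.stdin, stdout=sys.stdout):
    """Read a JSON list of objects, augment it, and write it back."""
    parser = argparse.ArgumentParser(description="Sketch of an augmenter.")
    parser.add_argument(
        "-i",
        "--in-place",
        metavar="FILE",
        help="update FILE in place instead of reading from stdin",
    )
    args = parser.parse_args(argv)

    if args.in_place:
        # In-place mode: load the file, then overwrite it with the result.
        with open(args.in_place, encoding="utf-8") as f:
            entries = json.load(f)
        with open(args.in_place, "w", encoding="utf-8") as f:
            json.dump(augment(entries), f, indent=2)
    else:
        # Pipe mode: read from stdin, write to stdout so that another
        # augmenter can consume the output.
        json.dump(augment(json.load(stdin)), stdout, indent=2)
        stdout.write("\n")
```

Calling `run(sys.argv[1:])` at the bottom of a script would expose this as a command-line tool. Because pipe mode emits the same JSON list format it consumes, several such tools can be chained.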
- Input: a JSON file representing the data set.
- The internal format of the file is a list of objects.
- Output: the augmented data, written on-screen or in place.
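As a rough illustration of the data format described above, the following sketch checks that a JSON document is a list of objects before any processing takes place. The function name `load_data_set` is hypothetical, and the specification does not say how invalid input should be handled; raising `ValueError` is an assumption made here.

```python
import json


def load_data_set(text):
    """Parse JSON text and ensure it is a list of objects."""
    data = json.loads(text)
    is_list_of_objects = isinstance(data, list) and all(
        isinstance(item, dict) for item in data
    )
    if not is_list_of_objects:
        # Assumed behavior: reject anything that is not a list of objects.
        raise ValueError("expected a JSON list of objects")
    return data
```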
- The general format is: `scrapd-{type}-{operation_or_datatype}-{service}`.
  - `type`: tool type (`augmenter` or `importer`)
  - `operation_or_datatype`: the type of operation performed by the augmenter or the type of data added to the data set
  - `service`: the name of the service used to perform the operation or retrieve the data
- All the components of the name must be in lower case.
- scrapd-augmenter-geocoding-geocensus
- scrapd-augmenter-geocoding-tamu
- scrapd-importer-dataset-apd
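The naming convention can be checked mechanically. The sketch below parses a tool name into its components with a regular expression. It assumes each component consists only of lowercase letters and digits with no inner hyphens; the specification only requires lower case, so the exact character set is a guess. The function name `parse_tool_name` is hypothetical.

```python
import re

# Assumption: components contain only lowercase letters and digits.
NAME_RE = re.compile(
    r"scrapd-(?P<type>augmenter|importer)"
    r"-(?P<operation_or_datatype>[a-z0-9]+)"
    r"-(?P<service>[a-z0-9]+)"
)


def parse_tool_name(name):
    """Return the name components as a dict, or None if the name is invalid."""
    match = NAME_RE.fullmatch(name)
    return match.groupdict() if match else None
```

For example, `parse_tool_name("scrapd-importer-dataset-apd")` yields the type `importer`, the data type `dataset`, and the service `apd`, while a name containing upper-case characters is rejected.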