This repository contains the functionality to create and standardize the Global Register of Introduced and Invasive Species - Belgium to a Darwin Core checklist that can be harvested by GBIF.
This unified checklist is the result of an open and reproducible data publication and data processing pipeline developed for the TrIAS project. The data publication pipeline is based on the Checklist recipe and consists of the publication of a selection of authoritative (inter)national checklists as standardized Darwin Core Archives to GBIF. These are:
- Verloove et al. (2018) based on Verloove (2018) for plants
- Boets et al. (2018) based on Boets et al. (2016) for macroinvertebrates
- Verreycken et al. (2018a) based on Verreycken et al. (2018b) for fishes
- Vanderweyen et al. (2018) based on Vanderweyen & Fraiture (2007, 2008, 2011) for rust fungi
- Reyserhove et al. (2018) for various species
- Zieritz et al. (2018) based on Zieritz et al. (2017) for pathways.
Predominantly, these checklists record the presence of alien species in Belgium for a specific taxon group or habitat and are maintained by their respective authors. The data processing consists of the extraction of all Belgian non-native taxa from these checklists and the unification of their taxonomy (using the GBIF Backbone Taxonomy) and related information. This automated process is implemented and documented at https://trias-project.github.io/unified-checklist/
See https://trias-project.github.io/unified-checklist/
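The unification step described above can be illustrated with a minimal base R sketch. It assumes every source record has already been matched to a GBIF Backbone Taxonomy key; the lookup itself is left out. The data frame, its column names (`checklist`, `scientificName`, `backboneKey`) and the helper `unify_taxa()` are hypothetical and are not taken from the pipeline's scripts.

```r
# Hypothetical toy records (made-up names and keys): each row is a taxon as
# listed in one source checklist, with the backbone key it was matched to.
taxa <- data.frame(
  checklist = c("plants", "plants", "fishes", "various"),
  scientificName = c("Taxon a", "Taxon b", "Taxon c", "Taxon a sensu lato"),
  backboneKey = c(101L, 102L, 201L, 101L),
  stringsAsFactors = FALSE
)

# Collapse all records that share a backbone key into one unified taxon,
# keeping track of the source checklists and name variants that refer to it.
unify_taxa <- function(taxa) {
  collapse <- function(x) paste(sort(unique(x)), collapse = " | ")
  aggregate(
    taxa[c("checklist", "scientificName")],
    by = list(backboneKey = taxa$backboneKey),
    FUN = collapse
  )
}

unified <- unify_taxa(taxa)
print(unified)
```

In this toy input, two checklists list the same taxon under different names, so the four source records collapse into three unified taxa.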
The repository structure is based on Cookiecutter Data Science and the Checklist recipe. Files and directories indicated with `GENERATED` should not be edited manually.
├── README.md                   : Description of this repository
├── LICENSE                     : Repository license
├── unified-checklist.Rproj     : RStudio project file
├── .gitignore                  : Files and directories to be ignored by git
│
├── data
│   ├── raw                     : Source data as downloaded from GBIF GENERATED
│   ├── interim                 : Unified data GENERATED
│   └── processed               : Darwin Core output of mapping script GENERATED
│
├── references
│   └── verification.tsv        : Verification file (for synonyms). Generated by
│                                 3_verify_taxa.Rmd and then manually annotated
│
├── docs                        : Repository website GENERATED
│
├── index.Rmd                   : Website homepage
├── _bookdown.yml               : Settings to build website in docs/
│
└── src
    ├── 1_get_taxa.Rmd          : Script to get taxa from checklists
    ├── 2_get_information.Rmd   : Script to get related information
    ├── 3_verify_taxa.Rmd       : Script to verify taxa
    ├── 4_unify_taxa.Rmd        : Script to unify taxa
    ├── 5_unify_information.Rmd : Script to unify related information
    ├── 6_dwc_mapping.Rmd       : Script to map to Darwin Core
    └── 7_griis_mapping.Rmd     : Script to create the Excel file for GRIIS
- Clone this repository to your computer
- Open the RStudio project file
- Open the `index.Rmd` R Markdown file in RStudio
- Install any required packages
- Click `Build > Build Book` to generate the processed data and build the website in `docs/`
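The package check and the build can also be run from the R console. The following is a minimal sketch, not the project's documented procedure: `missing_packages()` is a hypothetical helper, and the package list is an assumption (only `bookdown` is implied by `_bookdown.yml`; the scripts in `src` will need more).

```r
# Return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumed package list; extend it with whatever the scripts in src/ load.
needed <- c("bookdown", "rmarkdown")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)
}

# With all packages installed and the working directory at the repository
# root, this is the console counterpart of Build > Build Book:
# bookdown::render_book("index.Rmd")
```

The install and build calls are commented out so that the snippet can be run safely outside the repository.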
To publish an update of the dataset:
- Open the resource in the IPT (login required)
- Source data: upload the newly generated data files from `data/processed`
- Darwin Core mappings: does not require updates, unless terms were added/removed in the pipeline
- Metadata: does not require updates, except for:
  - Basic metadata: in description, check if number of taxa (2,500+) still applies
  - Taxonomic coverage: in description, update numbers per kingdom based on new data
  - Temporal coverage: update `End date` if need be
- Publish: click `Publish`, add a short description and publish
- Check if dataset is updated at GBIF (can take a couple of hours)
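Before uploading to the IPT, it can help to confirm that the files in `data/processed` were actually regenerated by the latest build. The helper below is a hypothetical sketch: it makes no assumption about file names and only lists what is in the directory, with size and modification time.

```r
# List the entries in the output directory with size and last modification
# time, so stale files from an earlier build are easy to spot.
list_outputs <- function(dir = "data/processed") {
  files <- list.files(dir, full.names = TRUE)
  info <- file.info(files)
  data.frame(
    file = basename(files),
    size_bytes = info$size,
    modified = info$mtime,
    stringsAsFactors = FALSE
  )
}

# From the repository root:
# list_outputs()
```

Modification times that predate the last `Build Book` run suggest that the build did not rewrite those files.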
MIT License for the code and documentation in this repository. The included data is released under another license.