Merge pull request #61 from AgentschapPlantentuinMeise/refactorreadme
Refactorreadme
LynnDelgat authored Aug 12, 2022
2 parents ea799c2 + 2088c1c commit 6f97cc0
Showing 7 changed files with 113 additions and 97 deletions.
106 changes: 10 additions & 96 deletions README.md
@@ -2,114 +2,28 @@

Repository for the MIDSCalculator app, written mostly by @LynnDelgat during an internship at Meise Botanic Garden.


## Minimum Information about a Digital Specimen

Minimum Information about a Digital Specimen (MIDS) is a data standard which aims to describe four different levels of digitization status for biological and geological specimens from natural history collections, each level requiring certain information elements (referred to as MIDS elements from here on) to be digitally available. With this standard, users of collection data, as well as funders and collection curators will have a better overview of how comprehensively digitized a collection or any other set of specimens really is. The standard is currently still under development by a Task Group of the global Biodiversity Information Standards (“TDWG”) organization. While MIDS elements for digitization levels 0 and 1 have been specified, MIDS elements for levels 2 and 3 are still under discussion.
For more details: https://github.com/tdwg/mids


## MIDSCalculator App
### Introduction
The MIDSCalculator is a Shiny app which allows users to calculate MIDS scores for each record in a submitted dataset, and explore the results.

#### Mapping MIDS elements: use of a JSON schema

As the MIDS standard is agnostic to the data model of the sourced data, the app uses a JSON schema to map MIDS elements to properties of the sourced data. A schema mapping to GBIF annotated DwC data is included in the app. Other JSON schemas can be uploaded. In addition the schema is editable through the app’s interface.

### Installation

The latest self-contained Windows installer (generated using Inno Setup) can be found here: https://drive.google.com/drive/folders/1ioRhHIvdYI88yoPTsYLG_k5-8n-05CiP

### Submit data

On this page a zipped GBIF annotated Darwin Core Archive or a comma or tab separated occurrence file can be uploaded (max 5GB). In addition, the MIDS implementation can be specified and viewed. To specify the MIDS implementation you can either choose the default schema (included in the app) or upload a schema from file. It is also possible to choose to edit this schema interactively. The interactive editing opens in a pop-up window, where in a first tab, MIDS elements can be added, removed, or moved to another MIDS level. In addition, mappings can be removed or added by clicking the "edit" icon of a MIDS element. In a second tab, the Unknown or Missing section of the schema can be edited, i.e. new properties and new values can be added. This interactively edited schema can be saved to file (JSON). The schema (be it default, custom or interactive) can be viewed by clicking the eye icon, which opens a human-friendly visualization of the MIDS schema, so that it is not necessary to read the JSON file to be able to understand the specifics of the MIDS schema used. Once a dataset and a MIDS implementation have been chosen, calculations can be started by clicking "Start MIDS score calculations".

### Results

The results of each analysis are visualized on a new page, where it is possible to explore summaries of the results of both MIDS levels and MIDS elements, either as plots or as tables. The MIDS element plot can be clicked to get more details on the results of the mappings of that element. It is also possible to explore the complete records table with the MIDS results for each record, and to download it as a CSV file. In addition, the data can be filtered to see how MIDS results change when filtering on properties such as country code, taxonomic group, or collection date. The filename of the dataset is shown, as well as the MIDS implementation used, to make the provenance of the calculations clear.


The MIDSCalculator is a Shiny app which allows users to calculate MIDS scores for each record in a submitted dataset, and explore the results.

## Code
Code can be found in the /src folder.
* [How to install](/help/howtoinstall.md)
* [Using the app](/help/howtouse.md)
* [Additional info about the code](/help/codeinfo.md)
* [Create a new installer](/help/rinno_installer.md).

### Two scripts to calculate MIDS levels based on a given JSON schema
parse_json_schema.R
* 2 functions:
* read_json_mids_criteria(file): uses the MIDS sections of the JSON schema and returns criteria per MIDS level
* read_json_unknownOrMissing(file): uses the unknownOrMissing section of the JSON schema and returns a list of these values
* no need to run this separately, is loaded in MIDS-calc.R
## Mapping MIDS elements: use of a JSON schema

MIDS-calc.R
* given a JSON schema and a dataset, this script calculates for each record which criteria are met and the MIDS level

### Shiny app
Code for the Shiny app can be found in the /src/Shiny_MIDS folder.

MIDScalcApp.R
* main code for the app
As the MIDS standard is agnostic to the data model of the sourced data, the app uses a JSON schema to map MIDS elements to properties of the sourced data. A schema mapping to GBIF annotated DwC data is included in the app. Other JSON schemas can be uploaded. In addition the schema is editable through the app’s interface.

/src/Shiny_MIDS/R folder contains 4 modules:
* CloseTabModule.R
* allows to close results tabs
* InteractiveSchemaModule.R
* allows to edit the MIDS implementation interactively
* ResultsModule.R
* calculates MIDS levels and which criteria are met
* opens a tab showing the results for each analysis
* allows to export results to csv
* ViewImplementationModule.R
* allows to visualize the MIDS implementation
* [Documentation on the JSON schema](/help/jsonschema.md)

## Data
### Datasets
* GBIF Occurrence Download [10.15468/dl.e8jnan](http://doi.org/10.15468/dl.e8jnan) can be found as a zip file in the data folder for quick testing.
* A sample dataset, GBIF Occurrence Download [10.15468/dl.e8jnan](http://doi.org/10.15468/dl.e8jnan), is included as a zip file in the data folder for quick testing.

### Schemas
* [DwC-GBIF_schema.json](https://github.com/AgentschapPlantentuinMeise/MIDSCalculator/blob/main/data/schemas/DwC-GBIF_schema.json): This schema is based on the most recent MIDS specification for levels 0 and 1. As levels 2 and 3 are still under discussion, the schema offers a basic interpretation of several potential properties.

## Session info
Code was run with the following packages and versions:

```
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_Belgium.1252 LC_CTYPE=English_Belgium.1252
[3] LC_MONETARY=English_Belgium.1252 LC_NUMERIC=C
[5] LC_TIME=English_Belgium.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] shinybusy_0.3.1 RColorBrewer_1.1-3 jsonlite_1.8.0 magrittr_2.0.1 purrr_0.3.4
[6] data.table_1.14.2 dplyr_1.0.8 sortable_0.4.5 shinyjs_2.1.0 DT_0.23
[11] ggplot2_3.3.5 shinyBS_0.61.1 shiny_1.7.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 assertthat_0.2.1 rprojroot_2.0.3 digest_0.6.29 utf8_1.2.2
[6] mime_0.12 R6_2.5.1 evaluate_0.15 pillar_1.6.4 rlang_1.0.2
[11] rstudioapi_0.13 fontawesome_0.3.0 jquerylib_0.1.4 learnr_0.10.1 rmarkdown_2.14
[16] labeling_0.4.2 stringr_1.4.0 htmlwidgets_1.5.4 munsell_0.5.0 compiler_4.0.3
[21] httpuv_1.6.5 xfun_0.29 pkgconfig_2.0.3 htmltools_0.5.2 sourcetools_0.1.7
[26] tidyselect_1.1.2 tibble_3.1.6 fansi_1.0.3 crayon_1.5.1 withr_2.5.0
[31] later_1.3.0 grid_4.0.3 xtable_1.8-4 gtable_0.3.0 lifecycle_1.0.1
[36] DBI_1.1.3 scales_1.2.0 cli_3.2.0 stringi_1.7.6 cachem_1.0.6
[41] farver_2.1.0 promises_1.2.0.1 bslib_0.4.0 ellipsis_0.3.2 generics_0.1.3
[46] vctrs_0.4.1 tools_4.0.3 glue_1.6.0 markdown_1.1 crosstalk_1.2.0
[51] rsconnect_0.8.27 fastmap_1.1.0 yaml_2.2.1 colorspace_2.0-3 memoise_2.0.1
[56] knitr_1.39 sass_0.4.1
```

## Installer creation using RInno
* Follow instructions on https://github.com/ficonsulting/RInno
* If your Windows version (64-bit) is not supported, install with installr following the instructions at https://github.com/ficonsulting/RInno/issues/118#issuecomment-460094226
* To create the installer, run rinno_installer.R.
* For R versions > 3, you have to edit two functions following https://github.com/ficonsulting/RInno/issues/152#issuecomment-681009752 or install the fork from github: brandonerose/RInno
* [DwC-GBIF_schema.json](/data/schemas/DwC-GBIF_schema.json): This schema is used as the default by the app and is based on the most recent MIDS specification for levels 0 and 1. As levels 2 and 3 are still under discussion, the schema offers a basic interpretation of several potential properties.
38 changes: 38 additions & 0 deletions help/codeinfo.md
@@ -0,0 +1,38 @@
# Code
Different scripts can be found in the /src folder.

### Two scripts to calculate MIDS levels based on a given JSON schema
`parse_json_schema.R`
* 2 functions:
  * `read_json_mids_criteria(file)`: uses the MIDS sections of the JSON schema and returns criteria per MIDS level
  * `read_json_unknownOrMissing(file)`: uses the unknownOrMissing section of the JSON schema and returns a list of these values
* no need to run this separately; it is loaded in `MIDS-calc.R`

`MIDS-calc.R`
* given a JSON schema and a dataset, this script calculates for each record which criteria are met and the MIDS level
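
As an illustration of the last step, the sketch below derives a record's MIDS level from per-element results. It is written in Python for brevity and is not the code of `MIDS-calc.R`. It assumes that levels are cumulative (a record reaches level n only if every element of levels 0 to n is met) and that a record failing level 0 is reported as -1; the function name `mids_level` and the element names are hypothetical.

```python
def mids_level(criteria_met):
    """Return the highest MIDS level reached by one record.

    criteria_met maps a level number to a dict of element name -> bool.
    Assumption: levels are cumulative, so evaluation stops at the first
    level that has an unmet element. Returns -1 if level 0 is not met.
    """
    level = -1
    for n in sorted(criteria_met):
        if all(criteria_met[n].values()):
            level = n
        else:
            break
    return level


# Hypothetical per-element results for a single record.
record_results = {
    0: {"PhysicalSpecimenId": True, "Institution": True},
    1: {"Name": True, "Modified": False},
    2: {"Location": True},
}

# Level 2 is met on its own, but level 1 is not, so the record stays at 0.
print(mids_level(record_results))
```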

### Package checker
`packages.R`
* required packages (excluding imported packages) for the app. These will be auto-installed when running the app if not present and are also used to build the installer.

### Installer generator
`rinno_installer.R`
* this script is not used by the app. It is run separately to generate an installer using RInno setup. [More info](/help/rinno_installer.md).

### Shiny app
Code for the Shiny app can be found in the /src/Shiny_MIDS folder.

`MIDScalcApp.R`
* main code for the app

The /src/Shiny_MIDS/R folder contains 4 modules:
* `CloseTabModule.R`
  * closes results tabs
* `InteractiveSchemaModule.R`
  * enables interactive editing of the MIDS implementation
* `ResultsModule.R`
  * calculates MIDS levels and which criteria are met
  * opens a tab showing the results for each analysis
  * exports results to CSV
* `ViewImplementationModule.R`
  * visualizes the MIDS implementation
6 changes: 6 additions & 0 deletions help/howtoinstall.md
@@ -0,0 +1,6 @@
# Installation
The latest self-contained Windows installer (generated using Inno Setup) can be found here: https://drive.google.com/drive/folders/1ioRhHIvdYI88yoPTsYLG_k5-8n-05CiP. Note that this installer will install a (more recent) version of R if it is not already available on your system. The app currently works with R 4.2.0. It may function with older versions of R, but this cannot be guaranteed.

Alternatively, you can download (or clone) the contents of this repository and run the code yourself using your local R instance. Running the `app.R` file should launch the app in your browser. Required R packages that are not yet present should be installed automatically during the first launch of the app.

Because calculating scores for millions of specimens can require a large amount of memory, a web-hosted version of this app is currently not available.
12 changes: 12 additions & 0 deletions help/howtouse.md
@@ -0,0 +1,12 @@
# Using the app

### Submit data

On the app's initial interface, you can upload a zipped GBIF-annotated Darwin Core Archive or a comma- or tab-separated occurrence file (max 5 GB). When the app is used locally, this simply means the dataset is loaded into your local memory.

The same interface shows the MIDS implementation to be used, as specified by the JSON schema. The schema lists all MIDS elements and how they are mapped to properties in the uploaded dataset. You can either choose the default schema included in the app or upload a custom schema from file. You can also edit the schema interactively in the app. The interactive editor opens in a pop-up window with two tabs. In the first tab, MIDS elements can be added, removed, or moved to another MIDS level, and mappings can be added or removed by clicking the "edit" icon of a MIDS element. In the second tab, the Unknown or Missing section of the schema can be edited, i.e. new properties and new values can be added. The interactively edited schema can be saved to file (JSON). Any schema (default, custom or interactive) can be viewed by clicking the eye icon, which opens a human-friendly visualization of the MIDS schema, so you do not need to read the JSON file to understand the specifics of the schema used. Once a dataset and a MIDS implementation have been chosen, start the calculations by clicking "Start MIDS score calculations".

### Results

The results of each analysis are visualized on a new page, where you can explore summaries of the results for both MIDS levels and MIDS elements, either as plots or as tables. Clicking the MIDS element plot gives more details on the results of the mappings of that element. You can also explore the complete records table with the MIDS results for each record and download it as a CSV file. In addition, the data can be filtered to see how MIDS results change when filtering on properties such as country code, taxonomic group, or collection date. The filename of the dataset and the MIDS implementation used are shown to make the provenance of the calculations clear.

42 changes: 42 additions & 0 deletions help/jsonschema.md
@@ -0,0 +1,42 @@
# Structure of the JSON schema

### Metadata:

* `schemaName`: A label for the schema, no use outside of human readability.
* `schemaVersion`: Simple version tracker for when a schema receives an update.
* `date`: Date (and optionally time) for when this schema version was created. More reliable versioning property than schemaVersion.
* `schemaType`: Should always be `MIDScalculator`.
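
A quick sanity check of these metadata fields might look like the sketch below. It is a Python illustration, not part of the app, and it assumes that the four properties sit at the top level of the JSON document and that all of them are expected to be present. The function name `check_metadata` and the example values are hypothetical.

```python
import json

# Metadata properties described above (assumed to be top-level keys).
REQUIRED_METADATA = ("schemaName", "schemaVersion", "date", "schemaType")


def check_metadata(schema):
    """Return a list of problems found in the schema metadata (empty if none)."""
    problems = [f"missing {key}" for key in REQUIRED_METADATA if key not in schema]
    # schemaType should always be "MIDScalculator" when it is present.
    if schema.get("schemaType") not in (None, "MIDScalculator"):
        problems.append("schemaType should be MIDScalculator")
    return problems


# Hypothetical metadata values, for illustration only.
schema = json.loads(
    '{"schemaName": "Example", "schemaVersion": "0.1",'
    ' "date": "2022-08-12", "schemaType": "MIDScalculator"}'
)
print(check_metadata(schema))
```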

### unknownOrMissing

This section lists values for properties that are to be understood as absence, rather than presence, of data. That is, these are known values that indicate absence of data.

If no `property` is listed for a value, the value is applied to any property mapped to a MIDS element. If a property is listed, the value is only ignored for that property.

A `midsAchieved` flag is always present and typically `false`, to indicate this value is to be understood as absence. The schema parser does support negation, that is, a MIDS level that requires a certain property to be absent of data. This is implemented by specifying the `operator` as `NOT`. To ignore certain values for such a mapping, the `midsAchieved` flag would be `true`. Note that negation is currently not part of the default schema and also not supported in the interactive schema editor in the app (only by manually editing the JSON).
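
The rules above can be sketched as follows. This is a Python illustration rather than the app's R parser, and it assumes that each entry carries a `value` key next to the optional `property` and the `midsAchieved` flag; the key name `value`, the function name `is_missing`, and the sample entries are assumptions made for this example. Entries with `midsAchieved` set to `true` only matter for `NOT` mappings and are skipped here.

```python
def is_missing(prop, value, unknown_or_missing):
    """Return True if `value` should be treated as absent data for `prop`."""
    # Empty values always count as missing.
    if value is None or str(value).strip() == "":
        return True
    for entry in unknown_or_missing:
        if entry.get("midsAchieved", False):
            continue  # only relevant for NOT mappings, not handled here
        # An entry without a property applies to every mapped property.
        applies = "property" not in entry or entry["property"] == prop
        if applies and str(value) == entry["value"]:
            return True
    return False


# Hypothetical unknownOrMissing entries.
unknown_or_missing = [
    {"value": "unknown", "midsAchieved": False},
    {"value": "0", "property": "decimalLatitude", "midsAchieved": False},
]

print(is_missing("country", "unknown", unknown_or_missing))      # True
print(is_missing("decimalLatitude", "0", unknown_or_missing))    # True
print(is_missing("individualCount", "0", unknown_or_missing))    # False
```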

### MIDS levels

The four levels have their own section each. Each level should have at least one element as a condition, but can otherwise have any number of elements.

Within an element, the mappings for this element are listed. Mappings to multiple properties in the data source are possible, and requirements of `OR` or `AND` can be set to specify the logic of the mapping. The operator `NOT` is also supported, but currently not part of the schema (anymore) and it is not implemented in the interface that enables interactive schema editing. Example of a more complex mapping for a Location term (note that this particular mapping is a suggestion, not currently agreed in the latest official MIDS draft):

```
"Location": [
    {
        "property": [
            "decimalLatitude",
            "decimalLongitude"
        ],
        "operator": "AND"
    },
    {
        "property": [
            "locality",
            "county",
            "verbatimLocality"
        ],
        "operator": "OR"
    }
]
```
```
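
To make the logic of this mapping concrete, here is a minimal Python sketch of how it could be evaluated for a single record. It is not the app's R implementation. It assumes that an element counts as met when at least one of its mappings is met, that `AND` requires every listed property to have a value, and that `OR` requires at least one. The function names and the sample records are hypothetical, and `NOT` is not covered.

```python
def has_value(record, prop):
    """True if the property is present in the record and not empty."""
    value = record.get(prop)
    return value is not None and str(value).strip() != ""


def mapping_met(record, mapping):
    """Evaluate one mapping: AND needs every property, OR needs at least one."""
    present = [has_value(record, p) for p in mapping["property"]]
    if mapping.get("operator") == "AND":
        return all(present)
    return any(present)


def element_met(record, mappings):
    """Assumption: an element is met when any one of its mappings is met."""
    return any(mapping_met(record, m) for m in mappings)


location = [
    {"property": ["decimalLatitude", "decimalLongitude"], "operator": "AND"},
    {"property": ["locality", "county", "verbatimLocality"], "operator": "OR"},
]

coords_only = {"decimalLatitude": "50.93", "decimalLongitude": "4.33"}
half_coords = {"decimalLatitude": "50.93"}
locality_only = {"locality": "Meise"}

print(element_met(coords_only, location))    # True
print(element_met(half_coords, location))    # False
print(element_met(locality_only, location))  # True
```

Under these assumptions, a record with only one coordinate does not satisfy Location unless it also has one of the textual locality properties.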
4 changes: 4 additions & 0 deletions help/rinno_installer.md
@@ -0,0 +1,4 @@
# Installer creation using RInno
* Follow instructions on https://github.com/ficonsulting/RInno to install the RInno package and the RInno Setup software.
* If your Windows version (64-bit) is not supported, install with installr following the instructions at https://github.com/ficonsulting/RInno/issues/118#issuecomment-460094226
* To create the installer, run [rinno_installer.R](/src/rinno_installer.R).
2 changes: 1 addition & 1 deletion infoafter.txt
@@ -1 +1 @@
You have now successfully installed the MIDSCalculator app. Running the executable file should launch the app in your Chrome browser. If you encounter any problems, you can find support at https://github.com/AgentschapPlantentuinMeise/MIDSCalculator/issues .
You have now successfully installed the MIDSCalculator app. Running the executable file should launch the app in your browser. If you encounter any problems, you can find support at https://github.com/AgentschapPlantentuinMeise/MIDSCalculator/issues .
