diff --git a/README.md b/README.md index ff7cc56..21b05d9 100644 --- a/README.md +++ b/README.md @@ -2,114 +2,28 @@ Repository for the MIDSCalculator app, written mostly by @LynnDelgat during an internship at Meise Botanic Garden. - ## Minimum Information about a Digital Specimen Minimum Information about a Digital Specimen (MIDS) is a data standard which aims to describe four different levels of digitization status for biological and geological specimens from natural history collections, each level requiring certain information elements (referred to as MIDS elements from here on) to be digitally available. With this standard, users of collection data, as well as funders and collection curators will have a better overview of how comprehensively digitized a collection or any other set of specimens really is. The standard is currently still under development by a Task Group of the global Biodiversity Information Standards (“TDWG”) organization. While MIDS elements for digitization levels 0 and 1 have been specified, MIDS elements for levels 2 and 3 are still under discussion. For more details: https://github.com/tdwg/mids - ## MIDSCalculator App -### Introduction -The MIDSCalculator is a Shiny app which allows users to calculate MIDS scores for each record in a submitted dataset, and explore the results. - -#### Mapping MIDS elements: use of a JSON schema - -As the MIDS standard is agnostic to the data model of the sourced data, the app uses a JSON schema to map MIDS elements to properties of the sourced data. A schema mapping to GBIF annotated DwC data is included in the app. Other JSON schemas can be uploaded. In addition the schema is editable through the app’s interface. 
- -### Installation - -The latest self-contained windows installer (generated using Inno Setup) can be found here: https://drive.google.com/drive/folders/1ioRhHIvdYI88yoPTsYLG_k5-8n-05CiP - -### Submit data - - On this page a zipped GBIF annotated Darwin Core Archive or a comma or tab separated occurrence file can be uploaded (max 5GB). In addition, the MIDS implementation can be specified and viewed. To specify the MIDS implementation you can either choose the default schema (included in the app) or upload a schema from file. It is also possible to choose to edit this schema interactively. The interactive editing opens in a pop-up window, where in a first tab, MIDS elements can be added, removed, or moved to another MIDS level. In addition, mappings can be removed or added by clicking the "edit" icon of a MIDS element. In a second tab, the Unknown or Missing section of the schema can be edited, i.e. new properties and new values can be added. This interactively edited schema can be saved to file (JSON). The schema (be it default, custom or interactive) can be viewed by clicking the eye icon, which opens a human-friendly visualization of the MIDS schema, so that it is not necessary to read the JSON file to be able to understand the specifics of the MIDS schema used. Once a dataset and a MIDS implementation have been chosen, calculations can be started by clicking "Start MIDS score calculations". - -### Results - -The results of each analysis are visualized on a new page, where it is possible to explore summaries of the results of both MIDS levels and MIDS elements, either as plots or as tables. The MIDS element plot can be clicked to get more details on the results of the mappings of that element. It is also possible to explore the complete records table with the MIDS results for each record, and to download it as a csv file. 
In addition, the data can be filtered to see how MIDS results change when filtering on properties such as country code /taxonomic group/ collection date. The filename of the dataset is shown, as well as the used MIDS implementation, to make the provenance of the calculations clear.' - - +The MIDSCalculator is a Shiny app which allows users to calculate MIDS scores for each record in a submitted dataset, and explore the results. -## Code -Code can be found in the /src folder. +* [How to install](/help/howtoinstall.md) +* [Using the app](/help/howtouse.md) +* [Additional info about the code](/help/codeinfo.md) +* [Create a new installer](/help/rinno_installer.md). -### Two scripts to calculate MIDS levels based on a a given JSON schema -parse_json_schema.R -* 2 functions: - * read_json_mids_criteria(file): uses the MIDS sections of the JSON schema and returns criteria per MIDS level - * read_json_unknownOrMissing(file): uses the unknownOrMissing section of the JSON schema and returns a list of these values -* no need to run this separately, is loaded in MIDS-calc.R +## Mapping MIDS elements: use of a JSON schema -MIDS-calc.R -* given a JSON schema and a dataset, this script calculates for each record which criteria are met and the MIDS level - -### Shiny app -Code for the Shiny app can be found in the /src/Shiny_MIDS folder. - -MIDScalcApp.R -* main code for the app +As the MIDS standard is agnostic to the data model of the sourced data, the app uses a JSON schema to map MIDS elements to properties of the sourced data. A schema mapping to GBIF annotated DwC data is included in the app. Other JSON schemas can be uploaded. In addition the schema is editable through the app’s interface. 
-/src/Shiny_MIDS/R folder contains 4 modules: - * CloseTabModule.R - * allows to close results tabs - * InteractiveSchemaModule.R - * allows to edit the MIDS implementation interactively - * ResultsModule.R - * calculates MIDS levels and which criteria are met - * opens a tab showing the results for each analysis - * allows to export results to csv - * ViewImplementationModule.R - * allows to visualize the MIDS implementation +* [Documentation on the JSON schema](/help/jsonschema.md) ## Data ### Datasets -* GBIF Occurrence Download [10.15468/dl.e8jnan](http://doi.org/10.15468/dl.e8jnan) can be found as a zip file in the data folder for quick testing. +* A sample dataset, GBIF Occurrence Download [10.15468/dl.e8jnan](http://doi.org/10.15468/dl.e8jnan), can be found as a zip file in the data folder for quick testing. ### Schemas -* [DwC-GBIF_schema.json](https://github.com/AgentschapPlantentuinMeise/MIDSCalculator/blob/main/data/schemas/DwC-GBIF_schema.json): This schema is based on the most recent MIDS specification for levels 0 and 1. As levels 2 and 3 are still under discussion, the schema offers a basic interpretation of several potential properties. 
- -## Session info -Code was run with the following packages and versions: - -``` -> sessionInfo() -R version 4.0.3 (2020-10-10) -Platform: x86_64-w64-mingw32/x64 (64-bit) -Running under: Windows 10 x64 (build 19044) - -Matrix products: default - -locale: -[1] LC_COLLATE=English_Belgium.1252 LC_CTYPE=English_Belgium.1252 -[3] LC_MONETARY=English_Belgium.1252 LC_NUMERIC=C -[5] LC_TIME=English_Belgium.1252 - -attached base packages: -[1] stats graphics grDevices utils datasets methods base - -other attached packages: - [1] shinybusy_0.3.1 RColorBrewer_1.1-3 jsonlite_1.8.0 magrittr_2.0.1 purrr_0.3.4 - [6] data.table_1.14.2 dplyr_1.0.8 sortable_0.4.5 shinyjs_2.1.0 DT_0.23 -[11] ggplot2_3.3.5 shinyBS_0.61.1 shiny_1.7.2 - -loaded via a namespace (and not attached): - [1] Rcpp_1.0.8.3 assertthat_0.2.1 rprojroot_2.0.3 digest_0.6.29 utf8_1.2.2 - [6] mime_0.12 R6_2.5.1 evaluate_0.15 pillar_1.6.4 rlang_1.0.2 -[11] rstudioapi_0.13 fontawesome_0.3.0 jquerylib_0.1.4 learnr_0.10.1 rmarkdown_2.14 -[16] labeling_0.4.2 stringr_1.4.0 htmlwidgets_1.5.4 munsell_0.5.0 compiler_4.0.3 -[21] httpuv_1.6.5 xfun_0.29 pkgconfig_2.0.3 htmltools_0.5.2 sourcetools_0.1.7 -[26] tidyselect_1.1.2 tibble_3.1.6 fansi_1.0.3 crayon_1.5.1 withr_2.5.0 -[31] later_1.3.0 grid_4.0.3 xtable_1.8-4 gtable_0.3.0 lifecycle_1.0.1 -[36] DBI_1.1.3 scales_1.2.0 cli_3.2.0 stringi_1.7.6 cachem_1.0.6 -[41] farver_2.1.0 promises_1.2.0.1 bslib_0.4.0 ellipsis_0.3.2 generics_0.1.3 -[46] vctrs_0.4.1 tools_4.0.3 glue_1.6.0 markdown_1.1 crosstalk_1.2.0 -[51] rsconnect_0.8.27 fastmap_1.1.0 yaml_2.2.1 colorspace_2.0-3 memoise_2.0.1 -[56] knitr_1.39 sass_0.4.1 -``` - -## Installer creation using RInno -* Follow instructions on https://github.com/ficonsulting/RInno - * If Windows version not supported (64bit), install with installr following instructions here https://github.com/ficonsulting/RInno/issues/118#issuecomment-460094226 -* To create the installer, run rinno_installer.R. 
- * For R versions > 3, you have to edit two functions following https://github.com/ficonsulting/RInno/issues/152#issuecomment-681009752 or install the fork from github: brandonerose/RInno \ No newline at end of file +* [DwC-GBIF_schema.json](/data/schemas/DwC-GBIF_schema.json): This schema is used as the default by the app and is based on the most recent MIDS specification for levels 0 and 1. As levels 2 and 3 are still under discussion, the schema offers a basic interpretation of several potential properties. \ No newline at end of file diff --git a/help/codeinfo.md b/help/codeinfo.md new file mode 100644 index 0000000..f4e7e1a --- /dev/null +++ b/help/codeinfo.md @@ -0,0 +1,38 @@ +# Code +Different scripts can be found in the /src folder. + +### Two scripts to calculate MIDS levels based on a given JSON schema +`parse_json_schema.R` +* 2 functions: + * read_json_mids_criteria(file): uses the MIDS sections of the JSON schema and returns criteria per MIDS level + * read_json_unknownOrMissing(file): uses the unknownOrMissing section of the JSON schema and returns a list of these values +* no need to run this separately; it is loaded in MIDS-calc.R + +`MIDS-calc.R` +* given a JSON schema and a dataset, this script calculates for each record which criteria are met and the MIDS level + +### Package checker +`packages.R` +* lists the packages required by the app (excluding imported packages). These will be auto-installed when running the app if not present and are also used to build the installer. + +### Installer generator +`rinno_installer.R` +* this script is not used by the app. It is run separately to generate an installer using RInno setup. [More info](/help/rinno_installer.md). + +### Shiny app +Code for the Shiny app can be found in the /src/Shiny_MIDS folder. 
+ +`MIDScalcApp.R` +* main code for the app + +The /src/Shiny_MIDS/R folder contains 4 modules: + * `CloseTabModule.R` + * allows closing results tabs + * `InteractiveSchemaModule.R` + * allows editing the MIDS implementation interactively + * `ResultsModule.R` + * calculates MIDS levels and which criteria are met + * opens a tab showing the results for each analysis + * allows exporting results to CSV + * `ViewImplementationModule.R` + * allows visualizing the MIDS implementation diff --git a/help/howtoinstall.md b/help/howtoinstall.md new file mode 100644 index 0000000..d3463e4 --- /dev/null +++ b/help/howtoinstall.md @@ -0,0 +1,6 @@ +# Installation +The latest self-contained Windows installer (generated using Inno Setup) can be found here: https://drive.google.com/drive/folders/1ioRhHIvdYI88yoPTsYLG_k5-8n-05CiP Note that this installer will install a (more recent) version of R if it is not available on your system already. The app currently works with R 4.2.0. It may function with older versions of R, but this cannot be guaranteed. + +Alternatively, you can download (or clone) the contents of this repository and run the code yourself using your local R instance. Running the `app.R` file should launch the app in your browser. Required R packages should automatically be installed during the first launch of the app if they are not present already. + +Because calculating scores for millions of specimens can require a large amount of memory, a web-hosted version of this app is currently not available. \ No newline at end of file diff --git a/help/howtouse.md b/help/howtouse.md new file mode 100644 index 0000000..149db53 --- /dev/null +++ b/help/howtouse.md @@ -0,0 +1,12 @@ +# Using the app + +### Submit data + +On the app's initial interface, a zipped GBIF annotated Darwin Core Archive or a comma- or tab-separated occurrence file can be uploaded (max 5 GB). During local use of the app, this simply means the dataset will be loaded into your local memory. 
+ +Also on the interface, the MIDS implementation to be used can be viewed, as specified by the JSON schema. The schema lists all MIDS elements and how they are mapped to properties in the uploaded dataset. You can either choose the default schema included in the app or upload a custom schema from file. It is also possible to edit this schema interactively in the app. The interactive editor opens in a pop-up window. In a first tab, MIDS elements can be added, removed, or moved to another MIDS level. In addition, mappings can be removed or added by clicking the "edit" icon of a MIDS element. In a second tab, the Unknown or Missing section of the schema can be edited, i.e. new properties and new values can be added. This interactively edited schema can be saved to file (JSON). The schema (be it default, custom or interactive) can be viewed by clicking the eye icon, which opens a human-friendly visualization of the MIDS schema, so that it is not necessary to read the JSON file to understand the specifics of the MIDS schema used. Once a dataset and a MIDS implementation have been chosen, calculations can be started by clicking "Start MIDS score calculations". + +### Results + +The results of each analysis are visualized on a new page, where it is possible to explore summaries of the results of both MIDS levels and MIDS elements, either as plots or as tables. The MIDS element plot can be clicked to get more details on the results of the mappings of that element. It is also possible to explore the complete records table with the MIDS results for each record, and to download it as a CSV file. In addition, the data can be filtered to see how MIDS results change when filtering on properties such as country code, taxonomic group or collection date. The filename of the dataset is shown, as well as the MIDS implementation used, to make the provenance of the calculations clear. 
+ diff --git a/help/jsonschema.md b/help/jsonschema.md new file mode 100644 index 0000000..7ca8834 --- /dev/null +++ b/help/jsonschema.md @@ -0,0 +1,42 @@ +# Structure of the JSON schema + +### Metadata: + +* `schemaName`: A label for the schema, no use outside of human readability. +* `schemaVersion`: Simple version tracker for when a schema receives an update. +* `date`: Date (and optionally time) for when this schema version was created. More reliable versioning property than schemaVersion. +* `schemaType`: Should always be `MIDScalculator`. + +### unknownOrMissing + +This section lists values for properties that are to be understood as absence, rather than presence, of data. That is, these are known values that indicate absence of data. + +If no `property` is listed for a value, the value is applied to any property mapped to a MIDS element. If a property is listed, the value is only ignored for that property. + +A `midsAchieved` flag is always present and typically `false`, to indicate this value is to be understood as absence. The schema parser does support negation, that is, a MIDS level that requires a certain property to be absent of data. This is implemented by specifying the `operator` as `NOT`. To ignore certain values for such a mapping, the `midsAchieved` flag would be `true`. Note that negation is currently not part of the default schema and also not supported in the interactive schema editor in the app (only by manually editing the JSON). + +### MIDS levels + +The four levels have their own section each. Each level should have at least one element as a condition, but can otherwise have any number of elements. + +Within an element, the mappings for this element are listed. Mappings to multiple properties in the data source are possible, and requirements of `OR` or `AND` can be set to specify the logic of the mapping. 
The operator `NOT` is also supported, but it is currently not part of the schema (anymore) and it is not implemented in the interface that enables interactive schema editing. Example of a more complex mapping for a Location term (note that this particular mapping is a suggestion, not currently agreed upon in the latest official MIDS draft): + +``` +"Location": [ + { + "property": [ + "decimalLatitude", + "decimalLongitude" + ], + "operator": "AND" + }, + { + "property": [ + "locality", + "county", + "verbatimLocality" + ], + "operator": "OR" + } +] +``` \ No newline at end of file diff --git a/help/rinno_installer.md b/help/rinno_installer.md new file mode 100644 index 0000000..71b79d8 --- /dev/null +++ b/help/rinno_installer.md @@ -0,0 +1,4 @@ +# Installer creation using RInno +* Follow instructions on https://github.com/ficonsulting/RInno to install the RInno package and the RInno Setup software. + * If your Windows version is not supported (64-bit), install with installr following the instructions here: https://github.com/ficonsulting/RInno/issues/118#issuecomment-460094226 +* To create the installer, run [rinno_installer.R](/src/rinno_installer.R). \ No newline at end of file diff --git a/infoafter.txt b/infoafter.txt index a55da10..800f190 100644 --- a/infoafter.txt +++ b/infoafter.txt @@ -1 +1 @@ -You have now succesfully installed the MIDSCalculator app. Running the executable file should launch the app in your Chrome browser. If you encounter any problems, you can find support at https://github.com/AgentschapPlantentuinMeise/MIDSCalculator/issues . \ No newline at end of file +You have now successfully installed the MIDSCalculator app. Running the executable file should launch the app in your browser. If you encounter any problems, you can find support at https://github.com/AgentschapPlantentuinMeise/MIDSCalculator/issues . \ No newline at end of file
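
Note on the `Location` mapping example added in help/jsonschema.md: the `AND`/`OR` logic can be illustrated with a small sketch. This is not the app's code (the app is written in R); it is a Python illustration under stated assumptions. The helper names (`has_value`, `block_met`, `element_met`), the sample records and the placeholder unknown-or-missing values are all hypothetical. In particular, the documentation does not state how several mapping blocks within one element are combined; the sketch assumes an element is met when at least one block is satisfied. The `NOT` operator is left out.

```python
import json

# The Location mapping from help/jsonschema.md, wrapped in braces to make it valid JSON.
MAPPING_JSON = """
{"Location": [
  {"property": ["decimalLatitude", "decimalLongitude"], "operator": "AND"},
  {"property": ["locality", "county", "verbatimLocality"], "operator": "OR"}
]}
"""

# Assumed placeholder values standing in for the schema's unknownOrMissing section.
UNKNOWN_OR_MISSING = {"", "unknown"}


def has_value(record, prop):
    """True if the property is present and not an unknown-or-missing value."""
    value = record.get(prop)
    return value is not None and str(value).strip().lower() not in UNKNOWN_OR_MISSING


def block_met(record, block):
    """Evaluate one mapping block: AND needs every property, OR needs any."""
    checks = [has_value(record, prop) for prop in block["property"]]
    if block["operator"] == "AND":
        return all(checks)
    if block["operator"] == "OR":
        return any(checks)
    raise ValueError("operator not covered by this sketch: " + block["operator"])


def element_met(record, blocks):
    """Assumption: an element is met if at least one of its blocks is met."""
    return any(block_met(record, block) for block in blocks)


location_blocks = json.loads(MAPPING_JSON)["Location"]

# Hypothetical records keyed by Darwin Core term names.
coords_only = {"decimalLatitude": "50.93", "decimalLongitude": "4.33"}
half_coords = {"decimalLatitude": "50.93", "locality": "unknown"}
locality_only = {"county": "Flemish Brabant"}

results = [element_met(r, location_blocks) for r in (coords_only, half_coords, locality_only)]
print(results)  # [True, False, True]
```

The second record fails both blocks: it lacks a longitude, and its only locality value is one of the assumed unknown-or-missing placeholders.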