From 80281b3ed170f542b3734f62022b86df95f7fcd3 Mon Sep 17 00:00:00 2001 From: Hauke Schulz <43613877+observingClouds@users.noreply.github.com> Date: Sun, 15 Oct 2023 22:04:02 -0700 Subject: [PATCH] create howto section with additional citation recommendations --- how_to_eurec4a/_toc.yml | 3 + how_to_eurec4a/howto.md | 123 +-------------------------- how_to_eurec4a/howto_add_dataset.md | 122 ++++++++++++++++++++++++++ how_to_eurec4a/howto_cite_dataset.md | 6 ++ 4 files changed, 132 insertions(+), 122 deletions(-) create mode 100644 how_to_eurec4a/howto_add_dataset.md create mode 100644 how_to_eurec4a/howto_cite_dataset.md diff --git a/how_to_eurec4a/_toc.yml b/how_to_eurec4a/_toc.yml index 470c96e..775e05a 100644 --- a/how_to_eurec4a/_toc.yml +++ b/how_to_eurec4a/_toc.yml @@ -2,6 +2,9 @@ format: jb-book root: intro chapters: - file: howto + sections: + - file: howto_cite_dataset + - file: howto_add_dataset - file: halo sections: - file: HALO_instrument_overview diff --git a/how_to_eurec4a/howto.md b/how_to_eurec4a/howto.md index 849ed46..24ba4ab 100644 --- a/how_to_eurec4a/howto.md +++ b/how_to_eurec4a/howto.md @@ -1,124 +1,3 @@ # How-To -## How-To add new datasets - -The [EUREC$^4$A-Intake catalog](https://github.com/eurec4a/eurec4a-intake) is the dataset collection of the EUREC$^4$A and ATOMIC field campaign. It is a collection of `yaml`-files that contain references to the dataset storage locations. - -Datasets should be added by following these steps: - -### via the command line -1. Cloning the [EUREC$^4$A-Intake catalog](https://github.com/eurec4a/eurec4a-intake) repository with - ```bash - git clone git@github.com:eurec4a/eurec4a-intake.git - ``` - You might be asked to [set up an SSH key for your GitHub account](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent). -2. Create a new branch - ```bash - git checkout -b - ``` -3. 
Add new catalog entry - The catalog contains two types of entries: - - references to sub-catalogs - - references to a dataset - - The surface flux dataset from the research vessel Meteor is accessible via - ```python - cat.Meteor.surface_fluxes - ``` - and is saved in the file [Meteor/main.yaml](https://github.com/eurec4a/eurec4a-intake/blob/master/Meteor/main.yaml). - The radar dataset, which is more complex and contains several subsets is accessible via - ```python - cat.Meteor.LIMRAD94.low_res - cat.Meteor.LIMRAD94.high_res - ``` - For the creation of the `LIMRAD94` subset, a sub-catalog reference has been created in [Meteor/main.yaml](https://github.com/eurec4a/eurec4a-intake/blob/master/Meteor/main.yaml). The final reference to the dataset is added in [Meteor/LIMRAD94.yaml](https://github.com/eurec4a/eurec4a-intake/blob/master/Meteor/LIMRAD94.yaml) - - Depending on the complexity of the dataset, an entry can be directly added to the `main.yaml` file of the respective platform/simulation - ```bash - vi /main.yaml - ``` - - or to a (new) instrument specific file - - ``` - vi /.yaml - ``` - - The reference has the following format: - ```yaml - plugins: - source: - - module: intake_xarray - - sources: - : - description: - driver: opendap - args: - auth: null - urlpath: - chunks: {} - engine: netcdf4 - ``` - - In case your dataset has been published on AERIS, the THREDDS link to your dataset can be determined by finding your dataset at [https://observations.ipsl.fr/aeris/eurec4a-data/](https://observations.ipsl.fr/aeris/eurec4a-data/). To retrieve the THREDDS link, replace `https://observations.ipsl.fr/aeris/eurec4a-data/` with `https://observations.ipsl.fr/thredds/dodsC/EUREC4A/`. You can check if the link is correct by opening it e.g. directly with `xarray.open_dataset()` or [Panoply](https://www.giss.nasa.gov/tools/panoply/) by opening a `Remote Dataset`. 
- - A sub-catalog reference can be created with - ```yaml - sources: - : - args: - path: "{{CATALOG_DIR}}/.yaml" - description: '' - driver: yaml_file_cat - metadata: {} - ``` - - Please also check out entries already present in the EUREC$^4$A-Intake catalog. - - Finally those changes need to be staged and committed. - ```bash - git add -p - ``` - ```bash - git commit -m "" - ``` - -4. Push branch to GitHub - ```bash - git push --set-upstream origin - ``` - -5. Create pull request on GitHub - A pull request can be started on the GitHub webpage. After the pull request has been submitted, the review process will start. To accelerate the process, please make sure all tests for your pull request succeed. The status of the tests are shown at the bottom of your pull request. - -### via GitHub web interface - -1. Visit EUREC$^4$A-Intake catalog repository - Go to [https://github.com/eurec4a/eurec4a-intake](https://github.com/eurec4a/eurec4a-intake) -2. Login with your GitHub credentials -3. Select the platform/simulation/product for which a new dataset entry shall be added - image -4. Edit image - `main.yaml` and add a reference to the dataset if it is simple and does not contain different subsets (e.g. resolutions, frequencies, sensors, dimensions): - ```yaml - plugins: - source: - - module: intake_xarray - - sources: - : - description: - driver: opendap - args: - auth: null - urlpath: - chunks: {} - engine: netcdf4 - ``` - (see 3. of **via command line** for more details -5. Save changes and create pull request - image -6. Check if the automatic tests that are run on your edits succeed. You can access the tests e.g. via the Action Tab (image -). -7. A reviewer will look at the pull request and will discuss any additional steps necessary to merge the changes to the main repository. +Explanations on how to use this interactive book and its underlying intake catalog, and on how to contribute to and cite this work and its underlying datasets. 
diff --git a/how_to_eurec4a/howto_add_dataset.md b/how_to_eurec4a/howto_add_dataset.md new file mode 100644 index 0000000..974bfd0 --- /dev/null +++ b/how_to_eurec4a/howto_add_dataset.md @@ -0,0 +1,122 @@ +# How-To add new datasets + +The [EUREC$^4$A-Intake catalog](https://github.com/eurec4a/eurec4a-intake) is the dataset collection of the EUREC$^4$A and ATOMIC field campaigns. It is a collection of `yaml` files that contain references to the dataset storage locations. + +Datasets should be added by following these steps: + +## via the command line +1. Clone the [EUREC$^4$A-Intake catalog](https://github.com/eurec4a/eurec4a-intake) repository with + ```bash + git clone git@github.com:eurec4a/eurec4a-intake.git + ``` + You might be asked to [set up an SSH key for your GitHub account](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent). +2. Create a new branch + ```bash + git checkout -b + ``` +3. Add new catalog entry + The catalog contains two types of entries: + - references to sub-catalogs + - references to a dataset + + The surface flux dataset from the research vessel Meteor is accessible via + ```python + cat.Meteor.surface_fluxes + ``` + and is saved in the file [Meteor/main.yaml](https://github.com/eurec4a/eurec4a-intake/blob/master/Meteor/main.yaml). + The radar dataset, which is more complex and contains several subsets, is accessible via + ```python + cat.Meteor.LIMRAD94.low_res + cat.Meteor.LIMRAD94.high_res + ``` + For the creation of the `LIMRAD94` subset, a sub-catalog reference has been created in [Meteor/main.yaml](https://github.com/eurec4a/eurec4a-intake/blob/master/Meteor/main.yaml). 
The final reference to the dataset is added in [Meteor/LIMRAD94.yaml](https://github.com/eurec4a/eurec4a-intake/blob/master/Meteor/LIMRAD94.yaml). + + Depending on the complexity of the dataset, an entry can be directly added to the `main.yaml` file of the respective platform/simulation + ```bash + vi /main.yaml + ``` + + or to a (new) instrument-specific file + + ``` + vi /.yaml + ``` + + The reference has the following format: + ```yaml + plugins: + source: + - module: intake_xarray + + sources: + : + description: + driver: opendap + args: + auth: null + urlpath: + chunks: {} + engine: netcdf4 + ``` + + In case your dataset has been published on AERIS, the THREDDS link to your dataset can be determined by finding your dataset at [https://observations.ipsl.fr/aeris/eurec4a-data/](https://observations.ipsl.fr/aeris/eurec4a-data/). To retrieve the THREDDS link, replace `https://observations.ipsl.fr/aeris/eurec4a-data/` with `https://observations.ipsl.fr/thredds/dodsC/EUREC4A/`. You can check whether the link is correct by opening it, e.g. directly with `xarray.open_dataset()` or in [Panoply](https://www.giss.nasa.gov/tools/panoply/) by opening a `Remote Dataset`. + + A sub-catalog reference can be created with + ```yaml + sources: + : + args: + path: "{{CATALOG_DIR}}/.yaml" + description: '' + driver: yaml_file_cat + metadata: {} + ``` + + Please also check out entries already present in the EUREC$^4$A-Intake catalog. + + Finally, those changes need to be staged and committed. + ```bash + git add -p + ``` + ```bash + git commit -m "" + ``` + +4. Push branch to GitHub + ```bash + git push --set-upstream origin + ``` + +5. Create pull request on GitHub + A pull request can be started on the GitHub webpage. After the pull request has been submitted, the review process will start. To accelerate the process, please make sure all tests for your pull request succeed. The status of the tests is shown at the bottom of your pull request. + +## via GitHub web interface + +1. 
Visit EUREC$^4$A-Intake catalog repository + Go to [https://github.com/eurec4a/eurec4a-intake](https://github.com/eurec4a/eurec4a-intake) +2. Log in with your GitHub credentials +3. Select the platform/simulation/product for which a new dataset entry shall be added + image +4. Edit image + `main.yaml` and add a reference to the dataset if it is simple and does not contain different subsets (e.g. resolutions, frequencies, sensors, dimensions): + ```yaml + plugins: + source: + - module: intake_xarray + + sources: + : + description: + driver: opendap + args: + auth: null + urlpath: + chunks: {} + engine: netcdf4 + ``` + (see step 3 of **via the command line** for more details) +5. Save changes and create pull request + image +6. Check if the automatic tests that are run on your edits succeed. You can access the tests e.g. via the Actions tab (image +). +7. A reviewer will look at the pull request and will discuss any additional steps necessary to merge the changes to the main repository. diff --git a/how_to_eurec4a/howto_cite_dataset.md b/how_to_eurec4a/howto_cite_dataset.md new file mode 100644 index 0000000..7c6d022 --- /dev/null +++ b/how_to_eurec4a/howto_cite_dataset.md @@ -0,0 +1,6 @@ +# How-To cite the EUREC$^4$A datasets + +This How-To book is based on two main components, the EUREC$^4$A Intake catalog and the datasets referenced therein. + +The EUREC$^4$A Intake catalog is published at [![](https://zenodo.org/badge/doi/10.5281/zenodo.8422321.svg)](https://doi.org/10.5281/zenodo.8422321) and can be cited accordingly. +Please note, however, that this is a collection DOI and will always point to the most recently published version of the catalog.
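
Note on the AERIS step in `howto_add_dataset.md`: the prefix substitution that turns an AERIS listing URL into a THREDDS (OPeNDAP) URL can be sketched in a few lines of Python. This is a minimal sketch and not part of the patch. The two prefixes are taken from the how-to text, while the helper name `aeris_to_thredds` and the example path are hypothetical and only serve as illustration.

```python
# Prefixes as quoted in the how-to text.
AERIS_PREFIX = "https://observations.ipsl.fr/aeris/eurec4a-data/"
THREDDS_PREFIX = "https://observations.ipsl.fr/thredds/dodsC/EUREC4A/"


def aeris_to_thredds(url: str) -> str:
    """Swap the AERIS prefix of `url` for the THREDDS prefix.

    Hypothetical helper: it only rewrites the string and does not
    check that the resulting link exists on the server.
    """
    if not url.startswith(AERIS_PREFIX):
        raise ValueError(f"not an AERIS EUREC4A data URL: {url}")
    return THREDDS_PREFIX + url[len(AERIS_PREFIX):]


# Hypothetical path, for illustration only.
example = AERIS_PREFIX + "SOME_PLATFORM/some_instrument/some_file.nc"
thredds_url = aeris_to_thredds(example)
print(thredds_url)
```

The resulting string is what would go into the `urlpath` field of the catalog entry. As the how-to suggests, the link can then be checked by passing it to `xarray.open_dataset()`, which requires network access and is therefore left out of the sketch.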