This is the official Research Compendium (RC) documentation, with all the materials (code, data, and computing environments) needed for the reproduction, replication, and evaluation of the results presented in the paper:

Marujo, R. F. B., Carlos, F. M., Costa, R. W., Arcanjo, J. S., Fronza, J. G., Soares, A. R., Queiroz, G. R., Ferreira, K. R. (2023). A reproducible and replicable approach for harmonizing Landsat-8 and Sentinel-2 images. Frontiers in Remote Sensing, v. 4. doi: 10.3389/frsen.2023.1254242
The organization defined for this RC aims to facilitate the use of the code implemented to generate the results presented in the article. The processing code is made available as a set of examples that can be executed without difficulty, making it possible for others to reproduce and replicate the study.
This code is stored in the 📁 analysis directory, which has three subdirectories:
- 📁 analysis/notebook: Directory with the Jupyter Notebook version of the processing flow implemented in the article associated with this RC. For more information, see the Reference Section Processing Scripts;
- 📁 analysis/pipeline: Directory with the Dagster version of the processing flow implemented in the article associated with this RC. For more information, see the Reference Section Processing Scripts;
- 📁 analysis/data: Directory for storing the generated input and output data. It contains the following subdirectories:
    - 📁 examples: Directory with the data (input/output) of the examples provided in this RC. For more information about the examples, see Chapter Data Processing;
    - 📁 original_scene_ids: Directory for storing the original scene ID index files used to produce the article results. This data can be applied to the code provided in the analysis/notebook and analysis/pipeline directories to reproduce the article results.
Because of the size of the files, the input data is not, by default, stored directly in the data directory (analysis/data/). Instead, as described in detail in the Reference Section Helper scripts, it is made available in the GitHub Release Assets of the RC repository.
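As an illustration only (the supported download workflow is the example-toolkit script described in the Reference Section Helper scripts), release assets can also be fetched manually with the GitHub CLI. The release tag and file pattern below are placeholders, not the actual asset names:

```shell
# Hypothetical manual download of input data from the RC repository's
# GitHub Release Assets using the GitHub CLI (gh).
# "v1.0" and "*.zip" are placeholders; check the repository releases
# for the real tag and asset names.
gh release download v1.0 \
  --repo brazil-data-cube/compendium-harmonization \
  --pattern "*.zip" \
  --dir analysis/data/
```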
To build the processing scripts available in the analysis directory, we have created several software libraries and auxiliary scripts. The source code for some of these tools is available in the 📁 tools directory, which has four subdirectories:
- 📁 tools/auxiliary-library: Source code of the research-processing library, which provides the high-level operations for processing the data in this RC;
- 📁 tools/calculate-checksum: Source code of the calculate-checksum script, created to calculate the checksum of the files in this RC before sharing;
- 📁 tools/example-toolkit: Source code of the example-toolkit script, created to facilitate the download and validation of example data from the GitHub Release Assets;
- 📁 tools/github-asset-upload: Source code of the github-asset-upload script, created to facilitate the upload of example data to the GitHub Release Assets.
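To give an idea of what the checksum step involves (this is a generic sketch with standard tools, not the calculate-checksum script itself; the file name is illustrative):

```shell
# Illustrative only: compute and verify a SHA-256 checksum for a file
# before sharing, similar in spirit to what calculate-checksum automates.
echo "example content" > sample.txt

# Record the checksum alongside the file...
sha256sum sample.txt > sample.txt.sha256

# ...and later verify the file against the stored checksum.
sha256sum --check sample.txt.sha256   # prints "sample.txt: OK" on success
```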
Another directory available in this RC is composes. It contains the Docker Compose configuration files for the computing environments needed to run the examples available in this RC. For more information about the RC computing environments, see the Reference Section Computing Environments. The composes directory has two subdirectories:
- 📁 composes/minimal: Directory with the Docker Compose files to run the Minimal example provided in this RC;
- 📁 composes/replication: Directory with the Docker Compose files to run the Replication example provided in this RC.
For more information about the examples, see Section Data Processing.
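Assuming each subdirectory holds a standard Compose file (the file name below is an assumption; check the composes/minimal directory and the Makefile for the actual entry points), the environments can be started in the usual Docker Compose way:

```shell
# Hypothetical invocation: start the Minimal example environment in the
# background. The compose file name is assumed, not taken from the repo.
docker compose -f composes/minimal/docker-compose.yml up -d

# ...run the example, then tear the environment down again:
docker compose -f composes/minimal/docker-compose.yml down
```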
Complementary to the composes directory is the docker directory, which holds the Dockerfile files used to build the environments referenced by the Docker Compose configurations. It has two subdirectories:
- 📁 docker/notebook: Directory with the Dockerfile of the environment required to run the Jupyter Notebook version of the RC processing flow;
- 📁 docker/pipeline: Directory with the Dockerfile of the environment required to run the Dagster version of the RC processing flow.
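Images from these Dockerfiles can be built with plain docker build commands; a sketch follows, where the image tags are placeholders (the Makefile likely wraps the actual build invocations):

```shell
# Hypothetical build commands; "rc-notebook" and "rc-pipeline" are
# placeholder image tags, not names defined by the repository.
docker build -t rc-notebook -f docker/notebook/Dockerfile docker/notebook
docker build -t rc-pipeline -f docker/pipeline/Dockerfile docker/pipeline
```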
In addition to these directories, some files are fundamental to using the materials in this RC:
- Vagrantfile and bootstrap.sh: Vagrant files used to build a virtual machine with the complete environment for running the processing scripts available in the analysis directory. For more information, see the Reference Section Computing Environments - Virtual Machine with Vagrant;
- Makefile: GNU Make definition file that makes it easier to use the materials available in the analysis and composes directories. The setenv.sh file is used by the Makefile to define the user who will run the Jupyter Notebook environment. More information is provided in Section Data Processing.
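The standard Vagrant workflow applies to these files; a sketch (the provisioning via bootstrap.sh is wired up in the Vagrantfile, and the available Make targets are defined in the repository's Makefile):

```shell
# Typical Vagrant workflow: create and provision the VM described by the
# Vagrantfile (bootstrap.sh runs as the provisioning step).
vagrant up

# Open a shell inside the VM to run the processing scripts.
vagrant ssh

# Remove the VM when finished.
vagrant destroy

# The Make targets themselves are defined in the Makefile; inspecting it
# (e.g. with "make -n <target>") shows what each target would execute.
```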
To learn more about the materials, scripts, computing environments, and data of this RC, please refer to the official documentation: https://github.com/brazil-data-cube/compendium-harmonization
- Code: GPLv3;
- Data: CC-0;
- Text and figures: CC-BY-4.0.