This is a collection of notebooks giving a workflow for applying Triple Collocation (TC) to evapotranspiration (ET) data to derive data uncertainties and subsequently evaluate agreement between ET data sets. The workflow has six parts: two set up the collocated ET data sets, two apply TC to the ET data sets, one explores the relative agreement between the ET data sets, and one explores the TC and agreement results after aggregating the ET data sets to certain geographic regions. The notebooks should be run in order and are as follows:
- Compiling the ET Monthly Data Sets
- Regridding the ET Data Sets
- Applying Triple Collocation Uncertainty Analysis
- Applying Extended Collocation Uncertainty Analysis
- Assessing the Relative Agreement between ET Data Sets
- Utilizing TC and Relative Agreement in Regional Analyses
Each notebook details the methods used in the workflow and discusses the results.
To perform the analysis in the workflows, three functions were created to calculate the error variances using TC. These functions are placed within their own notebooks to accommodate the explanation of the mathematical background. The functions are:
- `tc_covar`, which computes the TC error variances using the covariance method.
- `ec_covar` and `ec_covar_multi`, which compute the Extended Collocation (EC) error covariance matrix and an optional unbiased SNR for a single collocated input or multiple collocated inputs, respectively.
Note that `ec_covar_multi` is simply a more generic version of `tc_covar` and `ec_covar`. It produces the exact same results, but adds the ability to perform EC along multiple collocated inputs simultaneously.
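For background, the covariance method estimates each data set's error variance from the 3×3 sample covariance matrix of the collocated triplet. Below is a minimal sketch of that calculation; the function name and implementation are illustrative stand-ins, not the repo's actual `tc_covar`.

```python
import numpy as np

def tc_covar_sketch(x, y, z):
    """Estimate TC error variances for three collocated data sets using
    the covariance notation of TC. Illustrative only; not the repo's
    tc_covar implementation."""
    # 3x3 sample covariance matrix of the collocated triplet.
    q = np.cov(np.vstack([x, y, z]))
    # Covariance-notation TC estimates of each data set's error variance.
    var_x = q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2]
    var_y = q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2]
    var_z = q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1]
    return var_x, var_y, var_z
```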
Besides applying TC to ET data sets, a set of example notebooks was produced to test the efficacy of TC on theoretical data sets and ensure it produces the expected results. It is recommended to check out these example notebooks for additional background on the TC method. Additionally, an example notebook was created showing how to implement `dask` with the created TC functions for use with large, out-of-memory data sets.
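The general pattern looks something like the sketch below, assuming the collocated ET data sets are stored as variables in a single `xarray` dataset backed by dask arrays; the file name, variable names, and chunk sizes are hypothetical.

```python
import xarray as xr

# Open the collocated ET data sets lazily, chunked in space but not along
# time, since time is the core dimension of the TC calculation
# (file name, variable names, and chunk sizes are hypothetical).
ds = xr.open_dataset("collocated_et.nc", chunks={"lat": 100, "lon": 100})

# Apply a per-pixel TC error-variance function (e.g., the tc_covar_sketch
# defined above) lazily over the time dimension; dask evaluates the result
# chunk by chunk, so the full arrays never need to fit in memory.
var_a, var_b, var_c = xr.apply_ufunc(
    tc_covar_sketch,
    ds["et_a"], ds["et_b"], ds["et_c"],
    input_core_dims=[["time"], ["time"], ["time"]],
    output_core_dims=[[], [], []],
    vectorize=True,
    dask="parallelized",
    output_dtypes=[float, float, float],
)
```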
To run the notebooks, they must first be downloaded by running:

```
git clone https://github.com/hytest-org/workflow-2023-doore-triple-collocation.git
```
Then the dependencies can be installed from the `conda` environment file included in the repo via:

```
conda env create -f environment.yml
```

This will create an environment called `hytest_tc_workflow`, which can be activated for use in running the workflows.
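For example, before launching the notebooks:

```
conda activate hytest_tc_workflow
```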
- Keith Doore - Lead Author - USGS Central Midwest Water Science Center
- Thomas M. Over - Contributing Author - USGS Central Midwest Water Science Center
- Timothy O. Hodson - Contributing Author - USGS Water Resources Mission Area
- Sydney S. Foks - Contributing Author - USGS Water Resources Mission Area
This project is licensed under the Creative Commons CC0 1.0 Universal License.
TBD