diff --git a/docs/_toc.yml b/docs/_toc.yml index 0e182b2..d57d292 100644 --- a/docs/_toc.yml +++ b/docs/_toc.yml @@ -22,4 +22,4 @@ parts: - file: content/03/03_00_Clip_to_vec - file: content/03/04_00_Spyndex - file: content/03/05_00_Count_valid - - file: content/03/06_00_STAC_Data + - file: content/03/06_00_STAC_data diff --git a/docs/content/03/06_00_STAC_Data.md b/docs/content/03/06_00_STAC_Data.md deleted file mode 100644 index d5d4303..0000000 --- a/docs/content/03/06_00_STAC_Data.md +++ /dev/null @@ -1,5 +0,0 @@ -# ...load data from remote STAC Catalogs? - -_Coming soon..._ - -![https://media.giphy.com/media/26vUKLfpzAS2QIVVK/giphy.gif](https://media.giphy.com/media/26vUKLfpzAS2QIVVK/giphy.gif) diff --git a/docs/content/03/06_00_STAC_data.ipynb b/docs/content/03/06_00_STAC_data.ipynb new file mode 100644 index 0000000..df8498f --- /dev/null +++ b/docs/content/03/06_00_STAC_data.ipynb @@ -0,0 +1,2341 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ...load data from remote STAC Catalogs?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To load data products from remote [SpatioTemporal Asset Catalogs (STAC)](https://stacspec.org/en/), we can use the `load_from_stac` function provided by the `sdc-tools` package. Currently, this function supports loading data products hosted by [Microsoft Planetary Computer (MPC)](https://planetarycomputer.microsoft.com/catalog) and [Digital Earth Africa (DEA)](https://explorer.digitalearth.africa/).\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "```{warning}\n", + "Please be aware that working with remote data products can be quite inefficient. This is especially true if the data is loaded with inappropriately chosen parameters. Before loading a data product, you should get to know its basic characteristics. 
If you know the answer to at least the following questions, you are good to go:\n", + "- **What is the pixel spacing / resolution of the data?** \n", + " - Override the default `resolution` parameter if necessary.\n", + "- **Is the data categorical or continuous?** E.g., land cover is categorical, while spectral bands are continuous.\n", + " - If the data is categorical, you should override the default `resampling` method with `'nearest'`.\n", + "- **In which datatype is the data stored, and are there differences between the bands?** Common types are `uint8`, `uint16` and `float32`, for example. \n", + " - If there are differences in datatype between the bands you're interested in, it's probably best to load these separately by specifying the `bands` parameter and using the appropriate `dtype` for each band.\n", + "\n", + "You should get an idea of how to handle these cases by having a look at the examples below. If something is unclear, please let me know!\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "```{note}\n", + "In both examples, we will use the bounding box of an entire SALDi site. If you have a specific area of interest, you can replace the bounding box with your own, e.g., by using the utility function `sdc.vec.get_vec_bounds`. In general, it is recommended to try things out on a small subset first before scaling up to larger areas and time periods.\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example: Planetary Computer" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, we load the `MCD64A1 Version 6.1 Burned Area data product` from MPC. 
You can find more details about this data product [here](https://planetarycomputer.microsoft.com/dataset/modis-64A1-061) on the MPC website and [here](https://lpdaac.usgs.gov/products/mcd64a1v061/) from the original data source, the USGS Land Processes Distributed Active Archive Center (LP DAAC)." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[INFO] odc.stac.load parameters: {'crs': 'EPSG:4326', 'resolution': 0.005, 'resampling': 'nearest', 'chunks': {'time': 'auto', 'y': 'auto', 'x': 'auto'}}\n" + ] + }, + { + "data": { + "text/html": [ + "
<xarray.DataArray 'Burn_Date' (time: 71, latitude: 220, longitude: 260)> Size: 8MB\n", + "dask.array<Burn_Date, shape=(71, 220, 260), dtype=int16, chunksize=(71, 220, 260), chunktype=numpy.ndarray>\n", + "Coordinates:\n", + " * latitude (latitude) float64 2kB -24.9 -24.91 -24.91 ... -25.99 -26.0\n", + " * longitude (longitude) float64 2kB 30.75 30.76 30.76 ... 32.04 32.04 32.05\n", + " spatial_ref int32 4B 4326\n", + " * time (time) datetime64[ns] 568B 2018-01-01 2018-02-01 ... 2023-12-01\n", + "Attributes:\n", + " nodata: -1
<xarray.DataArray 'Burn_Date' (latitude: 220, longitude: 260)> Size: 458kB\n", + "dask.array<mean_agg-aggregate, shape=(220, 260), dtype=float64, chunksize=(220, 260), chunktype=numpy.ndarray>\n", + "Coordinates:\n", + " * latitude (latitude) float64 2kB -24.9 -24.91 -24.91 ... -25.99 -26.0\n", + " * longitude (longitude) float64 2kB 30.75 30.76 30.76 ... 32.04 32.04 32.05\n", + " spatial_ref int32 4B 4326\n", + "Attributes:\n", + " nodata: -1
<xarray.DataArray 'mask' (time: 1, latitude: 10000, longitude: 10500)> Size: 105MB\n", + "dask.array<mask, shape=(1, 10000, 10500), dtype=uint8, chunksize=(1, 10000, 10500), chunktype=numpy.ndarray>\n", + "Coordinates:\n", + " * latitude (latitude) float64 80kB -29.0 -29.0 -29.0 ... -30.0 -30.0 -30.0\n", + " * longitude (longitude) float64 84kB 26.45 26.45 26.45 ... 27.5 27.5 27.5\n", + " spatial_ref int32 4B 4326\n", + " * time (time) datetime64[ns] 8B 2019-01-01\n", + "Attributes:\n", + " nodata: 0
<xarray.DataArray 'mask' (latitude: 10000, longitude: 10500)> Size: 105MB\n", + "dask.array<getitem, shape=(10000, 10500), dtype=uint8, chunksize=(10000, 10500), chunktype=numpy.ndarray>\n", + "Coordinates:\n", + " * latitude (latitude) float64 80kB -29.0 -29.0 -29.0 ... -30.0 -30.0 -30.0\n", + " * longitude (longitude) float64 84kB 26.45 26.45 26.45 ... 27.5 27.5 27.5\n", + " spatial_ref int32 4B 4326\n", + " time datetime64[ns] 8B 2019-01-01\n", + "Attributes:\n", + " nodata: 0