Skip to content
This repository has been archived by the owner on Sep 17, 2024. It is now read-only.

Commit

Permalink
Browse files Browse the repository at this point in the history
  • Loading branch information
jnywong committed Sep 17, 2024
2 parents 3a9f2f9 + 9d4c037 commit 2605c97
Show file tree
Hide file tree
Showing 27 changed files with 877 additions and 2,596 deletions.
62 changes: 40 additions & 22 deletions book/02_Software_Tools/02_Data_Visualization_Tools.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,14 +41,12 @@ from geoviews import opts
### Displaying a basemap

<!-- #region jupyter={"source_hidden": true} -->
- Principal utility is `gv.tile_sources`
- Use the method `opts` to specify optional settings
- Bokeh menu at right enables interactive exploration
A *basemap* or *tile layer* is useful when displaying vector or raster data because it allows us to overlay the relevant geospatial data on a familar geographical map as a background. The principal utility is we'll use is `gv.tile_sources`. We can use the method `opts` to specify additional confirguration settings. Below, we use the *Open Street Map (OSM)* Web Map Tile Service to create the object `basemap`. When we display the repr for this object in the notebook cell, the Bokeh menu at right enables interactive exploration.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
basemap = gv.tile_sources.OSM.opts(width=600, height=400)
basemap
basemap # When displayed, this basemap can be zoomed & panned using the menu at the right
```

### Plotting points
Expand All @@ -63,11 +61,7 @@ print(tokyo_lonlat)
```

<!-- #region jupyter={"source_hidden": true} -->
+ `geoviews.Points` accepts a list of tuples (each of the form `(x, y)`) to plot.
+ Use the OpenStreetMap tiles from `gv.tile_sources.OSM` as a basemap.
+ Overlay using the Holoviews operator `*`
+ Define the options using `geoviews.opts`
+ ? find a way to initialize the zoom level sensibly?
The class `geoviews.Points` accepts a list of tuples (each of the form `(x, y)`) & constructs a `Points` object that can be displayed. We can overlay the point created in the OpenStreetMap tiles from `basemap` using the `*` operator in Holoviews. We can also use `geoviews.opts` to set various display preferences for these points.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
Expand All @@ -77,6 +71,7 @@ point_opts = opts.Points(
alpha=0.5,
color='red'
)
print(type(tokyo_point))
```

```python jupyter={"source_hidden": true}
Expand All @@ -98,41 +93,64 @@ point_opts = opts.Points(
(assuming $x_{\mathrm{max}}>x_{\mathrm{min}}$ & $y_{\mathrm{max}}>y_{\mathrm{min}}$) is as a single 4-tuple
$$(x_{\mathrm{min}},y_{\mathrm{min}},x_{\mathrm{max}},y_{\mathrm{max}}),$$
i.e., the lower,left corner coordinates followed by the upper, right corner coordinates.

Let's create a simple function to generate a rectangle of a given width & height given the centre coordinate.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
# simple utility to make a rectangle of "half-width" dx & "half-height" dy & centred pt
def bounds(pt,dx,dy):
'''Returns rectangle represented as tuple (x_lo, y_lo, x_hi, y_hi)
given inputs pt=(x, y), half-width & half-height dx & dy respectively,
where x_lo = x-dx, x_hi=x+dx, y_lo = y-dy, y_hi = y+dy.
# simple utility to make a rectangle centered at pt of width dx & height dy
def make_bbox(pt,dx,dy):
'''Returns bounding-box represented as tuple (x_lo, y_lo, x_hi, y_hi)
given inputs pt=(x, y), width & height dx & dy respectively,
where x_lo = x-dx/2, x_hi=x+dx/2, y_lo = y-dy/2, y_hi = y+dy/2.
'''
return tuple(coord+sgn*delta for sgn in (-1,+1) for coord,delta in zip(pt, (dx,dy)))
return tuple(coord+sgn*delta for sgn in (-1,+1) for coord,delta in zip(pt, (dx/2,dy/2)))
```

<!-- #region jupyter={"source_hidden": true} -->
We can test the preceding function using the longitude-latitude coordinates of Marrakesh, Morocco.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
# Verify that the function bounds works as intended
marrakesh_lonlat = (-7.93, 31.67)
dlon, dlat = 0.25, 0.25
marrakesh_rect = bounds(marrakesh_lonlat, dlon, dlat)
print(marrakesh_rect)
dlon, dlat = 0.5, 0.25
marrakesh_bbox = make_bbox(marrakesh_lonlat, dlon, dlat)
print(marrakesh_bbox)
```

<!-- #region jupyter={"source_hidden": true} -->
+ `geoviews.Rectangles` accepts a list of bounding boxes (each described by a tuple of the form `(x_min, y_min, x_max, y_max)`) to plot.
The utility `geoviews.Rectangles` accepts a list of bounding boxes (each described by a tuple of the form `(x_min, y_min, x_max, y_max)`) to plot. We can again use `geoviews.opts` to tailor the rectangle as needed.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
rectangle = gv.Rectangles([marrakesh_bbox])
rect_opts = opts.Rectangles(
line_width=0,
alpha=0.25,
alpha=0.1,
color='red'
)
```

<!-- #region jupyter={"source_hidden": true} -->
We can plot a point for Marrakesh as before using `geoviews.Points` (customized using `geoviews.opts`).
<!-- #endregion -->

```python jupyter={"source_hidden": true}
marrakesh_point = gv.Points([marrakesh_lonlat])
point_opts = opts.Points(
size=48,
alpha=0.25,
color='blue'
)
```

<!-- #region jupyter={"source_hidden": true} -->
Finally, we can overlay all these features on the basemap with the options applied.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
rectangle = gv.Rectangles([marrakesh_rect])
(basemap * rectangle).opts( rect_opts )
(basemap * rectangle * marrakesh_point).opts( rect_opts, point_opts )
```

<!-- #region jupyter={"source_hidden": true} -->
Expand Down
255 changes: 255 additions & 0 deletions book/03_Using_NASA_EarthData/01_Using_OPERA_DIST_Products.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,255 @@
---
jupyter:
jupytext:
text_representation:
extension: .md
format_name: markdown
format_version: '1.3'
jupytext_version: 1.16.2
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
---

# Using OPERA DIST Products

<!-- #region editable=true slideshow={"slide_type": ""} -->
## The OPERA project
<!-- #endregion -->

<!-- #region editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""} -->
From the [OPERA (Observational Products for End-Users from Remote Sensing Analysis)](https://www.jpl.nasa.gov/go/opera) project:

>Started in April 2021, the Observational Products for End-Users from Remote Sensing Analysis (OPERA) project at the Jet Propulsion Laboratory collects data from satellite radar and optical instruments to generate six product suites:
>
> + a near-global Surface Water Extent product suite
> + a near-global Surface Disturbance product suite
> + a near-global Radiometric Terrain Corrected product
> + a North America Coregistered Single Look complex product suite
> + a North America Displacement product suite
> + a North America Vertical Land Motion product suite
That is, OPERA is a NASA initiative that takes, e.g., optical or radar remote-sensing data gathered from satellites and produces a variety of pre-processed data sets for public use. OPERA products are not raw satellite images; they are the result of algorithmic classification to determine, e.g., which land regions contain water or where vegetation has been displaced. The raw satellite images are collected from measurements made by the instruments onboard the Sentinel-1 A/B, Sentinel-2 A/B, and Landsat-8/9 satellite missions (hence the term *HLS* for "*Harmonized Landsat-Sentinel*" in numerous product descriptions).
<!-- #endregion -->

---


## The OPERA Land Surface Disturbance (DIST) product

<!-- #region editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""} -->
One of these OPERA data products is the *Land Surface Disturbance (DIST)* product (more fully described in the [OPERA DIST HLS product specification](https://lpdaac.usgs.gov/documents/1766/OPERA_DIST_HLS_Product_Specification_V1.pdf)). The DIST product maps per pixel vegetation disturbance (specifically, vegetation cover loss) from Harmonized Landsat-8 and Sentinel-2 A/B (HLS) scenes. One application of this data is to quantify damage due to wildfires in forests. Vegetation disturbance is mapped when there is an indicated decrease in vegetation cover within an HLS pixel. The DIST_ALERT product is released at regular intervals (the same cadence of HLS imagery, roughly every 12 days over a given tile/region); the DIST_ANN product summarizes disturbance measurements over a year.

The DIST products quanitfy surface reflectance (SR) data acquired from the Operational Land Imager (OLI) aboard the Landsat-8 remote sensing satellite and the Multi-Spectral Instrument (MSI) aboard the Sentinel-2 A/B remote-sensing satellite. The HLS products used to generate DIST data are distributed over tiles in projected map coordinates aligned with the [Military Grid Reference System (MGRS)](https://en.wikipedia.org/wiki/Military_Grid_Reference_System). Each tile covers 109.8 $km^2$ divided into 3660 rows and 3660 columns at 30 meter pixel spacing with tiles overlapping neighbors by 4900 meters in each direction (the details are fully described in the [product specification](https://lpdaac.usgs.gov/documents/1766/OPERA_DIST_HLS_Product_Specification_V1.pdf)).

The OPERA DIST products are distributed as [Cloud Optimized GeoTIFFs](https://www.cogeo.org/); in practice, this means that different bands are stored in distinct TIFF files. The TIFF specification does permit storage of multidimensional arrays in a single file; storing distinct *bands* in distinct TIFF files allows files to be downloaded independently.
<!-- #endregion -->

---

<!-- #region editable=true slideshow={"slide_type": ""} -->
### Band 1: Maximum Vegetation Loss Anomaly Value (VEG_ANOM_MAX)
<!-- #endregion -->

<!-- #region editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""} -->
Let's examine a local file with an example of DIST-ALERT data. The file contains the first band: the *maximum vegetation loss anomaly*. For each pixel, this is a value between 0% and 100% representing the percentage difference between current observed vegetation cover and a historical reference value. That is, a value of 100 corresponds to complete loss of vegetation within a pixel and a value of 0 corresponds to no loss of vegetation. The pixel values are stored as 8-bit unsigned integers (UInt8) because the pixel values need only range between 0 and 100. A pixel value of 255 indicates missing data, namely that the HLS data was unable to distill a maximum vegetation anomaly value for that pixel. Of course, using 8-bit unsigned integer data is a lot more efficient for storage and for transmitting data across a network (as compared to, e.g., 32- or 64-bit floating-point data).
<!-- #endregion -->

<!-- #region jupyter={"source_hidden": true} -->
Let's begin by importing the required libraries. Notice we're also pulling in some classes from the Bokeh library to make interactive plots a little nicer.
<!-- #endregion -->

```python editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""}
# Notebook dependencies
import warnings
warnings.filterwarnings('ignore')
from pathlib import Path
import rioxarray as rio
import hvplot.xarray
from bokeh.models import FixedTicker
import geoviews as gv
gv.extension('bokeh')
```

<!-- #region jupyter={"source_hidden": true} -->
We'll read the data from a local file and load it into an Xarray `DataArray` using `rioxarray.open_rasterio`. We'll do some relabelling to label the coordinates appropriately and we'll extract the CRS (coordinate reference system).
<!-- #endregion -->

```python jupyter={"source_hidden": true}
LOCAL_PATH = Path('..') / 'assets' / 'OPERA_L3_DIST-ALERT-HLS_T10TEM_20220815T185931Z_20220817T153514Z_S2A_30_v0.1_VEG-ANOM-MAX.tif'
data = rio.open_rasterio(LOCAL_PATH)
crs = data.rio.crs
data = data.rename({'x':'easting', 'y':'northing', 'band':'band'}).squeeze()
```

```python jupyter={"source_hidden": true}
data
```

```python jupyter={"source_hidden": true}
crs
```

<!-- #region jupyter={"source_hidden": true} -->
Before generating a plot, let's create a basemap using [ESRI](https://en.wikipedia.org/wiki/Esri) tiles.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
# Creates basemap
base = gv.tile_sources.ESRI.opts(width=1000, height=1000, padding=0.1)
```

<!-- #region jupyter={"source_hidden": true} -->
We'll also use dictionaries to capture the bulk of the plotting options we'll use in conjunction with `.hvplot.image` later on.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
image_opts = dict(
x='easting',
y='northing',
rasterize=True,
dynamic=True,
frame_width=500,
frame_height=500,
aspect='equal',
cmap='hot_r',
clim=(0, 100),
alpha=0.8
)
layout_opts = dict(
xlabel='Longitude',
ylabel='Latitude'
)
```

<!-- #region jupyter={"source_hidden": true} -->
Finally, we'll use the `DataArray.where` method to filter out missing pixels and the pixels that saw no change in vegetation, and we'll modify the options in `image_opts` and `layout_opts` to values particular to this dataset.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
veg_anom_max = data.where((data>0) & (data!=255))
image_opts.update(crs=data.rio.crs)
layout_opts.update(title=f"VEG_ANOM_MAX")
```

<!-- #region jupyter={"source_hidden": true} -->
This allows us to generate a meaningful plot.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
veg_anom_max.hvplot.image(**image_opts).opts(**layout_opts) * base
```

<!-- #region jupyter={"source_hidden": true} -->
In the resulting plot, the white and yellow pixels correspond to regions in which some deforestation has occurred, but not much. By contrast, the dark and black pixels correspond to regions that have lost almost all vegetation.
<!-- #endregion -->

---


### Band 2: Date of Initial Vegetation Disturbance (VEG_DIST_DATE)

<!-- #region editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""} -->
The DIST-ALERT products contain several bands (as summarized in the product specification). The second band we'll examine is the *date of initial vegetation disturbance* within the last year. This is stored as a 16-bit integer (Int16).

Let's use a local data file again, namely `OPERA_L3_DIST-ALERT-HLS_T10TEM_20220815T185931Z_20220817T153514Z_S2A_30_v0.1_VEG-DIST-DATE.tif`. The product specification describes how to file-naming conventions used; notable here is the *acquisition date & time* `20220815T185931`, i.e., almost 7pm (UTC) on August 15th, 2022.

We'll load and relabel the `DataArray` as before.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
LOCAL_PATH = Path('..') / 'assets' / 'OPERA_L3_DIST-ALERT-HLS_T10TEM_20220815T185931Z_20220817T153514Z_S2A_30_v0.1_VEG-DIST-DATE.tif'
data = rio.open_rasterio(LOCAL_PATH)
data = data.rename({'x':'easting', 'y':'northing', 'band':'band'}).squeeze()
```

<!-- #region jupyter={"source_hidden": true} -->
In this particular band, the value 0 indicates no disturbance in the last year and -1 is a sentinel value indicating missing data. Any positive value is the number of days since Dec. 31, 2020 in which the first disturbance is measured in that pixel. We'll filter out the non-positive values and preserve these meaningful values using `DataArray.where`.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
veg_dist_date = data.where(data>0)
```

<!-- #region jupyter={"source_hidden": true} -->
Let's examine the range of numerical values in `veg_dist_date` using `DataArray.min` and `DataArray.max`. Both of these methods will ignore pixels containing `nan` ("Not-a-Number") when computing the minimum and maximum.
<!-- #endregion -->

```python jupyter={"source_hidden": true}
d_min, d_max = int(veg_dist_date.min().item()), int(veg_dist_date.max().item())
print(f'{d_min=}\t{d_max=}')
```

<!-- #region jupyter={"source_hidden": true} -->
In this instance, the meaningful data lies between 247 and 592. Remember, this is the number of days elapsed since Dec. 31, 2020 when the first disturbance was observed in the last year. Since this data was acquired on Aug. 15, 2022, the only possible values would be between 227 and 592 days. So we need to recalibrate the colormap in the plot
<!-- #endregion -->

```python jupyter={"source_hidden": true}
image_opts.update(
clim=(d_min,d_max),
crs=data.rio.crs
)
layout_opts.update(title=f"VEG_DIST_DATE")
```

```python
veg_dist_date.hvplot.image(**image_opts).opts(**layout_opts) * base
```

<!-- #region jupyter={"source_hidden": true} -->
With thie colormap, the lightest pixels showed some signs of deforestation close to a year ago. By contrast, the black pixels first showed deforestation close to the time of data acquisition. This band, then, is useful for tracking progress of wildfires as they sweep across forests.
<!-- #endregion -->

---

<!-- #region editable=true slideshow={"slide_type": ""} -->
### Band 3: Vegetation Disturbance Status (VEG_DIST_STATUS)
<!-- #endregion -->

<!-- #region editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""} -->
Finally, let's take a look at a third band from the DIST-ALERT data product family, namely the *vegetation disturbance status*. These pixel values are stored as 8-bit unsigned integers; there are only 6 distinct values stored:
* **0:** No disturbance
* **1:** Provisional (**first detection**) disturbance with vegetation cover change <50%
* **2:** Confirmed (**recurrent detection**) disturbance with vegetation cover change < 50%
* **3:** Provisional disturbance with vegetation cover change ≥ 50%
* **4:** Confirmed disturbance with vegetation cover change ≥ 50%
* **255**: Missing data

A pixel value is flagged *provisionally* changed when the vegetation cover loss (disturbance) is first observed by a satellite; if the change is noticed again in subsequent HLS acquisitions over that pixel, then the pixel is flagged as *confirmed*.
<!-- #endregion -->

<!-- #region editable=true jupyter={"source_hidden": true} slideshow={"slide_type": ""} -->
We can use a local file as an example of this particular layer/band of the DIST-ALERT data. The code is the same as above, but do observe:
+ the data filtered reflects the meaning pixel values for this layer (i.e., `data<0` and `data<5`); and
+ the limits on the colormap are reassigned accordingly (i.e., from 0 to 4).
<!-- #endregion -->

```python jupyter={"source_hidden": true}
LOCAL_PATH = Path('..') / 'assets' / 'OPERA_L3_DIST-ALERT-HLS_T10TEM_20220815T185931Z_20220817T153514Z_S2A_30_v0.1_VEG-DIST-STATUS.tif'
data = rio.open_rasterio(LOCAL_PATH)
data = data.rename({'x':'easting', 'y':'northing', 'band':'band'}).squeeze()
```

```python jupyter={"source_hidden": true}
veg_dist_status = data.where((data>0)&(data<5))
image_opts.update(crs=data.rio.crs)
```

```python jupyter={"source_hidden": true}
layout_opts.update(
# title=f"VEG_DIST_STATUS",
clim=(0,4),
colorbar_opts={'ticker': FixedTicker(ticks=[0, 1, 2, 3, 4])}
)
```

```python
veg_dist_status.hvplot.image(**image_opts).opts(**layout_opts) * base
```

<!-- #region jupyter={"source_hidden": true} -->
This continuous colormap does not highlight the features in this plot particularly well. A better choice would be a *categorical* colormap. We'll see how to achieve this in the next notebook (with the OPERA DSWx data products).
<!-- #endregion -->

---
Loading

0 comments on commit 2605c97

Please sign in to comment.