
Merge pull request ScienceCore#86 from jnywong/update-environment
Update environment
jnywong authored Jul 9, 2024
2 parents 6574496 + e1ea8a9 commit 1ca98d0
Showing 10 changed files with 713 additions and 24 deletions.
92 changes: 92 additions & 0 deletions assessment/Assessment-form.md
@@ -0,0 +1,92 @@

# Assessment form for the SciPy tutorial: preliminary questions

### What is the primary difference between geographic and projected coordinate reference systems?

- Geographic coordinate reference systems use latitude and longitude while projected coordinate reference systems use XY coordinates.
- Geographic coordinate reference systems are three-dimensional while projected coordinate reference systems are two-dimensional.
- Projected coordinate reference systems can only be used in the southern hemisphere.
- Geographic coordinate reference systems use the prime meridian as the origin while projected coordinate reference systems use the equator.

### What is the reference line for latitude in the geographic coordinate system?

- The prime meridian
- The equator
- The North Pole
- The South Pole

### Which statement is true about lines of longitude?

- They run parallel to the equator.
- They are assigned positive values in the southern hemisphere.
- They converge at the poles.
- The distance between lines of longitude is the same at all latitudes.

### What is the origin point for the Universal Transverse Mercator (UTM) coordinate reference system?

- The North Pole
- The South Pole
- The equator at a specific longitude
- Greenwich, England

### Which file is mandatory for a shapefile to represent spatial vector data?

- .prj
- .xml
- .shp
- .cpg

### What type of data does GeoJSON format encode?

- Raster data
- Vector data
- Both raster and vector data
- Metadata for geospatial images

---

### NASA EarthData

**Have you accessed NASA EarthData Cloud before this tutorial?**

- Yes
- No

**How difficult do you find the process of accessing data from NASA EarthData Cloud?**

- Very easy: I could access and retrieve data without any issues.
- Somewhat easy: I could access data but I encountered minor difficulties.
- Challenging: I encountered significant difficulties or was unable to access the data.

**Have you used NASA EarthData products in your research or projects before this tutorial?**

- Yes
- No

**Do you feel confident in extracting and processing data from NASA EarthData Cloud after this tutorial?**

- Very confident: I can extract and process data independently and efficiently.
- Somewhat confident: I can extract and process data but may need occasional assistance.
- Not confident: I still feel unsure about extracting and processing data on my own.

---

### Tutorial

**Please indicate your level of agreement with the following statements regarding the delivery of the tutorial:**

| | Strongly agree | Agree | Neutral | Disagree | Strongly disagree |
|----------------------------------------------------|----------------|-------|---------|----------|-------------------|
| The materials provided for the tutorial were adequate. | | | | | |
| The activities were clear and easy to follow. | | | | | |
| The examples were effective in helping me understand the concepts of cloud-based data analysis. | | | | | |
| The theoretical concepts were sufficiently explained. | | | | | |
| There was enough hands-on practice. | | | | | |
| There was a good balance between theory and hands-on activities. | | | | | |
| The time allocated to each task was sufficient. | | | | | |

**Please write down any additional suggestions or observations you may have.**

_________________________________________________________________________________________________

Link to the form: http://tiny.cc/ClimateriskScipy2024
9 changes: 9 additions & 0 deletions book/03_Geospatial_data_files/geographic_data_formats.md
@@ -60,6 +60,15 @@ A GeoTIFF file extension contains geographic metadata that describes the actual
Geodata is drawn from vector formats on a map, and the geodata is converted to the specified output projection of the map if the projection in the source file differs. Some of the vector and raster formats typically supported by a GeoTIFF online viewer include: asc, gml, gpx, json, kml, kmz, mid, mif, osm, tif, tab, map, id, dat, gdbtable, and gdbtablx.^2^
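Because the georeferencing lives inside the file itself, it can be inspected programmatically. Below is a minimal sketch using `rasterio` (the filename `example.tif` is a placeholder, not a file from this tutorial):

```python
import rasterio

# Open a GeoTIFF and inspect its embedded geographic metadata
with rasterio.open("example.tif") as ds:
    print(ds.crs)        # coordinate reference system of the raster
    print(ds.bounds)     # spatial extent: (left, bottom, right, top)
    print(ds.transform)  # affine transform mapping pixel indices to map coordinates
    print(ds.count, ds.width, ds.height)  # bands, columns, rows
```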


## A Note on the Ordering of Coordinates in Python GIS Libraries

In Python, 2D raster data are represented as matrices. Typically, the x-dimension of the array corresponds to `easting` or `longitude`, while the y-dimension corresponds to `northing` or `latitude`. However, the convention for listing the dimensions of matrices is to list the number of rows first and columns second. This means a matrix with dimensions (n, m) has `n` rows (latitude bins) and `m` columns (longitude bins). This convention is reflected when querying the shape of a `numpy` array using the `shape` attribute.

On the other hand, vector shapes, such as Shapely `Points`, follow the (longitude, latitude) notation. For example, the coordinates for Livingston, TX, are specified as `livingston_tx = Point(-95.09, 30.69)`. Similarly, `Polygon` bounds are specified in the order `(longitude_0, latitude_0, longitude_1, latitude_1)`.

It is important to keep these coordinate orderings in mind when working with geospatial data, as they can easily become a source of confusion or errors.
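A short sketch restating these two conventions side by side (using the Livingston, TX point from above):

```python
import numpy as np
from shapely.geometry import Point

# A raster covering 3 latitude bins (rows) and 4 longitude bins (columns):
raster = np.zeros((3, 4))
print(raster.shape)  # (3, 4) -> (rows, columns), i.e. (latitude, longitude)

# Shapely geometries use the opposite ordering: (longitude, latitude)
livingston_tx = Point(-95.09, 30.69)
print(livingston_tx.x, livingston_tx.y)  # -95.09 (longitude), 30.69 (latitude)
print(livingston_tx.bounds)  # bounds follow (longitude_0, latitude_0, longitude_1, latitude_1)
```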


## References

1. https://www.geographyrealm.com/geodatabases-explored-vector-and-raster-data/
37 changes: 23 additions & 14 deletions book/04_NASA_Earthdata/0_Initial_Setup.md
@@ -1,31 +1,40 @@
# Initial Setup

## 1. Accessing the 2i2c Hub

To log in to the 2i2c Hub, follow these simple steps:

* **Head to the Hub:** Visit this link to access the 2i2c Hub: https://climaterisk.opensci.2i2c.cloud/.

* **Log in with your Credentials:**

**Username:** Feel free to choose any username you like. We recommend your GitHub username, which helps avoid conflicts with others.

**Password:** You'll receive the password the day before the tutorial.

![2i2c_login](../assets/2i2c_login.png)

* **Logging In:**

The login process might take a few minutes, especially if a new virtual workspace needs to be created just for you.

![start_server2](../assets/start_server_2i2c.png)

* **What to Expect:**

By default, logging into https://climaterisk.opensci.2i2c.cloud will automatically clone https://github.com/ScienceCore/scipy-2024-climaterisk and change to that directory. If the login is successful, you will see the following screen.

![work_environment_jupyter_lab](../assets/work_environment_jupyter_lab.png)

Finally, if you see the JupyterLab screen above, you are ready to start working.

**Note:** Any files you work on will be saved between sessions as long as you use the same username.

## 2. Using NASA's Earthdata

Expand Down
130 changes: 122 additions & 8 deletions book/07_Wildfire_analysis/Retrieving_Disturbance_Data.md
@@ -5,7 +5,7 @@ jupyter:
extension: .md
format_name: markdown
format_version: '1.3'
jupytext_version: 1.16.1
kernelspec:
display_name: Python 3 (ipykernel)
language: python
@@ -31,6 +31,7 @@ from osgeo import gdal
from rasterio.merge import merge
import rasterio
import contextily as cx
import folium

# data wrangling imports
import pandas as pd
@@ -93,19 +94,23 @@ print(f"Number of tiles found intersecting given AOI: {len(results)}")
Let's load the search results into a pandas dataframe

```python
def search_to_df(results, layer_name='VEG-DIST-STATUS'):

    times = pd.DatetimeIndex([result['properties']['datetime'] for result in results])  # parse the timestamp of each result
    data = {'hrefs': [value['href'] for result in results for key, value in result['assets'].items() if layer_name in key],  # parse out links only to the DIST-STATUS data layer
            'tile_id': [value['href'].split('/')[-1].split('_')[3] for result in results for key, value in result['assets'].items() if layer_name in key]}

    # Construct pandas dataframe to summarize granules from search results
    granules = pd.DataFrame(index=times, data=data)
    granules.index.name = 'times'

    return granules
```

```python
granules = search_to_df(results)
granules.head()
```

```python
@@ -213,3 +218,112 @@ plt.xlabel('Date', size=15)
plt.xticks([datetime(year=2023, month=8, day=1) + timedelta(days=6*i) for i in range(11)], size=14)
plt.title('2023 Dadia forest wildfire detected extent', size=14)
```

### Great Green Wall, Sahel Region, Africa

```python
ndiaye_senegal = Point(-16.09, 16.50)

# We will search data through the product record
start_date = datetime(year=2022, month=1, day=1)
stop_date = datetime.now()
```

```python
# Plot the search location in folium as a sanity check
m = folium.Map(location=(ndiaye_senegal.y, ndiaye_senegal.x), control_scale=True, zoom_start=9)
radius = 5000  # meters
folium.Circle(
    location=[ndiaye_senegal.y, ndiaye_senegal.x],
    radius=radius,
    color="red",
    stroke=False,
    fill=True,
    fill_opacity=0.6,
    opacity=1,
    popup=f"{radius} m radius",
    tooltip="5 km radius",
).add_to(m)

m
```

```python
# We open a client instance to search for data, and retrieve relevant data records
STAC_URL = 'https://cmr.earthdata.nasa.gov/stac'

# Setup PySTAC client
# LPCLOUD refers to the LP DAAC cloud environment that hosts earth observation data
catalog = Client.open(f'{STAC_URL}/LPCLOUD/')

collections = ["OPERA_L3_DIST-ANN-HLS_V1"]

# Search the full date range defined above (January 2022 to the present)
date_range = f'{start_date.strftime("%Y-%m-%d")}/{stop_date.strftime("%Y-%m-%d")}'

opts = {
'bbox' : ndiaye_senegal.bounds,
'collections': collections,
'datetime' : date_range,
}

search = catalog.search(**opts)
results = list(search.items_as_dicts())
print(f"Number of tiles found intersecting given AOI: {len(results)}")
```

```python
def urls_to_dataset(granule_dataframe):
    '''Takes a dataframe of OPERA tile URLs (as produced by search_to_df) and returns
    an xarray dataset with dimensions latitude, longitude and time'''

    dataset_list = []

    for i, row in granule_dataframe.iterrows():
        with rasterio.open(row.hrefs) as ds:
            # extract the EPSG code from the CRS string
            crs = str(ds.crs).split(':')[-1]

            # extract the image spatial extent (xmin, ymin, xmax, ymax)
            xmin, ymin, xmax, ymax = ds.bounds

            # the x and y resolution of the image is available in image metadata
            x_res = np.abs(ds.transform[0])
            y_res = np.abs(ds.transform[4])

            # read the data
            img = ds.read()

            # Ensure img has three dimensions (bands, y, x)
            if img.ndim == 2:
                img = np.expand_dims(img, axis=0)

            lon = np.arange(xmin, xmax, x_res)
            lat = np.arange(ymax, ymin, -y_res)

            lon_grid, lat_grid = np.meshgrid(lon, lat)

            da = xr.DataArray(
                data=img,
                dims=["band", "y", "x"],
                coords=dict(
                    lon=(["y", "x"], lon_grid),
                    lat=(["y", "x"], lat_grid),
                    time=i,
                    band=np.arange(img.shape[0])
                ),
                attrs=dict(
                    description="OPERA DIST ANN",
                    units=None,
                ),
            )
            # the .rio accessor used here is registered by the rioxarray package
            da.rio.write_crs(crs, inplace=True)

            dataset_list.append(da)
    return xr.concat(dataset_list, dim='time').squeeze()

granules = search_to_df(results)  # rebuild the granule dataframe from the Senegal search results
dataset = urls_to_dataset(granules)
```
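As a quick sanity check on the assembled dataset, a brief inspection like the following can help (a sketch; variable names follow the cell above, and the exact dimensions depend on the `squeeze`):

```python
print(dataset.dims)         # expected: ('time', 'y', 'x') once the singleton band dimension is squeezed out
print(dataset.sizes)        # number of time steps and pixels along each axis
print(dataset.time.values)  # acquisition timestamps parsed from the granule metadata
```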

