Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem generating weights with DAYMET data #16

Open
mewieczo opened this issue Aug 26, 2021 · 11 comments
Open

Problem generating weights with DAYMET data #16

mewieczo opened this issue Aug 26, 2021 · 11 comments

Comments

@mewieczo
Copy link

DAYMET data has lat and long that is 2 dimensional and x and y that are 1 dimensional. I dropped lat and long, and when running pixel overlap I get the following error:

ValueError: cannot reindex or align along dimension 'lon' because the index has duplicate values

Any suggestions or help is appreciated.
Mike

@rsignell-usgs
Copy link

@ks905383: Could it be that xagg currently only works for gridded datasets that have 1D lon,lat coordinates?

The DAYMET data has 1D (x,y) Lambert Conformal Conic projected coordinates, so the (lon,lat) arrays are 2D.

@ks905383
Copy link
Owner

Yeah - it's currently only tested / written for rectangular grids.

Expanding processing to work with non-rectangular grids should not be the most complicated thing in the world, and might be achievable mainly through expanding the functionality of create_raster_polygons. Maybe let's keep this open until that's done.

In the meantime, I'll see about putting in a helpful error message if a non-rectangular grid is detected, or one where the coordinates aren't the dimensions etc.

@rsignell-usgs
Copy link

rsignell-usgs commented Aug 31, 2021

@ks905383, okay, it's not just user error. Good! 😄
Do you know if the rasterstats package might work for the use case of @mewieczo?

@rmcd-mscb
Copy link

Here is an example of how I generated cells for Daymet grids. It might be of some help.

https://github.com/nhm-usgs/onhm-fetcher-parser/blob/master/pkg/CalcDaymetWeights.py

@ks905383
Copy link
Owner

@ks905383, okay, it's not just user error. Good! 😄
Do you know if the rasterstats package might work for the use case of @mewieczo?

hm... it might. I personally haven't used rasterstats, but their method seems relatively robust (any pixel within the polygon boundary is included, either by center or by any part of the pixel), as long as the crs'es are well-defined? You won't get the area-averaging functionality of xagg, but depending on how much accuracy you need (i.e. how big are your pixels relative to your polygons, how bad is it if a pixel that only barely touches a polygon is included in the weighting) you might not need it anyways.

@ks905383
Copy link
Owner

Here is an example of how I generated cells for Daymet grids. It might be of some help.

https://github.com/nhm-usgs/onhm-fetcher-parser/blob/master/pkg/CalcDaymetWeights.py

oh nice, thanks for that - I'll take a look

@rmcd-mscb
Copy link

@ks905383 would be interested in helping add capability for processing structured non-uniform grids to xagg Kevin. I've got code to generate lat_bnds and lon_bnds. Wondering if you would have time and interest to meet and discuss how it would make sense to you to add that capability? Cheers Kevin.

@ks905383
Copy link
Owner

Hey @rmcd-mscb , apologies for the long radio silence here.

That actually sounds great - we could chat later this week or next week and see about integrating it?

@rmcd-mscb
Copy link

Thanks @ks905383 - I sent a calendar invite.

@ks905383
Copy link
Owner

ks905383 commented Oct 29, 2021

Hopefully, all major processing changes would only have to be in create_raster_polygons and get_bnds .

The remaining issues beyond that are largely semantic, in particular related to hardcoded calls to lat, lon. This may be fixable through implementing cf_xarray in xagg (Issue #1). Here's a quick (non-comprehensive) list of parts of the code where this might cause issues:

In xa.core

  • Line 157 needs to be verified to make sure it still manages to create the bounding box without adding extra dimensions
  • Line 172 should work fine without modification (broadcasting out the bounds), but there may be dimensional issues
  • Line 180 probably needs major changes
  • Generally, references to .lat and .lon may be problematic. This is maybe where cf_xarray may be needed to be incorporated
    • All the .stack(loc=('lat','lon')) calls, for example
    • Or the block of code starting in Line 205 about assigning lat/lon values to the geopandas pixel polygons
  • In process_weights(): xesmf can deal with non-rectangular grids, so this may be fine as is?

In xa.aux

  • xa.fix_ds() 's references to lat , lon need to be double-checked, specifically the sortby(ds.lat) /lon calls
  • xa.get_bnds() of course...
  • xa.subset_find()
    • subset_find is mainly used in optimization, to do calculations on just a geographic subset of the data. If it doesn't instantly work with non-rectangular grids, can just skip the subsetting if it's a non-rectangular grid with a warning.
    • (but you can feed stacked loc dim'ed arrays into it, so maybe that's easy to deal with?)

@rsignell-usgs
Copy link

rsignell-usgs commented Feb 24, 2022

I was thinking about using xagg with DAYMET again, and thinking, "okay, it doesn't have 1D lon,lat coordinates, but it does have 1D x,y, coordinates" (lambert conformal projection, see this notebook), so I'm wondering if I could just:

  • modify xagg to recognize x,y
  • supply geopandas polygons in the same projection
  • change the parts of the code that assume lon/lat (EPSG:4326) units to instead read the input grid and polygon CRS and then convert to EPSG:693{1,2,3}

I'm thinking the real limitation is the algorithm that needs 1D coordinates....

What would be the problems with this approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants