Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implementation of resample_spatial to reproject values #103

Merged
merged 34 commits into from
Aug 17, 2023

Conversation

SerRichard
Copy link
Collaborator

@SerRichard SerRichard commented May 5, 2023

Dask implementation of resample spatial using the odc.algo reproject function. I've tried to handle some trickier parts (like infering the affine transformation) with some helper utils. Ready for review when all boxes are ticked!

  • Reprojection
  • Resampling
  • Unit tests

@SerRichard SerRichard changed the title basic implementation of resample_spatial to reproject values Implementation of resample_spatial to reproject values May 5, 2023
@LukeWeidenwalker LukeWeidenwalker changed the title Implementation of resample_spatial to reproject values Draft: Implementation of resample_spatial to reproject values May 10, 2023
@codecov
Copy link

codecov bot commented May 15, 2023

Codecov Report

Merging #103 (57af9be) into main (9f184e5) will increase coverage by 0.12%.
Report is 1 commits behind head on main.
The diff coverage is 83.87%.

@@            Coverage Diff             @@
##             main     #103      +/-   ##
==========================================
+ Coverage   78.82%   78.95%   +0.12%     
==========================================
  Files          27       28       +1     
  Lines        1176     1207      +31     
==========================================
+ Hits          927      953      +26     
- Misses        249      254       +5     
Files Changed Coverage Δ
...cesses_dask/process_implementations/cubes/utils.py 86.66% <ø> (ø)
...neo_processes_dask/process_implementations/math.py 59.45% <0.00%> (-0.41%) ⬇️
...ses_dask/process_implementations/cubes/resample.py 86.66% <86.66%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@LukeWeidenwalker
Copy link
Contributor

Just to leave it here too: Since opendatacube/odc-geo#88 came out, it would probably be good to see whether this already does this!

@SerRichard
Copy link
Collaborator Author

Just to leave it here too: Since opendatacube/odc-geo#88 came out, it would probably be good to see whether this already does this!

Switched implementation to use the odc.reproject function, much less code to maintain now!

@ValentinaHutter One thing I'm not sure on. Odc reproject will automatically resample the raster to the provided projection. In the case a user provides no resolution to resample, and only wants a reproject, should we force the resolution of the input cube? Or just use accept resolution as none? Not sure what the preference would be.

@SerRichard
Copy link
Collaborator Author

@ValentinaHutter One other concern, we've no way or reliably enforcing the convention of the user specifying the target resolution in the units of the output crs. Scao recommended this could be enforced with accepted rangers for different CRsystems. Example, if the user incorrectly specifies 0.0009 as their output resolution, and the crs is in metres, this will run for longer. Not sure how much longer on dask, so may not be a huge concern.

Copy link
Collaborator

@ValentinaHutter ValentinaHutter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

tests/test_resample.py Show resolved Hide resolved
@ValentinaHutter
Copy link
Collaborator

Just to leave it here too: Since opendatacube/odc-geo#88 came out, it would probably be good to see whether this already does this!

Switched implementation to use the odc.reproject function, much less code to maintain now!

@ValentinaHutter One thing I'm not sure on. Odc reproject will automatically resample the raster to the provided projection. In the case a user provides no resolution to resample, and only wants a reproject, should we force the resolution of the input cube? Or just use accept resolution as none? Not sure what the preference would be.

In https://processes.openeo.org/#resample_spatial the default resolution=0 and says that it Doesn't change the resolution by default (0). So, I think this means, we need to keep the input resolution?

@ValentinaHutter
Copy link
Collaborator

@ValentinaHutter One other concern, we've no way or reliably enforcing the convention of the user specifying the target resolution in the units of the output crs. Scao recommended this could be enforced with accepted rangers for different CRsystems. Example, if the user incorrectly specifies 0.0009 as their output resolution, and the crs is in metres, this will run for longer. Not sure how much longer on dask, so may not be a huge concern.

Hm, it sounds quite complicated to check this with different ranges for every crs, what do you think? Also, the definition says, that the resolution is Specified in the units of the target projection. So, the user should be aware of this, and we can probably assume, that it is in the required unit. Does data.odc.reproject also assume, that the unit of the resolution matches the output crs?

@LukeWeidenwalker LukeWeidenwalker self-requested a review May 31, 2023 12:58
@LukeWeidenwalker LukeWeidenwalker self-assigned this May 31, 2023
@SerRichard
Copy link
Collaborator Author

Using odc.geo 0.4.1, I was unable to reproduce what we saw last month one both fake and real (sen2like output) data. Fixed odc minimum version to 0.4.1, fixed tests and accessor import for odc, and reviewed output arrays with @LukeWeidenwalker!

@LukeWeidenwalker LukeWeidenwalker changed the title Draft: Implementation of resample_spatial to reproject values feat: Implementation of resample_spatial to reproject values Aug 16, 2023
Copy link
Contributor

@LukeWeidenwalker LukeWeidenwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Couple of minor comments, but once these are resolved we should be able to finally merge this!

SerRichard and others added 8 commits August 17, 2023 09:02
Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com>
Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com>
Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com>
Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com>
…eo-processes-dask into DP-277-add-resample-spatial
@LukeWeidenwalker
Copy link
Contributor

LukeWeidenwalker commented Aug 17, 2023

Ahh, two more things:

  • the spec allows the align parameter, we should at least accept that and log a warning that it doesn't do anything currently edit: we'll just remove this from the spec since we don't support it instead!
  • the spec needs to be updated in openeo-processes

@SerRichard
Copy link
Collaborator Author

spec added in eodcgmbh/openeo-processes

@LukeWeidenwalker LukeWeidenwalker merged commit 7490bc3 into main Aug 17, 2023
5 checks passed
@LukeWeidenwalker LukeWeidenwalker deleted the DP-277-add-resample-spatial branch August 17, 2023 09:16
@clausmichele
Copy link
Member

@SerRichard I have a question: after putting so much work into this process implementation, did you ever test it with an openEO process graph? Because currently it is not imported here https://github.com/Open-EO/openeo-processes-dask/blob/main/openeo_processes_dask/process_implementations/cubes/__init__.py and therefore can't be used.

@clausmichele
Copy link
Member

clausmichele commented Nov 8, 2023

(Edit: my supposition was wrong, see comments below) Additionally, not exposing a parameter like align just because it's not supported it's a bad idea in my opinion.
First of all since it's a parameter with a default value, which could be added by default added to the process graph by the clients (Python client adds it).

@soxofaan does the Python client always add to the process graph all the optional parameters with default values?

@soxofaan
Copy link
Member

soxofaan commented Nov 8, 2023

@soxofaan does the Python client always add to the process graph all the optional parameters with default values?

no we don't always automatically add the default value of unset parameters, it depends a bit on details of DataCube method.

In case of resample_spatial however, we do always add the align parameter with default value "upper-left"

@m-mohr
Copy link
Member

m-mohr commented Nov 9, 2023

Additionally, not exposing a parameter like align just because it's not supported it's a bad idea in my opinion.

Actually, it's the opposite. It's meant to be that if something is not supported, it's not exposed. Process definitions should reflect the implementation status.

In case of resample_spatial however, we do always add the align parameter with default value "upper-left"

I think this should be changed. It should only be sent if not the default value and supported by the back-end.

@clausmichele
Copy link
Member

Thanks @m-mohr, so as you said, the Python Client needs to be adjusted.

@soxofaan
Copy link
Member

soxofaan commented Nov 9, 2023

@jdries
Copy link

jdries commented Nov 9, 2023

IMO, in case of an argument like 'align', where a backend may not support multiple options but does support one of them, why would it not be good to put this in the backend process spec? Listing the single-value in the enum would be a way to convey important information in a standardized manner, and can solve the issue with defaults.

@m-mohr
Copy link
Member

m-mohr commented Nov 9, 2023

Yes, you are right. If one value is supported, the backend should change the schema accordingly so that it requires exactly that value. If a parameter is not supported at all, it should be removed.

That still would require some kind of change in the Python client, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants