Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: handle incompatible coords in merge_cubes #148

Merged

Conversation

clausmichele
Copy link
Member

@LukeWeidenwalker, in load_stac, stackstac loads many additional metadata as data variables/coordinates. It is an advantage when we want to have a complete overview on the dataset but in this case creates an issue when trying to merge different datasets, An example is in this notebook I recently created:
https://github.com/clausmichele/openeo-python-client/blob/add_stac_notebook/examples/notebooks/Client_Side_Processing/client_side_processing_STAC_data_fusion.ipynb

This fix removes the unnecessary coordinates if the merge fails and tries to merge the datacubes again. I spent a lot of time trying to find a more elegant solution, but I couldn't.

@codecov
Copy link

codecov bot commented Aug 2, 2023

Codecov Report

Merging #148 (a3d3559) into main (dfee6f6) will not change coverage.
Report is 1 commits behind head on main.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #148   +/-   ##
=======================================
  Coverage   78.80%   78.80%           
=======================================
  Files          27       27           
  Lines        1175     1175           
=======================================
  Hits          926      926           
  Misses        249      249           
Files Changed Coverage Δ
...cesses_dask/process_implementations/cubes/merge.py 96.10% <ø> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@LukeWeidenwalker LukeWeidenwalker changed the title Fix case caused by load_stac fix: handle incompatible coords in merge_cubes Aug 3, 2023
@LukeWeidenwalker
Copy link
Contributor

Ah, thanks for reporting this - I had a similar problem in #147, but there I could just use xr.merge, which doesn't seem to have this problem. I'm still a bit confused on why this wouldn't work though, I would have imagined combine_by_coords to just create the missing coord - do you think this is something worth raising with xarray?

Regardless, happy to go for a workaround like you propose - if you could please also add a test case to cover this failure mode, that would be much appreciated - thanks!

@clausmichele
Copy link
Member Author

I tried also to write the two datasets to netCDFs, read them again with xArray and then it worked fine, which is so strange.
I agree that it would be good to create a minimal working example and open an issue for xarray.

@clausmichele
Copy link
Member Author

@LukeWeidenwalker the special case causing issues is when there's an additional coordinate which has different values in the two cubes we want to merge. I found out that adding compat="override" solves the issue without the need to remove all the coordinates that are not in common. I also added a test for this!

@LukeWeidenwalker
Copy link
Contributor

Ah, good find - thanks for the fix!

@LukeWeidenwalker LukeWeidenwalker merged commit f861d49 into Open-EO:main Aug 11, 2023
5 checks passed
@clausmichele clausmichele deleted the fix_merge_cubes_conflict_vars branch November 2, 2023 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants