Adding first gis module to perform geospatial operations, notebook builds #61
… hydrological analysis
@Zeitsperre @TC-FF
I mentioned this to @TC-FF in my other pull request, but I haven't set things up to translate the functions and classes of the library. The reason is that docstrings can change much more often than the rest of the documentation. This means that Quickly going over the documentation on @TC-FF Would you want to add a new section in
Sorry, I was not super clear: I meant the documentation for the examples based on the notebooks. I agree that translating docstrings, classes, and functions is probably too difficult to maintain. But for notebooks and .rst translation, I agree that a new section in
@Zeitsperre Is it part of dev = [...] or docs = [ ... ]? Thanks!
So we listed
If it's exclusively for an
It is indeed to use extra functionalities of
That looks good to me!
@Zeitsperre As an alternative, I was thinking that I could:
In essence, this would reduce memory consumption for the RTD build while hiding the fact that the data is precalculated. We assume that users will have more than enough memory (8+ GB) when actually testing the notebook in their own environment. As I'm not that familiar with the RTD build, do you know if there is a way (cell tag, metadata, etc.) to hide a cell when building the docs while still allowing its execution? Or alternatively, do you have a better idea than this? Thanks! N.B.: The same xdatasets query runs fine on a GitHub runner (7 GB RAM). I'm curious to see what the specs are for the RTD runners.
I think the total memory allocation for each builder is around 4GB (but it can be bumped up if you send the ReadTheDocs maintainers a message from the admin page of xhydro on RTD with some justifications). Given that it's a completely free service, I don't like the idea of asking for more memory too often (we've been granted more memory for a few projects at Ouranos). If it isn't important to run the notebook on every change to
This approach is a bit problematic since it means that the notebook is never checked at all. An alternative approach could look like:
There are a lot of ways to go about doing this. I've implemented similar approaches in
@Zeitsperre Looking ahead, we might come across other memory-intensive notebooks, in which case switching to GitHub Workflows would make sense. However, since this might involve quite a bit of work, we could maybe start by checking out your second suggestion for now:
So, if I get it right, when we run the notebook-checking build on GitHub, it goes through all the notebooks. Does this imply that the notebooks whose outputs are not removed by nbstripout (all notebooks except the xagg-dependent one, for now) would run twice: once in the GitHub checking build and again during the ReadTheDocs build (assuming the GitHub check completed successfully)?
@Zeitsperre Coming back from the holidays, we can set this up properly!
My earlier comments are still valid, but they are minor and should be easy to address. @sebastienlanglois, when do you think you'll have time to take a look?
I'm taking advantage of the school break week to finish this! It will be done by the end of the week without fail!
@Zeitsperre @RondeauG
@sebastienlanglois The kernelspec is a weird artifact that notebooks use. I have a hook in a few other projects that simply removes that field (since if it's wrong, that causes errors; no problem if missing). I'll see if I can find the config.
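The hook configuration mentioned above is not shown in this thread. As a rough illustration of what removing that field could look like, here is a minimal sketch using only the standard library. The function name `strip_kernelspec` is hypothetical, and the sketch assumes the notebook is a regular `.ipynb` JSON file with the kernelspec stored under the top-level `metadata` key.

```python
import json
from pathlib import Path


def strip_kernelspec(notebook_path):
    """Remove metadata.kernelspec from a notebook file.

    Hypothetical helper (not the hook referred to in the thread).
    Returns True if the file was modified, False if there was nothing to remove.
    """
    path = Path(notebook_path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    metadata = notebook.get("metadata", {})
    if "kernelspec" not in metadata:
        return False
    # Drop only the kernelspec entry; all other metadata is left untouched.
    del metadata["kernelspec"]
    path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

A function like this could be called on each notebook before committing, for example from a local pre-commit hook script.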
The last thing we need to fix here is that the notebooks need to be re-run with their outputs saved. ReadTheDocs can't handle running the notebook examples, but if the outputs are saved in the notebooks, it can at least parse and display them.
Thanks! That's what I figured from our previous discussions, so I was surprised that it didn't work out. Now I understand that the culprit was the kernelspec property in the notebook. I will push the notebook with outputs in a few minutes and hopefully, we are done with this PR!
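Since the docs build depends on the outputs being saved in the notebooks, a quick local check before pushing could be sketched as follows. This is only an assumption about how one might verify it, not something set up in this PR; `cells_without_outputs` is a hypothetical helper that reads the `.ipynb` JSON directly.

```python
import json
from pathlib import Path


def cells_without_outputs(notebook_path):
    """Return the indices of non-empty code cells that have no saved outputs.

    Hypothetical helper for illustration; it only inspects the notebook JSON.
    """
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows source to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip() and not cell.get("outputs"):
            missing.append(index)
    return missing
```

Note that cells which legitimately produce no output (imports, assignments) are also reported, so the result is a hint to review rather than a strict pass/fail test.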
A few very minor suggestions, but this looks ready to me! Well done :)
ESMF_VERSION := $(shell cat $(ESMFMKFILE) | grep "ESMF_VERSION_STRING=" | awk -F= '{print $$2}' | tr -d "'")
install-esmpy: clean ## install esmpy from git based on installed ESMF_VERSION
	pip install git+https://github.com/esmf-org/esmf.git@v$(ESMF_VERSION)\#subdirectory=src/addon/esmpy

install: install-esmpy ## install the package to the active Python's site-packages
@Zeitsperre We can leave this here as an additional security, but this should not be required anymore (yay!)
- If xESMF is badly installed or initiated, xscen will simply deactivate the functions using it. xhydro/xdatasets do not use it at all, for now.
- esmf/esmpy 8.6.0 has been released and should have fixed that bug too.
If we want to, we can remove the install-esmpy step from install. This just means that we don't install the GitHub repo version when we run $ make install. I can see having that disabled as being beneficial for users on systems that don't have grep, awk, and tr installed (PowerShell and Command Prompt, probably).
Let's leave that for another PR.
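For reference, the version lookup done with grep, awk, and tr in the Makefile snippet above could also be written in plain Python, which would sidestep the missing-tools concern on Windows shells. The following is a minimal sketch, not something proposed in this PR: the function names are hypothetical, and it assumes the file pointed to by ESMFMKFILE contains a line of the form `ESMF_VERSION_STRING=<version>`, as the Makefile pipeline implies.

```python
from pathlib import Path


def read_esmf_version(mkfile_path):
    """Return the value of ESMF_VERSION_STRING from an esmf.mk file, or None.

    Hypothetical helper mirroring the grep/awk/tr pipeline from the Makefile.
    """
    for line in Path(mkfile_path).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ESMF_VERSION_STRING":
            # Like `tr -d "'"`, drop single quotes around the version.
            return value.strip().replace("'", "")
    return None


def esmpy_pip_url(version):
    """Build the pip install target used in the Makefile for a given version."""
    return (
        "git+https://github.com/esmf-org/esmf.git"
        f"@v{version}#subdirectory=src/addon/esmpy"
    )
```

In practice the path would come from the ESMFMKFILE environment variable, and the resulting URL would be passed to pip install.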
Just wanted to point out that xdatasets has xesmf as a dependency because of xagg. I had initially created a spinoff xagg library with no xesmf dependency (because xesmf is used for regridding but not for spatial averaging in xagg, which is the only thing required for xdatasets), but we now have xagg as a dependency.
xesmf is required for the notebooks but might not be for the tests.
Co-authored-by: RondeauG <38501935+RondeauG@users.noreply.github.com>
@sebastienlanglois There was a mix-up in the triggers for the tests, but it's fine now. You can merge when you're ready!
@Zeitsperre @RondeauG I know this PR is closed but just wanted to point out that
Pull Request Checklist:
number) and pull request (:pull:number) has been added.
What kind of change does this PR introduce?
This PR adds a GIS module for geospatial operations that are common in hydrology, such as watershed delineation, watershed properties extraction, etc. It adapts the work that's been done in ravenpy while also adding some new functionalities.
Watershed Delineation
Physiographic Variable (or others) Extraction
Does this PR introduce a breaking change?
No
Other information:
This PR also integrates the changes from #65 and #68