Testing data fetching via pooch #62

Merged
merged 17 commits into from
Jan 9, 2024
5 changes: 5 additions & 0 deletions .pre-commit-config.yaml
Expand Up @@ -47,6 +47,11 @@ repos:
- id: flake8
additional_dependencies: [ 'flake8-alphabetize', 'flake8-rst-docstrings' ]
args: [ '--config=.flake8' ]
- repo: https://github.com/numpy/numpydoc
rev: v1.6.0
hooks:
- id: numpydoc-validation
exclude: 'tests|docs/conf.py'
- repo: https://github.com/keewis/blackdoc
rev: v0.3.9
hooks:
14 changes: 14 additions & 0 deletions CHANGES.rst
Expand Up @@ -10,6 +10,20 @@ Contributors to this version: Trevor James Smith (:user:`Zeitsperre`), Thomas-Ch
New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* Added French language support to the documentation. (:issue:`53`, :pull:`55`).
* Added a new set of functions to support creating and updating `pooch` registries, caching testing datasets from `hydrologie/xhydro-testdata`, and ensuring that testing datasets can be loaded into temporary directories.
* `xhydro` is now configured to use `pooch` to download and cache testing datasets from `hydrologie/xhydro-testdata`. (:pull:`62`).

Breaking changes
^^^^^^^^^^^^^^^^
* Added `pooch` as an installation dependency. (:pull:`62`).

Internal changes
^^^^^^^^^^^^^^^^
* Added a new module for testing purposes: `xhydro.testing.helpers` with some new functions. (:pull:`62`):
* `generate_registry`: Parses data found in the package (`xhydro.testing.data`) and adds it to `registry.txt`.
* `load_registry`: Loads the installed (or custom) registry and returns it as a dictionary.
* `populate_testing_data`: Fetches the registry and optionally caches files at a different location (helpful for `pytest-xdist`).
* Added a `pre-commit` hook (`numpydoc`) to ensure that `numpy` docstrings are formatted correctly. (:pull:`62`).
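
As a rough illustration of the registry workflow listed above, the sketch below hashes every file in a data folder, writes one `name sha256:digest` line per file, and reads the result back into a dictionary. It uses only the standard library. The function names `write_registry` and `read_registry` are hypothetical and are not the signatures of `generate_registry` or `load_registry`; the line layout is assumed to follow the plain-text registry format that `pooch` reads, and file names are assumed to contain no whitespace.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_registry(data_dir: Path, registry_file: Path) -> None:
    """Write one '<relative/path> sha256:<digest>' line per file (hypothetical helper)."""
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        name = path.relative_to(data_dir).as_posix()
        lines.append(f"{name} sha256:{file_sha256(path)}")
    registry_file.write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_registry(registry_file: Path) -> dict[str, str]:
    """Parse the registry back into a {file name: hash} dictionary."""
    registry = {}
    for line in registry_file.read_text(encoding="utf-8").splitlines():
        if line.strip():
            # Assumption: exactly two whitespace-separated fields per line.
            name, file_hash = line.split()
            registry[name] = file_hash
    return registry
```

Keeping the registry file outside the hashed folder avoids the registry listing itself.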

v0.3.0 (2023-12-01)
-------------------
14 changes: 14 additions & 0 deletions CONTRIBUTING.rst
Expand Up @@ -106,6 +106,14 @@ Ready to contribute? Here's how to set up ``xhydro`` for local development.
# Or, to run multiple build tests
$ tox

.. note::

Running `pytest` or `tox` will automatically fetch the testing data for the package and store it in your local user cache (located using the `platformdirs` library). On Linux, this is located at ``XDG_CACHE_HOME`` (usually ``~/.cache``). On Windows, this is located at ``%LOCALAPPDATA%`` (usually ``C:\Users\username\AppData\Local``). On macOS, this is located at ``~/Library/Caches``.

If for some reason you wish to cache this data elsewhere, you can set the ``XHYDRO_DATA_DIR`` environment variable to a different location before running the tests. For example, to cache the data in the current working directory, run::

$ export XHYDRO_DATA_DIR=$(pwd)/.cache
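
The lookup order described in this note can be sketched in plain Python as follows. This is only an approximation under stated assumptions: the real code relies on `platformdirs` and `pooch`, the helper names `default_cache_root` and `resolve_data_dir` are hypothetical, and the `xhydro` subfolder name is a guess at how the cache is laid out.

```python
import os
import sys
from pathlib import Path


def default_cache_root(env, platform: str = sys.platform) -> Path:
    """Approximate the per-user cache root described in the note above."""
    home = Path.home()
    if platform.startswith("win"):
        return Path(env.get("LOCALAPPDATA") or home / "AppData" / "Local")
    if platform == "darwin":
        return home / "Library" / "Caches"
    # Linux and other POSIX systems: honour XDG_CACHE_HOME when it is set.
    return Path(env.get("XDG_CACHE_HOME") or home / ".cache")


def resolve_data_dir(env=None) -> Path:
    """Prefer XHYDRO_DATA_DIR; otherwise fall back to the user cache."""
    env = os.environ if env is None else env
    override = env.get("XHYDRO_DATA_DIR")
    if override:
        return Path(override)
    # Assumption: the data is cached in a folder named after the package.
    return default_cache_root(env) / "xhydro"
```

Passing the environment in as a mapping keeps the sketch easy to exercise without touching the real process environment.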

#. Commit your changes and push your branch to GitHub::

$ git add .
Expand Down Expand Up @@ -134,6 +142,12 @@ Ready to contribute? Here's how to set up ``xhydro`` for local development.

You will have contributed your first changes to ``xhydro``!

.. warning::

If your Pull Request relies on modifications to the testing data of `xhydro`, you will need to update the testing data repository as well. As a preliminary testing measure, the branch of the testing data can be modified at testing time (from `main`) by setting the ``XHYDRO_TESTDATA_BRANCH`` environment variable to the branch name of the ``xhydro-testdata`` repository.

Be sure to consult the README found at https://github.com/hydrologie/xhydro-testdata as well.
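
One way to picture the branch override is as a base URL that changes with ``XHYDRO_TESTDATA_BRANCH``. The sketch below is an assumption-laden illustration: the helper name `base_url_for_testdata` is hypothetical, and the `raw.githubusercontent.com` layout is the general GitHub pattern for raw files, not something this pull request confirms `xhydro` uses.

```python
import os

# Repository name taken from the warning above.
TESTDATA_REPO = "hydrologie/xhydro-testdata"


def base_url_for_testdata(env=None) -> str:
    """Build a raw-content base URL for the selected testing data branch."""
    env = os.environ if env is None else env
    # Fall back to "main" when the variable is unset or empty.
    branch = env.get("XHYDRO_TESTDATA_BRANCH") or "main"
    # Assumption: files are fetched from GitHub's raw-content host.
    return f"https://raw.githubusercontent.com/{TESTDATA_REPO}/{branch}/"
```

With a branch named `my-new-data` (a made-up name), exporting ``XHYDRO_TESTDATA_BRANCH=my-new-data`` before running the tests would point this helper at that branch instead of `main`.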

Pull Request Guidelines
-----------------------

2 changes: 2 additions & 0 deletions environment-dev.yml
Expand Up @@ -6,6 +6,8 @@ dependencies:
# Don't forget to sync changes between environment.yml, environment-dev.yml, and pyproject.toml!
# Main packages
- numpy
- pooch >=1.8.0
- pydantic >=2.0,<2.5.3 # FIXME: Remove pin once our dependencies (xclim, xscen) support pydantic 2.5.3
- statsmodels
- xarray
- xclim >=0.45.0
2 changes: 2 additions & 0 deletions environment.yml
Expand Up @@ -6,6 +6,8 @@ dependencies:
# Don't forget to sync changes between environment.yml, environment-dev.yml, and pyproject.toml!
# Main packages
- numpy
- pooch >=1.8.0
- pydantic >=2.0,<2.5.3
- statsmodels
- xarray
- xclim >=0.45.0
24 changes: 22 additions & 2 deletions pyproject.toml
Expand Up @@ -37,6 +37,8 @@ dynamic = ["description", "version"]
dependencies = [
# Don't forget to sync changes between environment.yml, environment-dev.yml, and pyproject.toml!
"numpy",
"pooch>=1.8.0",
"pydantic>=2.0,<2.5.3",
"statsmodels",
"xarray",
"xclim>=0.45.0",
Expand Down Expand Up @@ -146,7 +148,8 @@ include = [
"docs/make.bat",
"tests/*.py",
"tox.ini",
"xhydro"
"xhydro",
"xhydro/testing/registry.txt"
]
exclude = [
"*.py[co]",
Expand All @@ -161,7 +164,8 @@ exclude = [
"Makefile",
"docs/_*",
"docs/apidoc/modules.rst",
"docs/apidoc/xhydro*.rst"
"docs/apidoc/xhydro*.rst",
"xhydro/testing/data/*"
]

[tool.isort]
Expand All @@ -178,6 +182,22 @@ warn_unused_configs = true
module = []
ignore_missing_imports = true

[tool.numpydoc_validation]
checks = [
"all", # report on all checks, except the below
"ES01",
"EX01",
"GL01",
"SA01"
]
exclude = [
# don't report on objects that match any of these regex
'\.undocumented_method$',
'\.__repr__$',
# any object starting with an underscore is a private object
'\._\w+'
]
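
To see which object names the `exclude` patterns above would skip, the sketch below joins them into one regular expression and searches each dotted name. It assumes the patterns are applied with a search (not a full match) against fully qualified names; `is_excluded` and the example names in the usage note are hypothetical.

```python
import re

# Patterns copied from the [tool.numpydoc_validation] exclude list.
EXCLUDE_PATTERNS = [
    r"\.undocumented_method$",
    r"\.__repr__$",
    r"\._\w+",
]

# Assumption: the patterns are combined with "|" and used with re.search.
_EXCLUDE_RE = re.compile("|".join(EXCLUDE_PATTERNS))


def is_excluded(qualified_name: str) -> bool:
    """Return True if the dotted object name matches any exclude pattern."""
    return _EXCLUDE_RE.search(qualified_name) is not None
```

Under these assumptions, `xhydro.testing.helpers._private` would be skipped while `xhydro.indicators.get_yearly_op` would still be validated. Because `\._\w+` is not anchored, a public function inside a private module (for example `pkg._internal.func`) would be skipped as well.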

[tool.pytest.ini_options]
addopts = [
"--verbose",
1 change: 1 addition & 0 deletions tox.ini
Expand Up @@ -40,6 +40,7 @@ setenv =
PYTHONPATH = {toxinidir}
passenv =
CI
ESMFMKFILE
COVERALLS_*
GITHUB_*
extras =
17 changes: 15 additions & 2 deletions xhydro/cc.py
@@ -1,4 +1,5 @@
"""Module to compute climate change statistics using xscen functions."""
import xarray

# Special imports from xscen
from xscen import ( # FIXME: To be replaced with climatological_op once available
Expand All @@ -17,8 +18,20 @@


# FIXME: To be deleted once climatological_op is available in xscen
def climatological_op(ds, **kwargs):
"""Compute climatological operation.
def climatological_op(ds: xarray.Dataset, **kwargs: dict) -> xarray.Dataset:
r"""Compute climatological operation.

Parameters
----------
ds : xarray.Dataset
Input dataset.
\*\*kwargs : dict
Keyword arguments passed to :py:func:`xscen.aggregate.climatological_mean`.

Returns
-------
xarray.Dataset
Output dataset.

Notes
-----
31 changes: 16 additions & 15 deletions xhydro/indicators.py
Expand Up @@ -64,36 +64,37 @@ def get_yearly_op(
missing_options: Optional[dict] = None,
interpolate_na: bool = False,
) -> xr.Dataset:
"""
Compute yearly operations on a variable.
"""Compute yearly operations on a variable.

Parameters
----------
ds: xr.Dataset
ds : xr.Dataset
Dataset containing the variable to compute the operation on.
op: str
op : str
Operation to compute. One of ["max", "min", "mean", "sum"].
input_var: str
input_var : str
Name of the input variable. Defaults to "streamflow".
window: int
window : int
Size of the rolling window. A "mean" operation is performed on the rolling window before the call to xclim.
This parameter cannot be used with the "sum" operation.
timeargs: dict, optional
timeargs : dict, optional
Dictionary of time arguments for the operation.
Keys are the name of the period that will be added to the results (e.g. "winter", "summer", "annual").
Values are up to two dictionaries, with both being optional.
The first is {'freq': str}, where str is a frequency supported by xarray (e.g. "YS", "AS-JAN", "AS-DEC").
It needs to be a yearly frequency. Defaults to "AS-JAN".
The second is an indexer as supported by :py:func:`xclim.core.calendar.select_time`. Defaults to {}, which means the whole year.
The second is an indexer as supported by :py:func:`xclim.core.calendar.select_time`.
Defaults to {}, which means the whole year.
See :py:func:`xclim.core.calendar.select_time` for more information.
Examples: {"winter": {"freq": "AS-DEC", "date_bounds": ['12-01', '02-28']}}, {"jan": {"freq": "YS", "month": 1}}, {"annual": {}}.
missing: str
Examples: {"winter": {"freq": "AS-DEC", "date_bounds": ["12-01", "02-28"]}}, {"jan": {"freq": "YS", "month": 1}}, {"annual": {}}.
missing : str
How to handle missing values. One of "skip", "any", "at_least_n", "pct", "wmo".
See :py:func:`xclim.core.missing` for more information.
missing_options: dict, optional
missing_options : dict, optional
Dictionary of options for the missing values' method. See :py:func:`xclim.core.missing` for more information.
interpolate_na: bool
Whether to interpolate missing values before computing the operation. Only used with the "sum" operation. Defaults to False.
interpolate_na : bool
Whether to interpolate missing values before computing the operation. Only used with the "sum" operation.
Defaults to False.

Returns
-------
Expand All @@ -105,7 +106,6 @@ def get_yearly_op(
-----
If you want to perform a frequency analysis on a frequency that is finer than annual, simply use multiple timeargs
(e.g. 1 per month) to create multiple distinct variables.

"""
missing_options = missing_options or {}
timeargs = timeargs or {"annual": {}}
Expand Down Expand Up @@ -174,7 +174,8 @@ def get_yearly_op(
and freq != "AS-DEC"
):
warnings.warn(
"The frequency is not AS-DEC, but the season indexer includes DJF. This will lead to misleading results."
"The frequency is not AS-DEC, but the season indexer includes DJF. "
"This will lead to misleading results."
)
elif (
"doy_bounds" in indexer.keys()
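
The `timeargs` structure documented in `get_yearly_op` can be illustrated with a small standalone sketch that separates the frequency from the indexer for each period. This is not the function's actual implementation: `split_timeargs` is a hypothetical helper, the defaults (`{"annual": {}}` and `"AS-JAN"`) are taken from the docstring and the visible code, and the `season` key used for the DJF check is an assumption, since the full condition is not shown in the diff.

```python
import warnings


def split_timeargs(timeargs=None):
    """Return {period: (freq, indexer)} from a timeargs-style dictionary."""
    timeargs = timeargs or {"annual": {}}
    periods = {}
    for name, options in timeargs.items():
        indexer = dict(options)  # copy so the caller's dictionary is left intact
        freq = indexer.pop("freq", "AS-JAN")
        # Assumption: a DJF season is passed under the "season" key.
        if "DJF" in indexer.get("season", []) and freq != "AS-DEC":
            warnings.warn(
                "The frequency is not AS-DEC, but the season indexer includes DJF. "
                "This will lead to misleading results."
            )
        periods[name] = (freq, indexer)
    return periods
```

For the docstring example `{"jan": {"freq": "YS", "month": 1}}`, this yields the frequency `"YS"` and the indexer `{"month": 1}`; an empty entry such as `{"annual": {}}` falls back to `"AS-JAN"` with no indexer, which means the whole year.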
5 changes: 4 additions & 1 deletion xhydro/testing/__init__.py
@@ -1 +1,4 @@
"""Helpers for testing."""
"""Testing utilities and helper functions."""

from .helpers import *
from .utils import *