Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loess smothing breaks on array with nearly all zeros #1774

Open
1 of 2 tasks
juliettelavoie opened this issue Jun 10, 2024 · 0 comments
Open
1 of 2 tasks

loess smothing breaks on array with nearly all zeros #1774

juliettelavoie opened this issue Jun 10, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@juliettelavoie
Copy link
Contributor

juliettelavoie commented Jun 10, 2024

Setup Information

  • Xclim version: 0.49.0
  • Python version: 3.11
  • Operating System: linux
    env: xscen-0.9

Description

I am trying to do a loess smoothing on a array that is nearly all zeros and I get the following error.

LinAlgError                               Traceback (most recent call last)
Cell In[9], line 13
      1 ds= xr.DataArray(data=[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
      2        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
      3        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
   (...)
     10                 dims=['time'],
     11                 coords={'time':pd.date_range("1956-01-01", periods=145, freq='YS')})
     12 ds
---> 13 out=loess.loess_smoothing(ds)

File ~/Projets/xclim/xclim/sdba/loess.py:260, in loess_smoothing(da, dim, d, f, niter, weights, equal_spacing, skipna)
    257 else:
    258     dx = 0
--> 260 return xr.apply_ufunc(
    261     _loess_nb,
    262     x,
    263     da,
    264     input_core_dims=[[dim], [dim]],
    265     output_core_dims=[[dim]],
    266     vectorize=True,
    267     kwargs={
    268         "f": f,
    269         "weight_func": weight_func,
    270         "niter": niter,
    271         "reg_func": reg_func,
    272         "dx": dx,
    273         "skipna": skipna,
    274     },
    275     dask="parallelized",
    276     output_dtypes=[float],
    277 )

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/xarray/core/computation.py:1270, in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, on_missing_core_dim, *args)
   1268 # feed DataArray apply_variable_ufunc through apply_dataarray_vfunc
   1269 elif any(isinstance(a, DataArray) for a in args):
-> 1270     return apply_dataarray_vfunc(
   1271         variables_vfunc,
   1272         *args,
   1273         signature=signature,
   1274         join=join,
   1275         exclude_dims=exclude_dims,
   1276         keep_attrs=keep_attrs,
   1277     )
   1278 # feed Variables directly through apply_variable_ufunc
   1279 elif any(isinstance(a, Variable) for a in args):

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/xarray/core/computation.py:316, in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    311 result_coords, result_indexes = build_output_coords_and_indexes(
    312     args, signature, exclude_dims, combine_attrs=keep_attrs
    313 )
    315 data_vars = [getattr(a, "variable", a) for a in args]
--> 316 result_var = func(*data_vars)
    318 out: tuple[DataArray, ...] | DataArray
    319 if signature.num_outputs > 1:

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/xarray/core/computation.py:825, in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
    820     if vectorize:
    821         func = _vectorize(
    822             func, signature, output_dtypes=output_dtypes, exclude_dims=exclude_dims
    823         )
--> 825 result_data = func(*input_data)
    827 if signature.num_outputs == 1:
    828     result_data = (result_data,)

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/numpy/lib/function_base.py:2372, in vectorize.__call__(self, *args, **kwargs)
   2369     self._init_stage_2(*args, **kwargs)
   2370     return self
-> 2372 return self._call_as_normal(*args, **kwargs)

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/numpy/lib/function_base.py:2365, in vectorize._call_as_normal(self, *args, **kwargs)
   2362     vargs = [args[_i] for _i in inds]
   2363     vargs.extend([kwargs[_n] for _n in names])
-> 2365 return self._vectorize_call(func=func, args=vargs)

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/numpy/lib/function_base.py:2446, in vectorize._vectorize_call(self, func, args)
   2444 """Vectorized call to `func` over positional `args`."""
   2445 if self.signature is not None:
-> 2446     res = self._vectorize_call_with_signature(func, args)
   2447 elif not args:
   2448     res = func()

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/numpy/lib/function_base.py:2486, in vectorize._vectorize_call_with_signature(self, func, args)
   2483 nout = len(output_core_dims)
   2485 for index in np.ndindex(*broadcast_shape):
-> 2486     results = func(*(arg[index] for arg in args))
   2488     n_results = len(results) if isinstance(results, tuple) else 1
   2490     if nout != n_results:

File /usr/local/DEV/mambaforge/envs/xscen-0.9/lib/python3.11/site-packages/numba/np/linalg.py:827, in _check_finite_matrix()
    825 for v in np.nditer(a):
    826     if not np.isfinite(v.item()):
--> 827         raise np.linalg.LinAlgError(
    828             "Array must not contain infs or NaNs.")

LinAlgError: Array must not contain infs or NaNs.

I think the issue stems from here:

xres = residuals / (6.0 * s)

residual =0 because the fit is too good. Then, dividing by 0 on line 167 creates a inf. On the next iteration, w has nans and breaks on line 162.

I'm not quite sure how the weigths work and what would be an appropriate way to fix that...

Steps To Reproduce

import xclim as xc
import xarray as xr
import pandas as pd
from xclim.sdba import loess

ds= xr.DataArray(data=[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 3., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 3., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0.],
                dims=['time'],
                coords={'time':pd.date_range("1956-01-01", periods=145, freq='YS')})
ds
out=loess.loess_smoothing(ds)

Additional context

I am using loess instead of a 4th degree polynomial to extract the forced response when calculating partitioning on different indicators. The timeseries in the example above is heat_wave_total_length in Gaspe.

Contribution

  • I would be willing/able to open a Pull Request to address this bug.

Code of Conduct

  • I agree to follow this project's Code of Conduct
@juliettelavoie juliettelavoie added the bug Something isn't working label Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant