Add accessors for stats functions #59

bradyrx · 2019-08-30T17:18:25Z

It would be much quicker to have our stats functions registered as accessors as well, as in xskillscore. Then one can just do ds.esm.rm_trend(dim='time') for instance.

The text was updated successfully, but these errors were encountered:

lukegre · 2020-08-03T07:48:19Z

Happy to jump in here if this hasn't been taken on yet

bradyrx · 2020-08-03T14:26:53Z

Yes that would be great @luke-gregor! I think I already have the accessor file set up for the convert_lon function, so feel free to add the stats functions there. You can follow the model set up in xskillscore. I'll look out for a PR. Thanks!

lukegre · 2020-11-04T09:47:44Z

I've finally got some time to get to this! Just want to check that anyone else hasn't started on this in the meantime?

lukegre · 2020-11-04T10:03:43Z

A question regarding usability. I noticed that for the stats regression functions, that x changes from the predicant to the predictor when y is given. Would it not make sense to always have y as the predicant and x as the predictor (if not None)? Seems a little more consistent/transparent.

I can create a new PR or change that in this PR if you agree on this.

esmtools/esmtools/stats.py

Lines 265 to 295 in e93391f

 def linear_slope(x, y=None, dim='time', nan_policy='none'): 

 """Returns the linear slope with y regressed onto x. 

  .. note:: 

  This function will try to infer the time freqency of sampling if ``x`` is in 

  datetime units. The final slope will be returned in the original units per 

  that frequency (e.g. SST per year). If the frequency cannot be inferred 

  (e.g. because the sampling is irregular), it will return in the original 

  units per day (e.g. SST per day). 

  Args: 

  x (xarray object): Independent variable (predictor) for linear regression. 

  If ``y`` is ``None``, treat ``x`` as the dependent variable and remove 

  slope over ``dim``. 

  y (xarray object, optional): Dependent variable (predictand) for linear 

  regression. If ``None``, treat ``x`` as the predictand. 

  dim (str, optional): Dimension to apply linear regression over. 

  Defaults to "time". 

  nan_policy (str, optional): Policy to use when handling nans. Defaults to 

  "none". 

  * 'none', 'propagate': If a NaN exists anywhere on the given dimension, 

  return nans for that whole dimension. 

  * 'raise': If a NaN exists at all in the datasets, raise an error. 

  * 'omit', 'drop': If a NaN exists in `x` or `y`, drop that index and 

  compute the slope without it. 

  Returns: 

  xarray object: Slopes computed through a least-squares linear regression. 

  """

aaronspring · 2020-11-04T10:53:23Z

I wanted to have the current implementation, because I wanted to do linear_slope(ds, dim='time'). I'd be happy to keep that way of using it. maybe having the accessor like ds.esmtools.liner_slope(dim='time')?

lukegre · 2020-11-04T12:59:49Z

Changing this shouldn't change the functionality at all. Just means that y will always be the first input argument. Would this make sense? This is a slight API change though, so might break code for people who have already implemented linear_slope(x, y=y, dim='time')

aaronspring · 2020-11-04T13:40:16Z

Ah ok. Please go ahead with a PR. Sounds good to me

bradyrx · 2020-11-13T21:04:40Z

@luke-gregor, my thinking was that in logical order it makes sense to have x first, then y. I know it's sort of weird to have a singular argument revert to the predictand, since it's technically in the slot order of the predictor otherwise (if I understand your point correctly).

I would say it's intuitively off to have (y, x=None) order, although the language is "y regressed onto x" so this could make sense. In the end, my defense is that I modeled it after scipy conventions: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html. I could see it going either way, but I see in the PR you got roadblocked by the thorough testing so it seems like we'll keep the order for now?

lukegre · 2020-11-17T08:15:32Z

I could see it going either way, but I see in the PR you got roadblocked by the thorough testing so it seems like we'll keep the order for now?

Yep, exactly. Didn't realise how extensive and impressive the test suite for esmtools is!

bradyrx added enhancement labels Aug 30, 2019

bradyrx removed enhancement labels Nov 14, 2019

aaronspring assigned lukegre Aug 3, 2020

aaronspring added the useability label Aug 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add accessors for stats functions #59

Add accessors for stats functions #59

bradyrx commented Aug 30, 2019

lukegre commented Aug 3, 2020

bradyrx commented Aug 3, 2020

lukegre commented Nov 4, 2020

lukegre commented Nov 4, 2020 •

edited

Loading

aaronspring commented Nov 4, 2020

lukegre commented Nov 4, 2020 •

edited

Loading

aaronspring commented Nov 4, 2020

bradyrx commented Nov 13, 2020

lukegre commented Nov 17, 2020

Add accessors for stats functions #59

Add accessors for stats functions #59

Comments

bradyrx commented Aug 30, 2019

lukegre commented Aug 3, 2020

bradyrx commented Aug 3, 2020

lukegre commented Nov 4, 2020

lukegre commented Nov 4, 2020 • edited Loading

aaronspring commented Nov 4, 2020

lukegre commented Nov 4, 2020 • edited Loading

aaronspring commented Nov 4, 2020

bradyrx commented Nov 13, 2020

lukegre commented Nov 17, 2020

lukegre commented Nov 4, 2020 •

edited

Loading

lukegre commented Nov 4, 2020 •

edited

Loading