Clipping levels function #103

csptvlt · 2021-03-05T17:58:24Z

I was trying the clipping detection functions, and clipping.levels shows a behavior that I was not expecting.
The default number of levels is 2, and in my case (see figure below, a small part of a full year of data) it identifies night periods as clipping. If I force levels=1, clipping is identified only during the night, because that's the interval with the most data points (?).

Am I using the function wrong? Or should there be, for example, a previous step where daytime is identified?

cwhanse · 2021-03-05T20:44:39Z

"Or should there be, for example, a previous step where daytime is identified?"

Short answer: Yes.

An explanation:

The clipping.levels function is looking for peaks in a histogram of the data. It doesn't filter nighttime periods. So when nighttime periods are included in the time series, these periods will create the highest peak at power=0 in the histogram.
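The effect can be reproduced without pvanalytics. The following is a minimal sketch on synthetic data, not the library's implementation: `synthetic_power` and `tallest_bin` are hypothetical helpers, the half-sine day profile and the 0.8 clipping level are assumptions, and the power threshold is only a crude stand-in for a real day/night indicator.

```python
import math
from collections import Counter


def synthetic_power(days=7, samples_per_day=96, clip_level=0.8):
    """Normalized AC power: zero at night, half-sine by day, clipped at clip_level."""
    one_day = []
    for i in range(samples_per_day):
        hour = 24 * i / samples_per_day
        raw = math.sin(math.pi * (hour - 6) / 12) if 6 <= hour <= 18 else 0.0
        one_day.append(min(max(raw, 0.0), clip_level))
    return one_day * days


def tallest_bin(values, bin_width=0.05):
    """Return the lower edge of the most populated histogram bin."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    index, _ = counts.most_common(1)[0]
    return index * bin_width


power = synthetic_power()
# Crude stand-in for a day/night indicator: keep samples above a small threshold.
daytime_power = [p for p in power if p > 0.01]

print(tallest_bin(power))          # 0.0: the nighttime zeros dominate
print(tallest_bin(daytime_power))  # 0.8: the clipping level
```

With the night samples included, the bin at zero holds far more points than the bin at the clipping level, so any peak search lands on it first.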

The reason that clipping.levels doesn't filter nighttime periods is to keep the function to one task: detecting levels in the data. There are functions in features.daytime that can help label day/night periods if you don't already have an indicator (power, solar angle) that you are confident in.

We designed pvanalytics to emphasize re-use, which is the reason for the one-function, one-task rule.

We have thought about providing pre-built workflows (sequences of functions) such as what I think your case needs: first filter out nighttime periods, then apply clipping.levels. I am interested to hear your thoughts about the value of pre-built workflows.

camsilva · 2021-03-07T12:06:26Z

Great, understood, the re-use is indeed relevant.

The workflows you mention are something I can see having good value. Small workflows like the one we were discussing would be valuable, at least when we know that a feature or a metric will be greatly affected if a specific function is not applied in a previous step, which is the case for clipping here.

In an extreme case, using data through pvanalytics with a final purpose (calculation of some metric, for example) will in most cases first require attention to a set of problems (gaps, consistency, filtering, inference/imputation), for some of which you already have solutions here, and maybe there are more general workflows that could be thought of. I don't know if this is feasible or if it belongs to the scope of the project, but I certainly see value in it. (https://onlinelibrary.wiley.com/doi/10.1002/pip.3349)

On another note, I saw issue #68, and having functions for parsing the names of the physical quantities measured could probably lead to some semi-automated processes, at least for functions that deal with physical limits/consistency/inference, which could also fall under this workflows topic.

(now writing from my main account)

cwhanse · 2021-03-08T16:11:27Z

@camsilva thanks for sharing your views.

I am picturing a layer of functions built on top of the basic clipping library, something in the spirit of

def label_clipping(data, how='method name', filters={'night':True, 'outliers': False, ...})

where the filters argument controls any subsetting of the data prior to applying the clipping detection method.
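To make the idea concrete, here is a rough sketch of how such a wrapper might behave. Everything in it is an assumption for illustration: `keep_daytime`, `histogram_mode`, and the `METHODS` and `FILTERS` registries are hypothetical names, not pvanalytics API, and only a `'night'` filter is implemented. The detector simply flags the most populated histogram bin, as a stand-in for a real method such as clipping.levels.

```python
import math
from collections import Counter


def keep_daytime(data, threshold=0.01):
    """Hypothetical 'night' filter: True where power exceeds a small threshold."""
    return [value > threshold for value in data]


def histogram_mode(data, bin_width=0.05):
    """Stand-in detector: flag samples that fall in the most populated bin."""
    if not data:
        return []
    bins = [math.floor(value / bin_width) for value in data]
    top_bin, _ = Counter(bins).most_common(1)[0]
    return [b == top_bin for b in bins]


# Hypothetical registries; only the names used below are implemented.
METHODS = {"histogram_mode": histogram_mode}
FILTERS = {"night": keep_daytime}


def label_clipping(data, how="histogram_mode", filters=None):
    """Apply the enabled filters, run the detector on what remains,
    and return one flag per input sample (False where filtered out)."""
    filters = {"night": True} if filters is None else filters
    keep = [True] * len(data)
    for name, enabled in filters.items():
        if enabled:
            keep = [k and m for k, m in zip(keep, FILTERS[name](data))]
    flags = iter(METHODS[how]([v for v, k in zip(data, keep) if k]))
    return [next(flags) if k else False for k in keep]


power = [0.0] * 5 + [0.3, 0.6, 0.8, 0.8, 0.8, 0.6, 0.3] + [0.0] * 4
with_filter = label_clipping(power)                              # flags the 0.8 plateau
without_filter = label_clipping(power, filters={"night": False})  # flags the zeros
```

Returning a flag for every input sample, with filtered-out samples set to False, keeps the output aligned with the input regardless of which filters are enabled.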

wfvining · 2021-03-08T17:12:42Z

It might fill some of the gap, for these kinds of small workflows, if we added a "cookbook" to the docs. I think it is still early days, and without a solid footing, building an API like @cwhanse suggests could be hard to get right.
