
Update type_histogram.R #288

Open
wants to merge 1 commit into
base: main


@eleuven eleuven commented Jan 21, 2025

do not force common break points across facets. common break points can still be requested by passing them to the breaks option. however, passing a vector to breaks is currently broken, as is.null() is not vectorised.

see here for some background
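
For illustration, a minimal sketch of the failure described above (made-up break points, not code from the PR): `is.null()` returns a single value for the whole object, so `ifelse()` sees a length-1 test and returns only the first break point.

```r
brk = c(0, 5, 10, 15)   # hypothetical user-supplied break points

# is.null() answers once for the whole vector, not once per element
n_answers = length(is.null(brk))

# ifelse() returns a result the length of its test, so the rest of brk is dropped
old = ifelse(!is.null(brk), brk, "Sturges")

# a plain if/else keeps the full vector (an alternative, not what the PR uses)
alt = if (!is.null(brk)) brk else "Sturges"

print(old)
print(alt)
```

The plain `if`/`else` form is shown only as a point of comparison; whether it suits the rest of `data_histogram()` is an assumption.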

Owner

@grantmcdermott grantmcdermott left a comment


Thanks for this.

I can see that a couple of tests are failing. I'm not sure if it's because of things that I've pointed out in my quick review of the code or something else. Hopefully I'll have time to investigate properly this evening.

@@ -39,7 +39,7 @@ type_hist = type_histogram
 data_histogram = function(breaks = "Sturges") {
     hbreaks = breaks
     fun = function(by, facet, ylab, col, bg, ribbon.alpha, datapoints, .breaks = hbreaks, ...) {
-        hbreaks = ifelse(!is.null(.breaks), .breaks, "Sturges")
+        hbreaks = ifelse(!mapply(is.null, .breaks), .breaks, "Sturges")
Owner

mapply is nice and concise. But it's more efficient (safer) to use something like:

!vapply(.breaks, function(x) is.null(x), FUN.VALUE = logical(1))

Or, perhaps even:

any(lengths(.breaks) == 0)
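
A small sketch of the difference on made-up inputs (the `brk` vector is hypothetical): `vapply()` always returns the declared type, whereas `mapply()` has nothing to simplify for a `NULL` input and hands back an empty list instead of a logical vector.

```r
brk = c(0, 5, 10, 15)   # hypothetical break points

# on a numeric vector both give one FALSE per element
m_vec = mapply(is.null, brk)
v_vec = vapply(brk, is.null, FUN.VALUE = logical(1))

# on NULL the return types diverge
m_null = mapply(is.null, NULL)                          # empty list
v_null = vapply(NULL, is.null, FUN.VALUE = logical(1))  # empty logical vector

str(m_null)
str(v_null)
```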

Owner

Come to think of it, couldn't we just check the first element of .breaks? I'm trying to think of situations where the first element would be NULL but the others not (or vice versa)...

Comment on lines -52 to +56
-        datapoints_breaks = hist(datapoints$x, breaks = hbreaks, plot = FALSE)
         datapoints = split(datapoints, list(datapoints$by, datapoints$facet))
         datapoints = Filter(function(k) nrow(k) > 0, datapoints)

         datapoints = lapply(datapoints, function(k) {
-            h = hist(k$x, breaks = datapoints_breaks$breaks, plot = FALSE)
+            h = hist(k$x, breaks = hbreaks, plot = FALSE)
Owner

I don't think we can delete the datapoints_breaks object, since we should still use it by default in the case of grouped histograms. The thing we want to avoid is different bin widths for each group (unless the user explicitly requests it), which is why we calculate the breaks on the full dataset first.
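
To make the point concrete, here is a rough sketch on simulated data (the data frame and object names are invented for illustration, not taken from the package): breaks computed once on the full data give every group the same bins, while calling `hist()` per group with `"Sturges"` lets each group choose its own.

```r
set.seed(1)
# hypothetical two-group data with different locations
dat = data.frame(
  x  = c(rnorm(50), rnorm(50, mean = 3)),
  by = rep(c("a", "b"), each = 50)
)

# breaks computed once on the full dataset
common = hist(dat$x, breaks = "Sturges", plot = FALSE)$breaks

# each group binned with the shared breaks
shared = lapply(split(dat, dat$by), function(k) {
  hist(k$x, breaks = common, plot = FALSE)
})

# each group binned on its own; the breaks may differ between groups
own = lapply(split(dat, dat$by), function(k) {
  hist(k$x, breaks = "Sturges", plot = FALSE)
})

lapply(shared, function(h) h$breaks)
lapply(own, function(h) h$breaks)
```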

Author

sure, but atm there is no possibility of different binning for each group/facet since datapoints_breaks is always passed to breaks, and there is no possibility of passing a vector of breaks because of ifelse(). it would be nice if breaks accepted the same possibilities as hist (function, vector, number).

i think that different binning by group/facet often makes sense when freq=FALSE (missing atm).
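
for reference, a quick sketch of the forms that `hist()` itself accepts for `breaks` (toy data; the list names are just labels for this example). with `plot = FALSE` the returned object also carries a `density` component, which is what `freq = FALSE` would draw.

```r
x = c(1.2, 2.5, 2.7, 3.1, 4.8, 5.0, 6.3, 7.9, 8.4, 9.6)   # toy data

forms = list(
  number = 4,                          # suggested number of cells
  vector = c(0, 2.5, 5, 7.5, 10),      # explicit break points
  name   = "Scott",                    # name of a built-in algorithm
  fun    = function(v) seq(floor(min(v)), ceiling(max(v)), length.out = 6)
)

# hist() handles each form directly; collect the break points it ends up using
res = lapply(forms, function(b) hist(x, breaks = b, plot = FALSE)$breaks)

# densities for the explicit-vector case
dens = hist(x, breaks = forms$vector, plot = FALSE)$density

print(res)
```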

thanks for looking into this!
