Change in scale_color_manual behavior in 3.5.x, adds NA to legend #6015

Closed
jeraldnoble opened this issue Jul 26, 2024 · 4 comments

Comments

@jeraldnoble

Opening this issue at @teunbrand's request, re: #5214 (comment)

In 3.5.x, NA is shown in the legend when using scale_color_manual():

> packageVersion("ggplot2")
[1] ‘3.5.1’

library(ggplot2)

df = data.frame(
  xvar = c(-3, -2, -1, 1, 2, 3),
  yvar = c(3, 2, 1, 1, 2, 3),
  lab_text = c("-3", "-2", NA, NA, "2", "3"),
  lab_grp = c("down", "down", NA, NA, "up", "up")
)

ggplot(df, aes(x = xvar, y = yvar, color = lab_grp)) +
  geom_point() +
  scale_color_manual(values = c("down" = "red", "up" = "blue"))

[Image: exPlot]

An identical plot is produced when the values are unnamed:

ggplot(df, aes(x = xvar, y = yvar, color = lab_grp)) +
  geom_point() +
  scale_color_manual(values = c("red", "blue"))

This is likely caused by a change in ggplot2:::manual_scale, at this line:
x <- intersect(x, c(names(values), NA)) %||% character()

3.4.x code:

> ggplot2:::manual_scale
function (aesthetic, values = NULL, breaks = waiver(), ..., limits = NULL) 
{
   ...
    if (is.null(limits) && !is.null(names(values))) {
        limits <- function(x) intersect(x, names(values)) %||%     
            character()
    }
   ...
    discrete_scale(aesthetic, "manual", pal, breaks = breaks, 
        limits = limits, ...)
}

3.5.x code:

> ggplot2:::manual_scale
function (aesthetic, values = NULL, breaks = waiver(), name = waiver(), 
        ..., limits = NULL, call = caller_call())
{
    # omitted for brevity
    if (is.null(limits) && !is.null(names(values))) {
        force(aesthetic)
        limits <- function(x) {
            x <- intersect(x, c(names(values), NA)) %||% character()
            if (length(x) < 1) {
                cli::cli_warn(paste0("No shared levels found between {.code names(values)} of the manual ", 
                  "scale and the data's {.field {aesthetic}} values."))
            }
            x
        }
    }
    # omitted for brevity
    discrete_scale(aesthetic, name = name, palette = pal, breaks = breaks, 
        limits = limits, call = call, ...)
}
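
To make the difference concrete, here is a minimal base-R sketch that isolates the two `limits` functions quoted above and applies them to the levels in this example. It does not call ggplot2. The input vector `data_levels` is an assumption about what the scale passes in (the distinct values of `lab_grp`, with NA last), and the `%||%` helper is a local stand-in for the operator ggplot2 uses.

```r
# Local stand-in for the null-coalescing operator (assumption: behaves like
# the one used inside ggplot2 for this purpose).
`%||%` <- function(a, b) if (is.null(a)) b else a

values <- c("down" = "red", "up" = "blue")

# Assumed input: the distinct values of lab_grp as seen by the scale.
data_levels <- c("down", "up", NA)

# 3.4.x: only names(values) can become limits, so NA is dropped.
limits_3_4 <- function(x) intersect(x, names(values)) %||% character()

# 3.5.x: NA is added to the candidate set, so NA survives the intersect.
limits_3_5 <- function(x) intersect(x, c(names(values), NA)) %||% character()

old_limits <- limits_3_4(data_levels)
new_limits <- limits_3_5(data_levels)

print(old_limits)
print(new_limits)
```

The only difference between the two results is the trailing NA, which is what ends up as the extra legend entry.
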
@teunbrand
Collaborator

I think this issue boils down to the following question: is it better to take names of the values argument as the verbatim limits, or are these a heuristic to pair names to values? If the names should be the literal limits, then NA should not appear in the legend by default. If they are a heuristic, then NA should appear in the legend by default. I think neither approach is 'wrong' per se, but my opinion isn't set in stone.

@clauswilke
Member

My opinion, take it for nothing more than that: The plot produced by 3.5.1 seems reasonable for the input data. There are points with no lab_grp value, those points are shown in gray, and the legend explains what that gray means. It's maybe a change from prior behavior but if so seems the correct change to me.

If somebody doesn't want the NA to show up in the legend they can always suppress it, but having it show by default seems good to me.

@jeraldnoble
Author

I don't think either option is "right". I am only pointing out a change in behavior in ggplot2:::manual_scale in 3.5.x. NA groups can be omitted from the legend by setting the limits argument, but this was not required in previous versions. On my end, after updating to 3.5.x I have to go back and edit ~15 scripts I use to generate reports so that they set limits explicitly. Feel free to close, as this does not break any functionality, but it does slightly impair reproducibility between 3.4.x and 3.5.x.
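
For anyone hitting the same change, here is a minimal sketch of the `limits` workaround described above, using the data from the original report. The variable names `cols` and `sc` are only for illustration. As far as I can tell, the NA points are still drawn in the scale's `na.value` colour and only the legend entry is dropped, but treat that as an expectation rather than a guarantee across versions.

```r
library(ggplot2)

df <- data.frame(
  xvar = c(-3, -2, -1, 1, 2, 3),
  yvar = c(3, 2, 1, 1, 2, 3),
  lab_grp = c("down", "down", NA, NA, "up", "up")
)

cols <- c("down" = "red", "up" = "blue")

# Setting limits explicitly bypasses the default limits function in
# manual_scale(), so NA is not added to the legend entries.
sc <- scale_color_manual(values = cols, limits = names(cols))

p <- ggplot(df, aes(x = xvar, y = yvar, color = lab_grp)) +
  geom_point() +
  sc

# Inspect the limits the scale will use.
print(sc$get_limits())
```

Using `limits = names(cols)` keeps the legend in sync with the named values, so each script needs only a one-argument change.
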

@teunbrand
Collaborator

> having it show by default seems good to me.

I concur. It is probably better to have heuristics reflect more of the data than less.
