-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Throw an error or warning when mapping to a variable not in data
(if length > 1)
#5391
Comments
Can I vehemently vote against this? In the modern tidyverse version of aes(), it is consistent that expressions in aes() follow the same rules as other functions that use data masking. Having aes() be different would be surprising. Plus this would add syntactic noise to valid constructions (eg any expressions involving a data column and an external variable, like a constant). |
Constants (length 1 vectors) should be allowed since that's a natural use case. I don't understand what you mean by |
I'd argue there are cases where non-constants might be reasonable as well (maybe that's controversial). But also, if the condition becomes "external variables as long as they're length 1", that makes the rules for understanding how to construct valid expressions in aes() more complex. Personally I find that trying to guess the user's intent through complex heuristics, despite good intentions, often leads to harder-to-use APIs in practice, because your users have to learn those heuristics. Better to establish a simple set of rules (which yes, must also be learned --- but hopefully are more predictable) and apply them consistently.
I mean that |
I think the trade-off here isn't worth it. There are valid use-cases for using orphan variables that don't exist in a data.frame in the mapping. I'd also argue that the beginners aren't wrong per se; there are just cleaner ways that are recommended. |
Just to give a brief example of a valid use-case: ggplot(mtcars, aes(disp, mpg)) +
geom_text(aes(label = rownames(mtcars))) Arguably one should make the rownames a proper column in the data, but we don't live in a perfect world, so it is convenient to make it slightly less of a struggle. |
Sorry Elio I'm going to close this issue as it is unlikely that ggplot2 is going to change this behaviour. Thanks for your ideas though! |
Would it be possible/desirable to make this kind of usage a warning or an error?
I've seen many beginner write this mistake. They create a new variable outside the data.frame and then use it as data. It would be very helpful if ggplot2 could detect it and issue a warning.
I'm not sure if this would break a lot of existing code. In case there are legitimate uses, maybe ggplot2 could export a particular function to let users explicitly use external variables along these lines:
Something like this could allow advanced usages that might not be possible otherwise and guard regular users against this common issue.
The text was updated successfully, but these errors were encountered: