Label dictionary #6077

Open
wants to merge 5 commits into
base: main


teunbrand
Collaborator

This PR aims to fix #5178.

Briefly, it adds a dict argument to labs() that lets you use a data dictionary to label a plot based on variable names (rather than aesthetics).

Let's just jump into examples.
The premise of this idea is that somewhere in your analysis code, you have some nice labels about what variables in your dataset mean. For example, we could have the following for the mpg dataset.

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

dict <- c(
  displ = "Engine Displacement",
  hwy   = "Highway miles per gallon",
  cty   = "City miles per gallon",
  drv   = "Drive train",
  manufacturer = "Manufacturer name",
  model = "Model name",
  year  = "Year of manufacture",
  cyl   = "Number of cylinders",
  trans = "Type of transmission",
  fl    = "Fuel type",
  class = "Type of car"
)

This PR lets you attach such a dictionary to your labels, and all variable names will be translated. The benefit is that you only have to think about pretty labels for variables once, and you needn't worry about them again.

ggplot(mpg, aes(class, cty, fill = drv)) +
  geom_boxplot() +
  labs(dict = dict)
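To make the translation step concrete, here is a minimal base R sketch of the lookup idea. It is not the PR's implementation: `translate_labels()` is a hypothetical helper, and it assumes that the default label of each aesthetic is simply the name of the mapped variable.

```r
# Hypothetical helper, NOT part of ggplot2: illustrates the lookup idea only.
# `defaults` holds the default label per aesthetic (assumed to be variable names).
translate_labels <- function(defaults, dict) {
  hit <- defaults %in% names(dict)              # which labels have a dictionary entry?
  defaults[hit] <- unname(dict[defaults[hit]])  # swap those for the pretty label
  defaults                                      # unmatched labels pass through unchanged
}

dict <- c(
  class = "Type of car",
  cty   = "City miles per gallon",
  drv   = "Drive train"
)

# Default labels for aes(class, cty, fill = drv), keyed by aesthetic
defaults <- c(x = "class", y = "cty", fill = "drv")

translate_labels(defaults, dict)
```

Labels without a matching dictionary entry are returned as they were, so a partial dictionary is harmless in this sketch.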

Notably, this doesn't work for more complex expressions, like factor(cyl) instead of cyl. In such cases, you can fall back to labelling the aesthetic, or you can add an entry like labs(dict = c(dict, "factor(cyl)" = dict[["cyl"]])); note that the name has to be quoted, because factor(cyl) is not a syntactic name. Also, we can reuse the dictionary here because we're using the same dataset, even though we're making a totally different plot.

ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
  geom_point() +
  labs(dict = dict, colour = dict["cyl"])
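The reason a plain dictionary misses factor(cyl) can be sketched in base R. This assumes the default label of an aesthetic is the deparsed mapping expression, so the lookup key is the string "factor(cyl)" rather than "cyl". The variable names below are illustrative only.

```r
dict <- c(cyl = "Number of cylinders")

# Assumed default label: the deparsed expression from the aesthetic mapping
default_label <- deparse(quote(factor(cyl)))

# The key "factor(cyl)" is not in the dictionary, so no translation happens
has_entry <- default_label %in% names(dict)

# Adding an entry under the quoted expression makes the lookup succeed
extended <- c(dict, "factor(cyl)" = dict[["cyl"]])
translated <- extended[[default_label]]
```

Quoting the name is required because an unquoted `factor(cyl) = ...` inside `c()` is a syntax error in R.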

Created on 2024-09-04 with reprex v2.1.1

@larmarange

This is a nice idea. Would it be better to have a more explicit argument, i.e. dictionary instead of dict?

@teunbrand
Collaborator Author

I don't have a strong opinion on this, but I like that both labs() and dict are terse.
Perhaps Thomas can render a decision on this one.

@teunbrand
Collaborator Author

teunbrand commented Sep 10, 2024

Double check: do we overwrite explicitly set labels if their value happens to match a key in the dictionary?
EDIT: No, we don't :)
TODO: rename to dictionary

devtools::load_all("~/packages/ggplot2/")
#> ℹ Loading ggplot2

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  labs(
    x = "foo",
    dictionary = c(foo = "foobar", hwy = "baz")
  )
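The precedence checked above can be sketched in base R as follows. This is a guess at the behaviour, not the PR's code: `resolve_labels()` is a hypothetical helper in which the dictionary is applied to default labels only, and explicit labels are then placed on top untouched.

```r
# Hypothetical helper, NOT part of ggplot2: dictionary first, explicit labels last.
resolve_labels <- function(defaults, explicit, dict) {
  hit <- defaults %in% names(dict)
  defaults[hit] <- unname(dict[defaults[hit]])  # translate default labels only
  defaults[names(explicit)] <- explicit         # explicit labels win, never translated
  defaults
}

resolved <- resolve_labels(
  defaults = c(x = "displ", y = "hwy"),
  explicit = c(x = "foo"),
  dict     = c(foo = "foobar", hwy = "baz")
)
resolved
```

In this sketch the x label stays "foo" even though "foo" is a dictionary key, while the y label is translated from "hwy" to "baz".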

Created on 2024-09-10 with reprex v2.1.1

Successfully merging this pull request may close these issues.

Label dictionaries
2 participants