Add option to jitter outliers in a boxplot #4480

Open
jolars opened this issue May 17, 2021 · 1 comment
Labels
feature a feature request or enhancement layers 📈

jolars commented May 17, 2021

I would like to be able to add a small amount of jittering to outliers in a boxplot or alternatively stack the points to avoid having them overlap.

Here is an example of where points in a boxplot overlap:

library(ggplot2)
library(dplyr)

# outliers are overlapping
ggplot(mpg, aes(drv, cty)) +
  geom_boxplot()

To add jittering to these outliers, we currently have to resort to the following hack: create a separate dataset of outliers and plot them manually with geom_jitter().

# adding jittering to outliers is a bit of work
outliers <- 
  mpg %>%
  group_by(drv) %>%
  filter(cty > quantile(cty, 0.75) + 1.5 * IQR(cty) | 
           cty < quantile(cty, 0.25) - 1.5 * IQR(cty))

ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(height = 0, width = 0.1, data = outliers)

I understand that the position argument in geom_boxplot() is already "occupied", so the simplest solution would probably be to add a new argument, outlier.jitter = c(0, 0) (for jittering in the x and y coordinates, respectively).

An even better solution would of course be to incorporate the beeswarm algorithm from ggbeeswarm:

library(ggbeeswarm)

ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_beeswarm(data = outliers)

Created on 2021-05-17 by the reprex package (v2.0.0)

teunbrand (Collaborator) commented Aug 3, 2023

I don't think that it would make sense to add this as an option to geom_boxplot().
Instead, I think it would make more sense to make a stat_boxplot_outliers() that you can use with geom_point(), geom_jitter() or ggbeeswarm::geom_beeswarm() to have positions to your liking, or hell, use geom_hex() if you wanted to.
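
For illustration, a stat along those lines could be sketched roughly as below. Note that stat_boxplot_outliers() does not exist in ggplot2 at the time of writing; the names StatBoxplotOutliers and stat_boxplot_outliers, and the coef parameter, are hypothetical. The sketch assumes the same 1.5 * IQR rule used in the manual filter above and follows the usual ggproto pattern for custom stats.

```r
library(ggplot2)

# Hypothetical stat: keeps only the rows that a boxplot would draw as outliers.
StatBoxplotOutliers <- ggproto(
  "StatBoxplotOutliers", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales, coef = 1.5) {
    # Quartiles and IQR of this group (one box), as in the manual filter
    qs <- as.numeric(stats::quantile(data$y, c(0.25, 0.75)))
    iqr <- diff(qs)
    is_outlier <- data$y < qs[1] - coef * iqr | data$y > qs[2] + coef * iqr
    data[is_outlier, , drop = FALSE]
  }
)

# Hypothetical layer constructor; the geom and position are left to the user.
stat_boxplot_outliers <- function(mapping = NULL, data = NULL,
                                  geom = "point", position = "identity",
                                  ..., coef = 1.5, na.rm = FALSE,
                                  show.legend = NA, inherit.aes = TRUE) {
  layer(
    stat = StatBoxplotOutliers, data = data, mapping = mapping,
    geom = geom, position = position,
    show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(coef = coef, na.rm = na.rm, ...)
  )
}

# Boxes without their built-in outliers, plus jittered outliers from the stat
p <- ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  stat_boxplot_outliers(position = position_jitter(width = 0.1, height = 0))
p
```

Because the stat only selects rows, swapping the position or passing a different geom (for example geom = ggbeeswarm::GeomBeeswarm, if that package is installed) should change how the outliers are arranged without touching the boxes.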
