This issue was moved to a discussion.
Box-Plot with outlier jitter #3148
Comments
This is something you can accomplish with a little post-hoc artist manipulation:
ax = sns.boxplot(data=tips, y="day", x="total_bill", whis=.2)
for artist in ax.lines:
    if artist.get_linestyle() == "None":
        pos = artist.get_ydata()
        artist.set_ydata(pos + np.random.uniform(-.05, .05, len(pos)))
Personally, I think this looks a little messy, but YMMV. |
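For readers who want to try the recipe without the `tips` dataset, here is a minimal self-contained sketch of the same idea. It assumes that fliers are the only marker-only lines on the axes (line style `"None"`), as in the snippet above. The helper name `jitter_fliers`, its parameters, and the synthetic data are hypothetical, and it draws the box with matplotlib's own `Axes.boxplot` rather than seaborn to keep dependencies small.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def jitter_fliers(ax, width=0.05, axis="x", rng=None):
    """Add uniform jitter to every marker-only line (assumed to be fliers) on `ax`."""
    rng = np.random.default_rng() if rng is None else rng
    jittered = []
    for artist in ax.lines:
        if artist.get_linestyle() != "None":
            continue  # whiskers, caps, boxes, and medians have a real line style
        if axis == "x":
            pos = np.asarray(artist.get_xdata(), dtype=float)
            artist.set_xdata(pos + rng.uniform(-width, width, len(pos)))
        else:
            pos = np.asarray(artist.get_ydata(), dtype=float)
            artist.set_ydata(pos + rng.uniform(-width, width, len(pos)))
        jittered.append(artist)
    return jittered


# Synthetic data: a tight bulk plus a cluster of values far above it.
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(20, 5, 200), rng.uniform(50, 60, 15)])

fig, ax = plt.subplots()
ax.boxplot(values)  # vertical box at x = 1, so the jitter goes on the x axis
moved = jitter_fliers(ax, width=0.05, axis="x", rng=rng)
```

For a horizontal box like the one in the comment above, pass `axis="y"` instead. The outlier values themselves are untouched; only the position along the categorical axis changes.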
Thanks a lot for that approach. I will use this as a workaround. But please take this issue as a feature request for seaborn. Jittered outliers are, IMHO, a common use case; ggplot, for example, can do this with a parameter, without workarounds. |
"ggplot parity" is an explicit non-goal of seaborn, but nevertheless this open feature request suggests that might not be the case? tidyverse/ggplot2#4480
I have no idea what this means. Those points look jittered to me. Please supply a reproducible example if you're going to claim that something doesn't work. |
An MWE is not possible because the feature is not implemented yet. That is why this issue exists. Look at "Vals" around 600 in the figure. There are multiple outliers there, many more than between 400 and 500 or at 700. The outliers there should be jittered a bit more from left to right. I don't mean drawing all data points side by side, as a swarm plot does. A bit of "chaos" is totally OK. But the viewer should get an idea of how "big" the chaos is. Technically explained: Sorry, my English is not good enough to explain it. |
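One possible reading of this request is a jitter whose width grows with the number of outliers near a given value, so that crowded regions spread out further than sparse ones. The sketch below illustrates that reading with NumPy only. It is not an existing seaborn or matplotlib feature, and the function name `density_scaled_jitter` and its parameters are hypothetical.

```python
import numpy as np


def density_scaled_jitter(values, max_width=0.3, bins=20, rng=None):
    """Return offsets whose spread grows with the local count of `values`."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return np.zeros(0)
    counts, edges = np.histogram(values, bins=bins)
    # Bin index of each value; the last edge is inclusive, hence the clip.
    idx = np.clip(np.digitize(values, edges) - 1, 0, bins - 1)
    # The fullest bin gets the full max_width, sparser bins proportionally less.
    half_width = max_width * counts[idx] / counts.max()
    return rng.uniform(-1.0, 1.0, values.size) * half_width


# Synthetic outliers: a sparse spread plus a dense cluster around 600.
rng = np.random.default_rng(0)
outliers = np.concatenate([rng.uniform(400, 700, 30), rng.uniform(595, 605, 60)])
offsets = density_scaled_jitter(outliers, max_width=0.3, rng=rng)
```

The offsets would be added to the box's position on the categorical axis before plotting the outliers, for example with `ax.plot(1 + offsets, outliers, "o")` for a box drawn at position 1.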
??? What is the code that you used to create the image that you claim has a problem? |
My initial MWE including your workaround
Produce this |
The code I shared is a simple recipe. If you need something more complicated, feel free to expand it. |
Here is a hacky way to work with a swarmplot instead of a stripplot for the outliers:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme()
df = pd.DataFrame({'Vals': np.concatenate([np.random.randint(0, 200, size=1000),
                                           np.random.randint(400, 700, size=100),
                                           np.arange(600, 620)])})
df['x'] = np.random.randint(0, 3, len(df))
ax = sns.boxplot(x='x', y='Vals', data=df, orient='v')
# Collect the flier coordinates from the marker-only lines, then remove those lines
xpos = np.array([])
ypos = np.array([])
for line in list(ax.lines):  # iterate over a copy, since lines are removed in the loop
    if line.get_linestyle() == 'None':
        xpos = np.append(xpos, line.get_xdata())
        ypos = np.append(ypos, line.get_ydata())
        line.remove()
# Redraw the collected outliers as a swarm on the same axes
sns.swarmplot(x=xpos, y=ypos, ax=ax, color='red', orient='v')
plt.tight_layout()
plt.show() |
Good thinking, you can actually make this even simpler by just plotting multiple swarms on the fly:
ax = sns.boxplot(x='x', y='Vals', data=df, orient='v', fliersize=0)
for line in ax.lines:
    if line.get_linestyle() == 'None':
        sns.swarmplot(x=line.get_xdata(), y=line.get_ydata()) |
That was exactly what I was looking for. Thanks for the workaround. The question now is if this is a candidate for a new feature? No matter when this will be implemented. |
-1 on this, there's already way too many options stuffed into the boxplot API |
Maybe this shouldn't be in seaborn but in one of the underlying packages? plotly, matplotlib, or whatever does the magic? |
For magic, you're in Seaborn territory. Seaborn extends matplotlib, Plotly is a very different beast. |
Yes, you'd also need a stat transform that filters to/out outliers. (And a swarm mark since that's apparently what's actually desired here, not jitter). Boxplots are annoying in that they're a "standard" plot type but they're actually quite complicated to make and open the door to all sorts of API complexity. |
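To make the "filters to/out outliers" step concrete, here is a minimal sketch of such a filter using the usual 1.5 × IQR fence, which matches the default `whis=1.5` of matplotlib's boxplot. It is a standalone NumPy illustration, not seaborn's implementation, and the name `split_outliers` is hypothetical.

```python
import numpy as np


def split_outliers(values, whis=1.5):
    """Split `values` into (inliers, outliers) using the whis * IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    low, high = q1 - whis * iqr, q3 + whis * iqr
    is_outlier = (values < low) | (values > high)
    return values[~is_outlier], values[is_outlier]


inliers, outliers = split_outliers([1, 2, 3, 4, 5, 100])
```

With a split like this, the box could be drawn from the full data with fliers hidden, and the `outliers` array passed to whatever mark is wanted (strip, swarm, or a custom jitter).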
What you see in that picture is a workaround for what I really would like to have. When searching the web, you often find the combine-boxplot-with-swarmplot solution. It would IMHO improve seaborn if this could be done in seaborn itself, without a workaround.
The problems with that example are
This is an MWE to reproduce that picture.