Releases: mjskay/ggdist
ggdist 3.3.2
Major changes:
- The geom_slabinterval() and geom_dotsinterval() families gain "sub-guides", which can be passed to the subguide parameter to create axis annotations for the thickness aesthetic (for slabs) and the dot count (for dots) (#183).
- The weight aesthetic is now supported in stat_slabinterval(), including weighted calculations for densities, CDFs, all interval types (quantile intervals, highest density intervals, and highest density continuous intervals), and all point summaries (mean, median, and mode) (#41). This includes support for the upcoming weighted random variable type in the posterior package.
- Blurry dotplots are now supported using geom_blur_dots(), which accepts an sd aesthetic to set the standard deviation of the blur on each dot. Intervals can also be used in place of blur by passing blur = "interval". This geom is used by the new stat_mcse_dots() to show quantiles along with their error using blur (#63).
- The new breaks_quantiles() histogram breaks function allows the construction of quantile histograms with density_histogram(), stat_histinterval(), etc.
- The color ramp scales (e.g. scale_colour_ramp_continuous(), ...) now use an explicit data type, partial_colour_ramp(), to encode color ramps and their origin colors, and provide the ramp_colours() function for applying colour ramps. This should make it easier to pass explicit color ramps without using scale functions, and for packages building on {ggdist} to use the colour ramp scales (#209).
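The idea behind a quantile histogram can be sketched in base R without touching ggdist internals. The snippet below is an illustration of the general technique only (bin edges placed at evenly spaced sample quantiles, so every bin holds about the same number of observations); it is not the implementation of breaks_quantiles(), and the variable names and the choice of 10 bins are assumptions made for the example.

```r
# Sketch of a quantile histogram using only base R.
set.seed(1)
x <- rexp(200)   # skewed sample data (illustrative)
k <- 10          # number of bins (illustrative choice)

# Bin edges at evenly spaced sample quantiles: min, 10%, 20%, ..., max.
edges <- unname(quantile(x, probs = seq(0, 1, length.out = k + 1)))

# Unequal-width bins; bar height is a density, so bar area tracks bin count.
h <- hist(x, breaks = edges, include.lowest = TRUE, plot = FALSE)

h$counts   # about the same count in every bin
h$density  # narrow bins (dense regions) get tall bars
```

Because the bins carry similar counts, the shape of the distribution shows up in the bin widths and heights rather than in how many points land in each bin.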
Minor changes:
- The default histogram bin selection algorithm is now "Scott" instead of "Sturges", as "Sturges" tends to be too conservative (#214).
- The at parameter to stat_spike() (or its names) now determines values of an at computed variable, which can be mapped onto aesthetics via after_stat() to more easily label spikes. (#203; thanks @mattansb for the suggestion).
- The arrow parameter is now supported for intervals in geom_slabinterval() (#206; thanks to @ASKurz for the suggestion).
- The default value of overflow in geom_dotsinterval() is now the new "warn" mode, which works the same as "keep" except that it warns users if the dots will overflow the geometry bounds and suggests solutions (#213).
- Optional arguments to automatically partially-applied functions can now be passed a waiver() to use their default value (see auto_partial()).
- Several dependency reductions: removed {cowplot}, {purrr}, {forcats}, {palmerpenguins}, and {modelr} from Suggests; moved {tidyselect} and {dplyr} from Imports to Suggests. The latter two are only strictly necessary for curve_interval() due to its use of grouped data frames and tidy selection to specify which columns are conditional and which are joint (the use of grouped data frames with point_interval() is less strictly necessary, and not used by stats, so is easier to avoid as an absolute dependency).
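To get a feel for the change in the default bin selection rule, the two rules can be compared with the base R helpers grDevices::nclass.Sturges() and grDevices::nclass.scott(). This is a rough sketch on simulated data; it assumes the base R helpers are a fair stand-in for the rules ggdist names "Sturges" and "Scott", and ggdist's own code may differ in detail.

```r
# Compare the number of bins suggested by the Sturges and Scott rules.
set.seed(1)
x <- rnorm(10000)  # illustrative sample

n_sturges <- grDevices::nclass.Sturges(x)  # ceiling(log2(n) + 1)
n_scott   <- grDevices::nclass.scott(x)    # bin width from sd(x) and n^(-1/3)

c(Sturges = n_sturges, Scott = n_scott)
```

For a sample of this size the Sturges rule suggests noticeably fewer bins than the Scott rule, which is the sense in which it is described as conservative.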
Documentation:
- The pkgdown documentation now includes an online article on the thickness aesthetic with comprehensive examples of how slab scaling works (#205).
Bug fixes:
- Ensure Mode() works on analytical constant distributions.
- Various fixes to ensure compatibility with {ggplot2} 3.5.0.
ggdist 3.3.1
New features and enhancements:
- Use derivatives supplied by transformations in scales >= 1.2.2 to make transformations of densities more reliable (r-lib/scales#341).
- New layout = "bar" for geom_dotsinterval() that provides better bar dotplots (with thanks to @Sharoz for feedback; #190).
- Bandwidth estimators (including the default, bandwidth_dpi()) now fall back to bandwidth_nrd0() when they fail, with a warning that suggests trying a dotplot or histogram (as these failures tend to happen on data that is not a good candidate for a density plot in the first place) (#196).
- Much faster (C++) implementation of Wilkinson dotplot binning, especially for large dotplots.
Bug fixes:
- Ensure scale_side_mirrored() supports start = "left" and start = "right".
- Ensure geom_spike() draws the point on the correct end of the line depending on side.
- Future-proof guide_rampbar() for ggplot2 > 3.4.2 (#186). Thanks to @teunbrand.
- Future-proof some minor tests for ggplot2 > 3.4.2 (#187).
- Allow the size aesthetic to be overridden for the geom_dots() legend.
- Ensure hdi() supports constants. (#194)
ggdist 3.3.0
Breaking changes: The following changes, mostly due to new default density
estimators, may cause some plots on sample data to change. Changes should usually
be small, and generally should result in more accurate density estimation. Revert
to the old behavior by setting density = density_unbounded(bandwidth = "nrd0").
- stat_slabinterval() now uses density_bounded() as its default density estimator, which uses a bounded density estimator that also estimates the bounds of the data. The default bandwidth estimator is also now bandwidth_dpi(), which is the Sheather-Jones direct plug-in estimator (the same as stats::bw.SJ(..., method = "dpi")). These changes may cause existing charts using densities to change; usually only slightly. These changes should be worth it, as they should drastically improve the accuracy of density estimates, especially on bounded data, and should have little noticeable impact on densities on unbounded data.
- density_bounded() now estimates bounds from the data when not provided (i.e. when one of bounds is NA). See the bounder_ functions (e.g. bounder_cdf(), bounder_cooke()) for more on bounds estimation.
- Improved Mode() and hdi() estimators based on bounded density estimator.
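The two bandwidth selectors mentioned above have base R counterparts, so the size of the change on a given sample can be checked directly. This is a small sketch on simulated data; the sample and variable names are made up for the example, and it only compares the bandwidths, not the bounded versus unbounded estimators.

```r
# Compare the old and new default bandwidth selectors using base R only.
set.seed(1)
x <- rexp(500)  # sample bounded below at 0 (illustrative)

bw_nrd0 <- stats::bw.nrd0(x)                # rule-of-thumb selector ("nrd0")
bw_dpi  <- stats::bw.SJ(x, method = "dpi")  # Sheather-Jones direct plug-in

c(nrd0 = bw_nrd0, dpi = bw_dpi)
```

If the two values are close for your data, plots are unlikely to change much from the bandwidth alone; any remaining difference would then come from the bounded estimator.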
New features and enhancements:
- Improved hdci() estimator using quantile estimation.
- Histograms are now implemented using density_histogram(), a histogram density estimator. Finer-grained control of bin positions is now possible using the breaks argument (including the new breaks_fixed() for manually-specified bin widths) and the align argument (including the new align_boundary() and align_center() for choosing how to align bin positions to reference points). (#118)
- New geom_spike() and stat_spike() for adding spike annotations to slabs created with geom_slabinterval() or stat_slabinterval(). See example in vignette("slabinterval"). (#58, #124)
- parse_dist() now outputs distributional objects in a .dist_obj column in addition to the name-plus-arguments (.dist + .args) format, and these objects respect truncation parameters from prior specifications. This makes it easier to visualize standard deviation priors, for example, giving a better solution to #20.
- marginalize_lkjcorr() adjusts the .dist_obj column output by parse_dist() in addition to the .dist and .args columns.
- geom_lineribbon() now obeys the order aesthetic, allowing you to arbitrarily set the draw order of ribbons (#171). Enabled by this change, stat_lineribbon() now sets order = after_stat(level) by default, making its draw order more correct by ensuring all ribbons of the same level are drawn together.
- Some improved error messages using cli.
- Very experimental adaptive KDE is available through the adapt parameter; note that it is unsupported and both the implementation and interface are highly likely to change.
Deprecations:
- The slab_type parameter for stat_slabinterval() is now deprecated in favor of mapping the corresponding computed variable (pdf or cdf) onto the desired aesthetic. For slab_type = "histogram", use the pdf computed variable combined with the new density_histogram() density estimator (e.g. set density = "histogram"). (#165)
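As a rough sketch of the migration described above, the two plots below replace slab_type = "cdf" and slab_type = "histogram" respectively. The data frame and object names are made up for the example, and the snippet assumes {ggplot2} and {ggdist} (3.3.0 or later) are installed.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
df <- data.frame(x = rnorm(200))  # illustrative sample data

# In place of slab_type = "cdf": map the cdf computed variable onto thickness.
p_cdf <- ggplot(df, aes(x = x)) +
  stat_slabinterval(aes(thickness = after_stat(cdf)))

# In place of slab_type = "histogram": keep the default pdf mapping and
# switch the density estimator to the histogram estimator.
p_hist <- ggplot(df, aes(x = x)) +
  stat_slabinterval(density = "histogram")
```

Printing p_cdf or p_hist draws the plot; nothing is rendered until then.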
Bug fixes:
- Ensure scale transformations work even when no slab is present; e.g. in stat_interval(). (#168)
- Ensure curve_interval() works with posterior::rvars. (#158)
- geom_lineribbon() draw order is now correct even when some portions of a ribbon have NA widths. (#171)
- Improve the appearance of logical fill conditions at bin edges on histograms. (#175)
ggdist 3.2.1
New features and enhancements:
- Support for non-numeric distributions in stat_slabinterval() and stat_dotsinterval(), including dist_categorical(), dist_bernoulli(), and the upcoming posterior::rvar_factor() type. (#108)
- Various improvements to dotplot layout in geom_dotsinterval():
  - new layout = "hex" allows a hexagonal circle-packing style layout (#161).
  - new mechanism for smoothing dotplots using the smooth parameter, including smooth = "bounded" / smooth = "unbounded" (for "density dotplots") and smooth = "discrete" / smooth = "bar" (for improved layout of large-n discrete distributions). (#161)
  - a better bin/dot-nudging algorithm using constrained optimization (#163)
  - new overlaps = "keep" option disables bin/dot nudging in "bin", "hex", and "weave" layouts. This means layout = "weave" with overlaps = "keep" will yield exact dot positions. (#161)
  - The "weave" layout now works properly with side = "both"
  - fixed binning artifacts when there is high density on the edges, particularly right edges (#144)
  - use a max binwidth of 1 for discrete distributions (#159)
  - new overflow = "compress" allows layouts to be compressed to fit into the geom bounds if a user-specified binwidth would otherwise cause the dots to exceed the geom bounds. (#162)
- Two new shortcut geoms for geom_dotsinterval(): geom_swarm() and geom_weave(). Both can be used to quickly create "beeswarm"-like plots.
- A new "mirrored" scale for the side aesthetic, scale_side_mirrored(), makes it easier to create mirrored slabs and dotplots. (#142)
- Custom density estimators can now be used with stat_slabinterval() via the density argument, including a new bounded density estimator (density_bounded()). (#113)
- Following the split between size and linewidth aesthetics in ggplot2 3.4, the following aesthetics have been updated (#138):
  - interval_size is now linewidth
  - slab_size is now slab_linewidth
  - in geom_slab(), geom_dots(), and geom_lineribbon(), size is now linewidth
- A new experimental mini domain-specific language for probability expressions in ggdist stats: the Pr_() and p_() functions can be used to generate after_stat() expressions in terms of ggdist computed variables; e.g. aes(thickness = !!Pr_(X <= x)) maps the CDF of the distribution onto the thickness aesthetic; aes(thickness = !!p_(x)) maps the PDF onto the thickness aesthetic. (#160)
- Several function families in ggdist now use "currying" (automatic partial function application). These function families partially apply themselves until all non-optional arguments have been supplied: point_interval(), smooth_..., and density_... . See help("automatic-partial-functions").
- Performance improvements for point_interval() on grouped data frames. (#154)
Documentation:
- Uses of stat() have been replaced with after_stat() to be consistent with the deprecation of stat() in ggplot2 3.4.
ggdist 3.2.0
New features and enhancements:
- Several computed variables in stat_slabinterval() can now be shared across sub-geometries:
  - The .width and level computed variables can now be used in slab / dots sub-geometries. These values correspond to the smallest interval computed in the interval sub-geometry containing that portion of the slab. This gives a more flexible alternative to using cut_cdf_qi() to create densities filled according to a set of intervals (this approach also works on highest-density intervals, which cut_cdf_qi() does not). Examples in vignette("slabinterval") have been updated to use the new approach, and an example has been added to vignette("dotsinterval") showing how to color dots by intervals.
  - As an experimental feature (currently a bit fragile) enabled via options(ggdist.experimental.slab_data_in_intervals = TRUE), the pdf and cdf computed variables can now be used in interval sub-geometries to get the PDF and CDF at the point summary. pdf_min, pdf_max, cdf_min, and cdf_max also give the PDF and CDF at the lower and upper ends of the interval. An example in vignette("lineribbon") shows how to use this to make lineribbon gradients whose color approximates density (as opposed to the classic gradient fan chart examples already in that vignette, where color approximates the CDF).
- scale_thickness_shared() is now provided to allow the thickness scale to be shared across geometries, making certain plot types easier to create (e.g. plots of prior and posterior densities together). See vignette("slabinterval") for an example.
- If thickness is less than 0 it is normalized to have a minimum of zero when normalization is turned on; this makes it easier to use slab functions that go below zero. A new example in vignette("slabinterval") shows how to use this to create raindrop plots.
- The stacking order of dots within bins for geom_dotsinterval(layout = "bin") can now be set using the order aesthetic. This makes it possible to create "stacked" dotplots by mapping a discrete variable onto the order aesthetic (#132). As part of this change, bin_dots() now maintains the original data order within bins when layout = "bin". See an example in vignette("dotsinterval").
- A new verbose = TRUE flag in geom_dotsinterval() outputs the selected binwidth in both data units and normalized parent coordinates. This may be useful if you want to start with an automatically-selected bin width and then adjust it manually. Though note: if you just want to scale the selected bin width to fit within a desired area, it is probably better to use scale, and if you want to provide constraints on the bin width, you can pass a 2-vector to binwidth.
- The expand argument in stat_slabinterval() can now take a length-two logical vector to control expansion to the lower and upper limits respectively (#129). Thanks to @teunbrand.
- geom_dotsinterval() now supports the family aesthetic for setting the font used to display its dots (based on a conversation with @gdbassett).
- Experimental guide_rampbar() for creating gradient-like legends for continuous color/fill ramp scales, based on ggplot2::guide_colorbar(). See an example in vignette("lineribbon").
Bug fixes:
- If there are NAs in the thickness aesthetic of a slab, these are now rendered as gaps in the slab (#129).
- Fixed the check for empty x/y scales to avoid extending the scale to cover 0/1 when plotting distributional objects whose bulk lies outside that region (when there is nothing else on the plot).
ggdist 3.1.1
Bug fixes:
- If a string is supplied to the point_interval argument of stat_slabinterval(), a function with that name will be searched for in the calling environment and the ggdist package environment. The latter ensures that stats work when ggdist is loaded but not attached to the search path (#128).
ggdist 3.1.0
New features and enhancements:
- The stat_sample_... and stat_dist_... families of stats have been merged (#83).
  - All stat_dist_... stats are deprecated in favor of their stat_... counterparts, which now understand the dist, args, and arg1...arg9 aesthetics.
  - xdist and ydist can now be used in place of the dist aesthetic to specify the axis one is mapping a distribution onto (dist may be deprecated in the future).
  - Passing dist-like objects to the x or y aesthetics now raises a helpful error message suggesting you probably want to use xdist or ydist.
  - Restructured internals for stats and geoms make it much easier to maintain shortcut geoms and stats, eliminating a large amount of code duplication (#106).
  - New expand parameter to stat_slabinterval() allows explicitly setting whether or not the slab is expanded to the limits of the scale (rather than implicitly setting this based on slab_type).
- The point_interval() family of functions can now be passed distributional and posterior::rvar() objects, meaning that means and modes (in addition to medians) and highest-density intervals (in addition to quantile intervals) can now be visualized for analytical distributions.
  - As part of this, multivariate distribution objects and rvars will generate a .index column when passed to point_interval() functions (#111). Based on a suggestion from @mitchelloharawild.
- New stat_ribbon() provided as a shortcut stat for stat_lineribbon() with no line (#48). Also, if you supply only an x or y aesthetic to geom_lineribbon(), you will get ribbons without a line (#127).
- One-sided intervals (i.e. quantiles) can now be calculated using ul() (upper limit) or ll() (lower limit), e.g. with point_interval() explicitly or via mean_ll(), median_ll(), mode_ll(), mean_ul(), median_ul(), or mode_ul() (#49).
- Constant distributions are now reliably detected in a variety of situations and rendered as point masses in both density plots and histograms (#103, #32).
- Minor improvements and changes to dotplot layouts that may result in minor changes to the appearance of existing dotplots:
  - Minor improvements to automatic bin width selection; the maximum dot stack height should be closer to or equal to scale more often.
  - A formerly-internal fudge factor of 1.07 for dot sizes is now exposed as the default value of the dotsize parameter instead of being applied internally. This fudge factor tends (in my opinion) to make dotplots look a bit better due to the visual distance between circles, but is (I think) better used as an explicit value than an implicit one, hence the change. This may create subtle changes to plots that use the dotsize or stackratio parameters, but allows those parameters to have a more precise geometric interpretation.
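The one-sided interval helpers can be tried on a plain numeric vector. The following is a minimal sketch, assuming {ggdist} 3.1.0 or later is installed; the simulated draws and object names are illustrative, and the exact column layout of the returned data frames is not asserted here.

```r
library(ggdist)

set.seed(1234)
x <- rnorm(1000)  # illustrative draws

# Two-sided 90% quantile interval around the median, for comparison.
two_sided <- median_qi(x, .width = 0.9)

# One-sided 95% summaries: a lower limit and an upper limit around the mean.
lower <- mean_ll(x, .width = 0.95)
upper <- mean_ul(x, .width = 0.95)

# The same lower-limit summary, spelled out with point_interval().
lower2 <- point_interval(x, .width = 0.95, .point = mean, .interval = ll)
```

Each call returns a data frame holding the point summary, the interval bounds, and the .width used, so the results can be passed on to a geom or inspected directly.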
Documentation:
- New vignette for the stat_dotsinterval() sub-family: vignette("dotsinterval") (#120).
- Vastly improved and expanded documentation for the stat_slabinterval() and geom_slabinterval() family: each shortcut stat/geom now has its own documentation page that comprehensively lists all parameters, aesthetics, and computed variables, including those pulled in via ... from typically-paired geoms. These docs are auto-generated and should be easy to maintain going forward. (#36)
- The stat_lineribbon() and geom_lineribbon() family now also has separate documentation pages with a comprehensive listing of aesthetics and parameters (#107).
- Ridge plot-like example in vignette("slabinterval") using the new expand parameter (#115).
Deprecations and removals:
- The .prob argument, a long-deprecated alias for .width, was removed.
- The limits_function, limits_args, slab_function, slab_args, interval_function, and interval_args arguments to stat_slabinterval() were removed: these were largely internal-use parameters only needed by subclasses of the base class for creating shortcut stats, yet added a lot of noise to the documentation, so these were replaced with the $compute_limits(), $compute_slabs(), and $compute_intervals() methods on the new AbstractStatSlabinterval internal base class.
Bug fixes:
- Improved handling of NAs for analytical distributions.
- Fixed bug where within-bin order of dots in dotplots for "bin" and "weave" layouts could be incorrect with aesthetics mapped at a sub-bin level.
- stackratios that are not equal to 1 are now accounted for in find_dotplot_binwidth() (i.e. automatic dotplot bin width selection).
- Ensure distinct fill colors in lineribbons are still treated as distinct for grouping even if the fill_ramp aesthetic ramps them to the same color.
ggdist 3.0.1
ggdist 3.0.0
Breaking changes:
- The positioning of geom_slabinterval() family geoms when using position_dodge() is now slightly different in order to match up with how other geoms are positioned (#85). This may slightly change existing charts that use position = "dodge", and in some cases may cause slabs to be drawn slightly outside plot boundaries, but makes it much easier to combine geom_slabinterval() with other geoms in the expected way. If dodging more similar to the old approach is needed, use the new "justification-preserving dodge", position_dodgejust(), in place of position_dodge().
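A side-by-side comparison of the two dodging behaviors might look like the sketch below. It assumes {ggplot2} and {ggdist} (3.0.0 or later) are installed; the simulated data frame and its column names are invented for the example.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),   # illustrative grouping variable
  cond  = rep(c("x", "y"), times = 200),  # illustrative dodged variable
  value = rnorm(400)
)

# Standard dodge: slabs are positioned the same way as other dodged geoms.
p_dodge <- ggplot(df, aes(x = group, y = value, fill = cond)) +
  stat_slabinterval(position = "dodge")

# Justification-preserving dodge, closer to the pre-3.0.0 layout.
p_dodgejust <- ggplot(df, aes(x = group, y = value, fill = cond)) +
  stat_slabinterval(position = position_dodgejust())
```

Printing the two plots one after the other makes the difference in slab placement easy to see.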
New features:
- For geom_slabinterval(), side, justification, and scale can now be used as aesthetics instead of parameters, allowing them to vary across slabs within the same geom.
- Varying fills within a slab in geom_slabinterval() can now be drawn as true gradients rather than segmented polygons in R >= 4.1 by setting fill_type = "gradient". This substantially improves the appearance of gradient fills in graphics engines that support it (#44).
- Improved support for discrete distributions:
  - stat_dist_slabinterval() and company now detect discrete distributions and display them as histograms (#19).
  - geom_dotsinterval() now adjusts bin widths on discrete distributions when they would result in bins that are taller than the allocated space to ensure that they fit within the required space (#42).
- Allow user-specified lower and/or upper bounds on dynamic geom_dotsinterval() bin width by passing a vector of two values to the binwidth parameter.
- The automatic bin selection algorithm used by geom_dotsinterval() has been factored out and exported as find_dotplot_binwidth() and bin_dots() for others to use (#77).
- Previously, curve_interval() used a common (but naive) approach to finding a cutoff on data depth to identify the X% "deepest" curves, simply taking the envelope around the X% quantile of curves ranked by depth. This is quite conservative and tends to create intervals that are too wide; curve_interval() now searches for a cutoff in data depth such that X% of curves are contained within its envelope (#67).
- point_interval() and company now accept distributional objects and posterior::rvar()s (full support for distributional objects requires distributional > 0.2.2).
- Reduce dependencies substantially, making the geoms more suitable for use by other packages (thanks to Brenton Wiernik for the help).
New documentation:
- Substantial improvements to the documentation of aesthetics and computed variables in geom_slabinterval(), stat_slabinterval(), and company, listing all custom aesthetics, computed variables, and their usage.
- Several new examples in vignette("slabinterval"), including "rain cloud" plots and an example of histograms for discrete analytical distributions.
Bug fixes:
ggdist 2.4.1
New features:
- Added "weave" and "swarm" layouts for dots geoms (#64). These provide alternative layouts that keep datapoints in their actual positions on the data axis. The "weave" layout maintains rows but not columns and works well for quantile dotplots; the "swarm" layout uses the "compactswarm" method from beeswarm::beeswarm() (courtesy James Trimble) and works well on sample data. See the dotplot section of vignette("slabinterval") for comparisons.
- Allow the use of unit() to specify bin widths manually for dots geoms and stats, which can be helpful when you need dotplots across facets to have the same bin width (#53).
New documentation:
- Add example of lineribbon gradients using fill_ramp in vignette("lineribbon").
- Add example of Tukey-like pencils in vignette("slabinterval").
- Add example of two slabs used together (densities and dotplots to make "rain clouds") in vignette("slabinterval").
Bug fixes:
- Fix issues with ggplot2 3.3.4 (#72) and vdiffr 1.0.
- Handle interactions between alpha and fill/color properly when not set by user (#62).
- Use step function for all ECDFs, which should also fix constant CDFs (#55).
- Move fda to suggests as it brings in a large number of dependencies and is rarely used.
- Use trimmed density for mode estimation (#57).