Digory and his Uncle Are Both in Trouble
New Features
- Added `all_miss()` / `all_na()`, equivalent to `all(is.na(x))`
- Added `any_complete()`, equivalent to `all(complete.cases(x))`
- Added `any_miss()`, equivalent to `anyNA(x)`
- Added `common_na_numbers` and finalised `common_na_strings` to provide a list of commonly used NA values #168
- Added `miss_var_which`, which lists the variable names with missings
- Added `as_shadow_upset`, which gets the data into a format suitable for plotting as an UpSetR plot: `airquality %>% as_shadow_upset() %>% UpSetR::upset()`
- Added some imputation functions to assist with exploring missingness structure and visualisation:
  - `impute_below` performs as for `shadow_shift`, but on all columns. This means that it imputes missing values 10% below the range of the data (powered by `shadow_shift`), to facilitate graphical exploration of the data. Closes #145. There are also scoped variants that work for specific named columns, `impute_below_at`, and for columns that satisfy some predicate function, `impute_below_if`.
  - `impute_mean` imputes the mean value, with scoped variants `impute_mean_at` and `impute_mean_if`.
- `impute_below` and `shadow_shift` gain arguments `prop_below` and `jitter` to control the degree of shift, and also the extent of jitter.
- Added `complete_{case/var}_{pct/prop}`, which complement `miss_{var/case}_{pct/prop}` #150
- Added `unbind_shadow` and `unbind_data` as helpers to remove shadow columns from data, and data from shadows, respectively.
- Added `is_shadow` and `are_shadow` to determine if something contains a shadow column. Similar to `rlang::is_na` and `rlang::are_na`, `is_shadow` returns a logical vector of length 1, and `are_shadow` returns a logical vector whose length is the number of names of a data.frame. This might be revisited at a later point (see `any_shade` in `add_label_shadow`).
- Aesthetics now map as expected in `geom_miss_point()`. This means you can write things like `geom_miss_point(aes(colour = Month))` and it works appropriately. Fixed by Luke Smith in Pull request #144, fixing #137.
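The equivalences and imputation behaviour described above can be sketched in base R. This is only an illustration, not the package's implementation: `shift_below` is a hypothetical helper, and the formula it uses is an assumed reading of "10% below the range" with no jitter applied.

```r
# Base R only; `shift_below` is a hypothetical helper, not a naniar function.
x <- c(2, NA, 10, 6, NA)

# The base R expressions quoted in the entries above
all_missing  <- all(is.na(x))            # is every value NA?
any_missing  <- anyNA(x)                 # is at least one value NA?
all_complete <- all(complete.cases(x))   # is every value observed?

# Assumed reading of "10% below the range": place missing values at
# min(x) - prop_below * (max(x) - min(x)), without any jitter.
shift_below <- function(x, prop_below = 0.1) {
  rng <- range(x, na.rm = TRUE)
  shifted <- rng[1] - prop_below * diff(rng)
  ifelse(is.na(x), shifted, x)
}

x_below <- shift_below(x)

# Mean imputation: replace each NA with the mean of the observed values
x_mean <- ifelse(is.na(x), mean(x, na.rm = TRUE), x)
```

Placing the imputed values below the observed minimum keeps them visually separate from real data in a scatterplot, which is the graphical exploration use case the entry mentions.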
Minor Changes
- `miss_var_summary` and `miss_case_summary` now use `order = TRUE` by default, so cases and variables with the most missings are presented in descending order. Fixes #163
- Changes for Visualisation:
  - Changed the default colours used in `gg_miss_case` and `gg_miss_var` to lorikeet purple (from ochRe package: https://github.com/ropenscilabs/ochRe)
  - `gg_miss_case`
    - The y axis label is now ...
    - Default presentation is with `order_cases = TRUE`.
    - Gains a `show_pct` option to be consistent with `gg_miss_var` #153
  - `gg_miss_which` is rotated 90 degrees so it is easier to read variable names
  - `gg_miss_fct` uses a minimal theme and tilts the axis labels #118.
- Imported `is_na` and `are_na` from `rlang`.
- Added `common_na_strings`, a list of common `NA` values #168.
- Added some detail on alternative methods for replacing with NA in the vignette "replacing values with NA".
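As a rough base R illustration of replacing placeholder values with NA, the sketch below uses a made-up set of strings and numbers. `na_like` is a hypothetical vector chosen for this example; it is not the contents of `common_na_strings` or `common_na_numbers`.

```r
# `na_like` is a hypothetical set of placeholder strings chosen for this
# example; it is not the contents of naniar's common_na_strings.
na_like <- c("NA", "N/A", "na", "missing", "")

responses <- c("yes", "N/A", "no", "missing", "yes", "")

# Replace any placeholder string with a real NA
responses_clean <- replace(responses, responses %in% na_like, NA)

# Placeholder numbers can be handled the same way
ages <- c(34, -99, 51, -9, 27)
ages_clean <- replace(ages, ages %in% c(-9, -99), NA)
```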