Is there a way to shade or recode_shadow on the whole df? #249
Comments
Not currently, but that is a great suggestion!
Thank you for {naniar}, and please excuse me for bumping this feature request! A common use case I can see for {naniar} is metabolism panel data where, in wide form, each column is a metabolite, metal, or chemical. These values have particular types of missings called "limit of detection (LOD)" or "limit of quantitation (LOQ)". It would be great if we could handle these across the whole data frame. For example:
n_people <- 5
n_chemicals <- 5
prob_missing <- 0.5
chemical_names <- paste0("chemical_", seq_len(n_chemicals))
lod_fns <- function(n) {
  flip_coins <- runif(n)
  value <- round(rnorm(n), 3)
  lod_loq <- sample(c("NA_LOD", "NA_LOQ"), size = n, replace = TRUE)
  ifelse(flip_coins <= prob_missing, lod_loq, as.character(value))
}
panel_long <- data.frame(
  id = rep(seq_len(n_people), n_chemicals),
  chemicals = rep(chemical_names, each = n_people),
  value = lod_fns(n_people * n_chemicals)
)
panel_wide <- panel_long |>
  tidyr::pivot_wider(id_cols = id,
                     names_from = "chemicals",
                     values_from = "value")
panel_wide
#      id chemical_1 chemical_2 chemical_3 chemical_4 chemical_5
#   <int> <chr>      <chr>      <chr>      <chr>      <chr>
# 1     1 NA_LOD     NA_LOQ     NA_LOQ     NA_LOQ     NA_LOQ
# 2     2 NA_LOQ     NA_LOQ     NA_LOQ     NA_LOQ     NA_LOD
# 3     3 -0.843     NA_LOQ     -0.275     NA_LOQ     -0.767
# 4     4 1          NA_LOD     0.244      0.788      -0.532
# 5     5 -1.823     NA_LOD     0.313      1.426      0.196
For this special type of data, the chemicals are usually all the same type of data, so I can see two solutions.
I am not too familiar with the {naniar} source code, but I would love to take a crack at this. I would appreciate some pointers on where I should start reading. Cheers! EDIT: typos
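To make the LOD/LOQ use case concrete, here is a minimal base-R sketch of what splitting such a panel into numeric values plus shadow-style columns could look like. It does not use any {naniar} internals: `split_special_missings()` is a hypothetical helper name, and the `_NA` suffix and the `!NA` / `NA` level names only imitate the shadow convention as an assumption. The toy data is hand-written so the result is deterministic.

```r
# Hypothetical helper, not part of {naniar}: split character columns that mix
# numbers with "NA_LOD" / "NA_LOQ" codes into numeric values plus shadow columns.
split_special_missings <- function(df, cols, codes = c("NA_LOD", "NA_LOQ")) {
  for (col in cols) {
    raw <- as.character(df[[col]])
    is_code <- raw %in% codes
    # Shadow column: "!NA" for observed values, otherwise the code itself
    state <- ifelse(is_code, raw, ifelse(is.na(raw), "NA", "!NA"))
    df[[paste0(col, "_NA")]] <- factor(state, levels = c("!NA", "NA", codes))
    # Value column: codes become NA, everything else is parsed as a number
    raw[is_code] <- NA_character_
    df[[col]] <- as.numeric(raw)
  }
  df
}

toy <- data.frame(
  id = 1:3,
  chemical_1 = c("NA_LOD", "-0.843", "1"),
  chemical_2 = c("NA_LOQ", "NA_LOQ", "0.244")
)
result <- split_special_missings(toy, cols = c("chemical_1", "chemical_2"))
result
```

Because every chemical column shares the same set of codes, one call covers the whole panel; the `cols` argument leaves identifier columns such as `id` untouched.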
Is there any way to do a shade() or a recode_shadow() on the entire df to handle special missings like -99 for every column? Both seem to only operate on vectors currently.
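Until something like this exists, one possible workaround is to build the shadow columns by hand. The following is a minimal base-R sketch under stated assumptions: `shade_all()` is a hypothetical helper, not a {naniar} function, and the `_NA` column suffix and `!NA` / `NA` / `NA_<label>` levels are only meant to resemble the shadow convention.

```r
# Hypothetical helper, not part of {naniar}: build a shadow-style factor for
# every column of a data frame, flagging a special value (e.g. -99) as its
# own kind of missing.
shade_all <- function(df, special = -99, label = "special") {
  special_level <- paste0("NA_", label)
  shadow <- lapply(df, function(x) {
    state <- rep("!NA", length(x))
    state[is.na(x)] <- "NA"
    # which() drops the NAs produced by comparing NA with `special`
    state[which(x == special)] <- special_level
    factor(state, levels = c("!NA", "NA", special_level))
  })
  names(shadow) <- paste0(names(df), "_NA")
  data.frame(shadow, check.names = FALSE)
}

df <- data.frame(x = c(1, -99, NA), y = c(-99, 2, 3))
shadow <- shade_all(df)
cbind(df, shadow)
```

The original values stay untouched here; the shadow columns only record which cells hold the special code, so replacing -99 with NA in the data itself would be a separate step.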