Merge branch 'devel' into 7-function-documentation
elimillera committed Jun 15, 2023
2 parents 5630317 + 4692171 commit 191c885
Showing 33 changed files with 950 additions and 160 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -22,3 +22,5 @@
^advs\.xpt$
^advs_Define-Excel-Spec_match_admiral\.xlsx
^cran-comments\.md$
^example_data_specs$

2 changes: 2 additions & 0 deletions DESCRIPTION
@@ -81,3 +81,5 @@ Suggests:
metacore
Config/testthat/edition: 3
VignetteBuilder: knitr
Depends:
R (>= 3.5)
15 changes: 10 additions & 5 deletions NEWS.md
@@ -2,21 +2,26 @@

## New Features and Bug Fixes

* Fixed an issue where `xportr_type` would overwrite column labels, widths, and "sas.formats"
* Fixed messaging of `xportr_order`to give better visability of the number of variables being reordered.
* Add new argument to `xportr_write` to allow users to specify how xpt validation checks are handled.
* Fixed an issue where `xportr_type()` would overwrite column labels, widths, and "sas.formats"
* Fixed messaging of `xportr_order()` to give better visibility of the number of variables being reordered.
* Add new argument to `xportr_write()` to allow users to specify how xpt validation checks are handled.
* Fixed bug where character_types were case sensitive. They are now case insensitive.
* Updated `xportr_type` to make type coercion more explicit.
* Updated `xportr_type()` to make type coercion more explicit.
* `xpt_validate()` updated to accept ISO 8601 date formats. (#76)
* Added function `xportr_metadata()` to explicitly set metadata at the start of a pipeline (#44)
* Metadata order columns are now coerced to numeric by default in `xportr_order()` to prevent character sorting (#149)
* Message is shown on `xportr_*` functions when the metadata being used has multiple variables with the same name in the same domain (#128)
* Fixed an issue with `xportr_type()` where `DT`, `DTM` variables with a format specified in the metadata (e.g. `date9.`, `datetime20.`) were being converted to numeric, which causes a 10-year difference when the file is read back with `read_xpt()`. SAS counts dates from 1960-01-01, whereas R (like Unix) counts from 1970-01-01.
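
The 10-year difference mentioned in the last item comes from the two date origins. The following is a minimal base R sketch of that arithmetic only; it does not call any xportr function, and the variable names and the example date are illustrative assumptions.

```r
# Base R only; nothing here is part of the xportr API.
sas_epoch  <- as.Date("1960-01-01")  # day 0 for SAS date values
unix_epoch <- as.Date("1970-01-01")  # day 0 for R's Date class

# Days between the two origins: ten years, three of them leap years.
offset_days <- as.numeric(unix_epoch - sas_epoch)

# An illustrative date, stored the R way (days since 1970-01-01).
trtsdt  <- as.Date("2014-01-02")
r_value <- as.numeric(trtsdt)

# Interpreting that raw count against the SAS origin lands ten years early.
misread <- as.Date(r_value, origin = sas_epoch)

# The same calendar date expressed as a SAS date value.
sas_value <- r_value + offset_days

print(offset_days)
print(misread)
```

Keeping such variables as dates, rather than stripping them to bare numbers, avoids a plain day count being reinterpreted against the wrong origin.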

## Documentation

* Moved `{pkgdown}` site to bootswatch. Enabled search and linked slack icon (#122).
* Additional Deep Dive vignette showcasing functions and quality of life utilities for processing `xpts` created (#84)
* Get Started vignette spruced up. Messages are now displayed and link to Deep Dive vignette (#150)

## Deprecation and Breaking Changes


* The `metacore` argument has been renamed to `metadata` in the following six xportr functions: `xportr_df_label()`, `xportr_format()`, `xportr_label()`, `xportr_length()`, `xportr_order()`, and `xportr_type()`. Please update your code to use the new `metadata` argument in place of `metacore`.

84 changes: 84 additions & 0 deletions R/data.R
@@ -0,0 +1,84 @@
#' Analysis Dataset Subject Level
#'
#' An example dataset containing subject level data
#'
#' @format ## `adsl`
#' A data frame with 254 rows and 48 columns:
#' \describe{
#' \item{STUDYID}{Study Identifier}
#' \item{USUBJID}{Unique Subject Identifier}
#' \item{SUBJID}{Subject Identifier for the Study}
#' \item{SITEID}{Study Site Identifier}
#' \item{SITEGR1}{Pooled Site Group 1}
#' \item{ARM}{Description of Planned Arm}
#' \item{TRT01P}{Planned Treatment for Period 01}
#' \item{TRT01PN}{Planned Treatment for Period 01 (N)}
#' \item{TRT01A}{Actual Treatment for Period 01}
#' \item{TRT01AN}{Actual Treatment for Period 01 (N)}
#' \item{TRTSDT}{Date of First Exposure to Treatment}
#' \item{TRTEDT}{Date of Last Exposure to Treatment}
#' \item{TRTDUR}{Duration of Treatment (days)}
#' \item{AVGDD}{Avg Daily Dose (as planned)}
#' \item{CUMDOSE}{Cumulative Dose (as planned)}
#' \item{AGE}{Age}
#' \item{AGEGR1}{Pooled Age Group 1}
#' \item{AGEGR1N}{Pooled Age Group 1 (N)}
#' \item{AGEU}{Age Units}
#' \item{RACE}{Race}
#' \item{RACEN}{Race (N)}
#' \item{SEX}{Sex}
#' \item{ETHNIC}{Ethnicity}
#' \item{SAFFL}{Safety Population Flag}
#' \item{ITTFL}{Intent-To-Treat Population Flag}
#' \item{EFFFL}{Efficacy Population Flag}
#' \item{COMP8FL}{Completers of Week 8 Population Flag}
#' \item{COMP16FL}{Completers of Week 16 Population Flag}
#' \item{COMP24FL}{Completers of Week 24 Population Flag}
#' \item{DISCONFL}{Did the Subject Discontinue the Study}
#' \item{DSRAEFL}{Discontinued due to AE}
#' \item{DTHFL}{Subject Died}
#' \item{BMIBL}{Baseline BMI (kg/m^2)}
#' \item{BMIBLGR1}{Pooled Baseline BMI Group 1}
#' \item{HEIGHTBL}{Baseline Height (cm)}
#' \item{WEIGHTBL}{Baseline Weight (kg)}
#' \item{EDUCLVL}{Years of Education}
#' \item{DISONSDT}{Date of Onset of Disease}
#' \item{DURDIS}{Duration of Disease (Months)}
#' \item{DURDSGR1}{Pooled Disease Duration Group 1}
#' \item{VISIT1DT}{Date of Visit 1}
#' \item{RFSTDTC}{Subject Reference Start Date/Time}
#' \item{RFENDTC}{Subject Reference End Date/Time}
#' \item{VISNUMEN}{End of Trt Visit (Vis 12 or Early Term.)}
#' \item{RFENDT}{Date of Discontinuation/Completion}
#' \item{DCDECOD}{Standardized Disposition Term}
#' \item{DCREASCD}{Reason for Discontinuation}
#' \item{MMSETOT}{MMSE Total}
#' }
"adsl"

#' Example Dataset Specification
#'
#' @format ## `var_spec`
#' A data frame with 216 rows and 19 columns:
#' \describe{
#' \item{Order}{Order of variable}
#' \item{Dataset}{Dataset}
#' \item{Variable}{Variable}
#' \item{Label}{Variable Label}
#' \item{Data Type}{Data Type}
#' \item{Length}{Variable Length}
#' \item{Significant Digits}{Significant Digits}
#' \item{Format}{Variable Format}
#' \item{Mandatory}{Mandatory Variable Flag}
#' \item{Assigned Value}{Variable Assigned Value}
#' \item{Codelist}{Variable Codelist}
#' \item{Common}{Common Variable Flag}
#' \item{Origin}{Variable Origin}
#' \item{Pages}{Pages}
#' \item{Method}{Variable Method}
#' \item{Predecessor}{Variable Predecessor}
#' \item{Role}{Variable Role}
#' \item{Comment}{Comment}
#' \item{Developer Notes}{Developer Notes}
#' }
"var_spec"
2 changes: 1 addition & 1 deletion R/messages.R
@@ -91,7 +91,7 @@ type_log <- function(meta_ordered, type_mismatch_ind, verbose) {

#' Utility for Lengths
#'
#' @param miss_vars Variables missing from metatdata
#' @param miss_vars Variables missing from metadata
#' @param verbose Provides additional messaging for user
#'
#' @return Output to Console
1 change: 1 addition & 0 deletions R/metadata.R
@@ -16,6 +16,7 @@
#' dataset = "test",
#' variable = c("Subj", "Param", "Val", "NotUsed"),
#' type = c("numeric", "character", "numeric", "character"),
#' format = NA,
#' order = c(1, 3, 4, 2)
#' )
#'
19 changes: 12 additions & 7 deletions R/type.R
@@ -54,7 +54,8 @@
#' metadata <- data.frame(
#' dataset = "test",
#' variable = c("Subj", "Param", "Val", "NotUsed"),
#' type = c("numeric", "character", "numeric", "character")
#' type = c("numeric", "character", "numeric", "character"),
#' format = NA
#' )
#'
#' .df <- data.frame(
@@ -84,6 +85,7 @@ xportr_type <- function(.df,
type_name <- getOption("xportr.type_name")
characterTypes <- c(getOption("xportr.character_types"), "_character")
numericTypes <- c(getOption("xportr.numeric_types"), "_numeric")
format_name <- getOption("xportr.format_name")

## Common section to detect domain from argument or pipes

@@ -106,8 +108,9 @@ xportr_type <- function(.df,
metadata <- metadata %>%
filter(!!sym(domain_name) == domain)
}
metadata <- metadata %>%
select(!!sym(variable_name), !!sym(type_name))

metacore <- metadata %>%
select(!!sym(variable_name), !!sym(type_name), !!sym(format_name))

# Common check for multiple variables name
check_multiple_var_specs(metadata, variable_name)
@@ -125,9 +128,13 @@ xportr_type <- function(.df,
# _character is used here as a mask of character, in case someone doesn't
# want 'character' coerced to character
type.x = if_else(type.x %in% characterTypes, "_character", type.x),
type.x = if_else(type.x %in% numericTypes, "_numeric", type.x),
type.x = if_else(type.x %in% numericTypes | (grepl("DT$|DTM$|TM$", variable) & !is.na(format)),
"_numeric",
type.x
),
type.y = if_else(is.na(type.y), type.x, type.y),
type.y = tolower(type.y),
type.y = if_else(type.y %in% characterTypes, "_character", type.y),
type.y = if_else(type.y %in% characterTypes | (grepl("DTC$", variable) & is.na(format)), "_character", type.y),
type.y = if_else(type.y %in% numericTypes, "_numeric", type.y)
)

@@ -138,7 +145,6 @@ xportr_type <- function(.df,
type_mismatch_ind <- which(meta_ordered$type.x != meta_ordered$type.y)
type_log(meta_ordered, type_mismatch_ind, verbose)


# Check if variable types match
is_correct <- sapply(meta_ordered[["type.x"]] == meta_ordered[["type.y"]], isTRUE)
# Use the original variable iff metadata is missing that variable
@@ -161,6 +167,5 @@ xportr_type <- function(.df,
}
}, is_correct
)

.df
}
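
The masking rule changed in this hunk can be sketched outside the package as follows. This is a simplified base R illustration, not the package's implementation: `mask_types()` is a hypothetical helper, and the default `character_types` and `numeric_types` vectors are assumed stand-ins for the `xportr.character_types` and `xportr.numeric_types` options.

```r
# Hypothetical helper mirroring the if_else() chain in the hunk above.
# The default type vectors are assumptions, not the package's option values.
mask_types <- function(variable, data_type, meta_type, format,
                       character_types = c("character", "char", "text"),
                       numeric_types = c("integer", "numeric", "num", "float")) {
  character_types <- c(character_types, "_character")
  numeric_types <- c(numeric_types, "_numeric")

  # Date/time variables that carry a format count as numeric already.
  is_dt_with_format <- grepl("DT$|DTM$|TM$", variable) & !is.na(format)
  # ISO 8601 text variables (--DTC) without a format count as character.
  is_dtc_without_format <- grepl("DTC$", variable) & is.na(format)

  # Mask for the type currently found in the data (type.x in the diff).
  data_mask <- ifelse(data_type %in% character_types, "_character", data_type)
  data_mask <- ifelse(data_mask %in% numeric_types | is_dt_with_format,
                      "_numeric", data_mask)

  # Mask for the type requested by the metadata (type.y in the diff).
  meta_mask <- ifelse(is.na(meta_type), data_mask, tolower(meta_type))
  meta_mask <- ifelse(meta_mask %in% character_types | is_dtc_without_format,
                      "_character", meta_mask)
  meta_mask <- ifelse(meta_mask %in% numeric_types, "_numeric", meta_mask)

  data.frame(variable, data_mask, meta_mask, stringsAsFactors = FALSE)
}

res <- mask_types(
  variable  = c("TRTSDT", "TRTEDT", "RFENDTC", "AGE"),
  data_type = c("Date", "Date", "character", "character"),
  meta_type = c("integer", "integer", "datetime", "integer"),
  format    = c("date9.", NA, NA, NA)
)

# A variable would be coerced only where the two masks disagree.
needs_coercion <- res$data_mask != res$meta_mask
print(cbind(res, needs_coercion))
```

In this sketch `TRTSDT` has a format and is left alone, while `TRTEDT` has none and is still flagged for coercion. `RFENDTC` is treated as character even though its metadata type is not in the assumed character list.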
3 changes: 2 additions & 1 deletion R/xportr-package.R
@@ -116,7 +116,8 @@
globalVariables(c(
"abbr_parsed", "abbr_stem", "adj_orig", "adj_parsed", "col_pos", "dict_varname",
"lower_original_varname", "my_minlength", "num_st_ind", "original_varname",
"renamed_n", "renamed_var", "use_bundle", "viable_start", "type.x", "type.y"
"renamed_n", "renamed_var", "use_bundle", "viable_start", "type.x", "type.y",
"variable"
))

# The following block is used by usethis to automatically manage
18 changes: 15 additions & 3 deletions README.Rmd
@@ -82,7 +82,7 @@ data sets (≤ 200)
- Coerces variables to only numeric or character types
- Display format support for numeric float and date/time values
- Variables names are ≤ 8 characters.
- Variable labels are ≤ 200 characters.
- Variable labels are ≤ 40 characters.
- Data set labels are ≤ 40 characters.
- Presence of non-ASCII characters in Variable Names, Labels or data set labels.

@@ -103,7 +103,7 @@ To do this we will need to do the following:
- Apply a dataset label
- Write out a version 5 xpt file

All of which can be done using a well-defined specification file and the `xportr` package!
All of which can be done using a well-defined specification file and the `{xportr}` package!

First we will start with our `ADSL` dataset created in R. This example `ADSL` dataset is taken from the [`{admiral}`](https://pharmaverse.github.io/admiral/index.html) package. The script that generates this `ADSL` dataset can be created by using this command `admiral::use_ad_template("adsl")`. This `ADSL` dataset has 306 observations and 48 variables.

@@ -125,7 +125,19 @@ var_spec <- readxl::read_xlsx(spec_path, sheet = "Variables") %>%
rlang::set_names(tolower)
```

Each `xportr_` function has been written in a way to take in a part of the specification file and apply that piece to the dataset.
Each `xportr_` function has been written in a way to take in a part of the specification file and apply that piece to the dataset. Setting `verbose = "warn"` sends the appropriate warning messages to the console. We have suppressed the warnings here for the sake of brevity.

```{r, warning = FALSE, message=FALSE, eval=TRUE}
adsl %>%
  xportr_type(var_spec, "ADSL", verbose = "warn") %>%
  xportr_length(var_spec, "ADSL", verbose = "warn") %>%
  xportr_label(var_spec, "ADSL", verbose = "warn") %>%
  xportr_order(var_spec, "ADSL", verbose = "warn") %>%
  xportr_format(var_spec, "ADSL", verbose = "warn") %>%
  xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
```

The `xportr_metadata()` function can reduce duplication by setting the variable specification and domain explicitly at the top of a pipeline. If you would like to use the `verbose` argument, you will need to set it in each function call.

```{r, message=FALSE, eval=FALSE}
adsl %>%
23 changes: 20 additions & 3 deletions README.md
@@ -76,7 +76,7 @@ to any validators or data reviewers.
- Coerces variables to only numeric or character types
- Display format support for numeric float and date/time values
- Variables names are ≤ 8 characters.
- Variable labels are ≤ 200 characters.
- Variable labels are ≤ 40 characters.
- Data set labels are ≤ 40 characters.
- Presence of non-ASCII characters in Variable Names, Labels or data set
labels.
@@ -99,7 +99,7 @@ To do this we will need to do the following:
- Write out a version 5 xpt file

All of which can be done using a well-defined specification file and the
`xportr` package!
`{xportr}` package!

First we will start with our `ADSL` dataset created in R. This example
`ADSL` dataset is taken from the
@@ -131,7 +131,24 @@ var_spec <- readxl::read_xlsx(spec_path, sheet = "Variables") %>%
```

Each `xportr_` function has been written in a way to take in a part of
the specification file and apply that piece to the dataset.
the specification file and apply that piece to the dataset. Setting
`verbose = "warn"` sends the appropriate warning messages to the console.
We have suppressed the warnings here for the sake of brevity.

``` r
adsl %>%
  xportr_type(var_spec, "ADSL", verbose = "warn") %>%
  xportr_length(var_spec, "ADSL", verbose = "warn") %>%
  xportr_label(var_spec, "ADSL", verbose = "warn") %>%
  xportr_order(var_spec, "ADSL", verbose = "warn") %>%
  xportr_format(var_spec, "ADSL", verbose = "warn") %>%
  xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
```

The `xportr_metadata()` function can reduce duplication by setting the
variable specification and domain explicitly at the top of a pipeline.
If you would like to use the `verbose` argument, you will need to set it in
each function call.

``` r
adsl %>%
74 changes: 38 additions & 36 deletions _pkgdown.yml
@@ -5,7 +5,7 @@ template:
params:
bootswatch: sandstone
search:
exclude: ['news/index.html']
exclude: ["news/index.html"]
news:
cran_dates: true

@@ -18,39 +18,41 @@ navbar:
href: https://pharmaverse.slack.com/archives/C030EB2M4GM
aria-label: slack


reference:
- title: The six core xportr functions
- contents:
- xportr_type
- xportr_length
- xportr_label
- xportr_write
- xportr_format
- xportr_order

- title: xportr helper functions
- contents:
- label_log
- length_log
- type_log
- var_names_log
- var_ord_msg
- xportr_logger
- xportr_df_label
- xportr_metadata

- title: xportr
navbar: ~
contents:
- xportr

- title: internal
contents:
- cli_theme_tests
- expect_attr_width
- minimal_metadata
- minimal_table



- title: The six core xportr functions
- contents:
- xportr_type
- xportr_length
- xportr_label
- xportr_write
- xportr_format
- xportr_order

- title: xportr helper functions
- contents:
- label_log
- length_log
- type_log
- var_names_log
- var_ord_msg
- xportr_logger
- xportr_df_label
- xportr_metadata

- title: xportr example datasets and specification files
- contents:
- adsl
- var_spec

- title: internal
contents:
- cli_theme_tests
- expect_attr_width
- minimal_metadata
- minimal_table

articles:
- title: ~
navbar: ~
contents:
- deepdive
Binary file added data/adsl.rda
Binary file not shown.
Binary file added data/var_spec.rda
Binary file not shown.
Binary file not shown.
Binary file added example_data_specs/TDF_ADaM_Pilot3.xlsx
Binary file not shown.
Binary file added example_data_specs/adadas.xpt
Binary file not shown.
Binary file added example_data_specs/adae.xpt
Binary file not shown.
Binary file added example_data_specs/adlbc.xpt
Binary file not shown.
Binary file added example_data_specs/adtte.xpt
Binary file not shown.
1 change: 1 addition & 0 deletions example_data_specs/readme.md
@@ -0,0 +1 @@
Data taken from Pilot 3 Submission Study: https://github.com/RConsortium/submissions-pilot3-adam