update: #84 merge in get started updates
Merge remote-tracking branch 'origin/150_clean_up_gs' into 84_xportr_deep_dive_vignette
# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
bms63 committed Jun 11, 2023
2 parents 6fffb8f + ba7e540 commit 58dee7b
Showing 2 changed files with 42 additions and 81 deletions.
3 changes: 2 additions & 1 deletion NEWS.md
@@ -13,7 +13,8 @@
## Documentation

* Moved `{pkgdown}` site to bootswatch. Enabled search and linked slack icon (#122).
* Additional vignette showcasing functions and quality of life utilities for processing `xpts` created (#84)
* Additional Deep Dive vignette showcasing functions and quality of life utilities for processing `xpts` created (#84)
* Get Started vignette spruced up. Messages are now displayed and link to Deep Dive vignette (#150)


## Deprecation
120 changes: 40 additions & 80 deletions vignettes/xportr.Rmd
@@ -19,15 +19,6 @@ knitr::opts_chunk$set(
library(DT)
options(cli.num_colors = 1)
options(
xportr.variable_name = "variable",
xportr.label = "label",
xportr.type_name = "type",
xportr.format = "format",
xportr.length = "length",
xportr.order_name = "order"
)
```

```{r, include=FALSE}
@@ -47,7 +38,10 @@ local({

# Getting Started with xportr

The demo will make use of a small `ADSL` data set that is part of the [`{admiral}`](https://pharmaverse.github.io/admiral/index.html) package. The script that generates this `ADSL` dataset can be created by using this command `admiral::use_ad_template("adsl")`.
The demo will make use of a small `ADSL` data set that is part of the [`{admiral}`](https://pharmaverse.github.io/admiral/index.html) package.
The script that generates this `ADSL` dataset can be created by using this command
`admiral::use_ad_template("adsl")`. For a deeper discussion of `{xportr}` be sure
to check out the [Deep Dive](deepdive.html) User Guide.

The `ADSL` has the following features:

@@ -75,114 +69,86 @@ library(dplyr)
library(labelled)
library(xportr)
library(admiral)
library(rlang)
library(readxl)
# Loading in our example data
adsl <- admiral::admiral_adsl
```

<br>


```{r, echo = FALSE}
DT::datatable(adsl, options = list(
autoWidth = FALSE, scrollX = TRUE, pageLength = 5,
lengthMenu = c(5, 10, 15, 20)
))
```
<br>

**NOTE:** Dataset can be created by using this command `admiral::use_ad_template("adsl")`.
**NOTE:** The `ADSL` dataset can be created by using this command `admiral::use_ad_template("adsl")`.

# Preparing your Specification Files

<br>


In order to make use of the functions within `xportr` you will need to create an R data frame that contains your specification file. You will most likely need to do some pre-processing of your spec sheets after loading in the spec files for them to work appropriately with the `xportr` functions. Please see our example spec sheets in `system.file(paste0("specs/", "ADaM_admiral_spec.xlsx"), package = "xportr")` to see how `xportr` expects the specification sheets.

<br>
To use the functions within `{xportr}` you will need to create an R data frame that contains your specification file. You will most likely need to pre-process your spec sheets after loading them so that they work with the `{xportr}` functions. Please see our example spec sheet in `system.file(paste0("specs/", "ADaM_admiral_spec.xlsx"), package = "xportr")` to see the layout that `{xportr}` expects.

```{r}
var_spec <- readxl::read_xlsx(
var_spec <- read_xlsx(
system.file(paste0("specs/", "ADaM_admiral_spec.xlsx"), package = "xportr"),
sheet = "Variables"
) %>%
dplyr::rename(type = "Data Type") %>%
rlang::set_names(tolower)
rename(type = "Data Type") %>%
set_names(tolower)
```
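The same pre-processing can be sketched in base R. This is only an illustration on a toy one-row spec: `raw_spec` and its values are invented here, and the only thing assumed about the example spreadsheet is the `Data Type` column name used in the chunk above.

```r
# Toy stand-in for a spec sheet as read from Excel (values are made up).
raw_spec <- data.frame(
  Dataset = "ADSL",
  Variable = "AGE",
  `Data Type` = "integer",
  check.names = FALSE,      # keep the space in "Data Type"
  stringsAsFactors = FALSE
)

# Rename "Data Type" to "type", then lowercase every column name.
names(raw_spec)[names(raw_spec) == "Data Type"] <- "type"
names(raw_spec) <- tolower(names(raw_spec))

names(raw_spec)
```

Lowercasing the names means the spec columns can be referred to as `variable`, `type`, and so on, regardless of how the spreadsheet capitalises its headers.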

<br>

Below is a quick snapshot of the specification file pertaining to the `ADSL` data set, which we will make use of in the 6 `xportr` function calls below. Take note of the order, label, type, length and format columns.

<br>
Below is a quick snapshot of the specification file pertaining to the `ADSL` data set, which we will make use of in the 6 `{xportr}` function calls below. Take note of the order, label, type, length and format columns.

```{r, echo = FALSE, eval = TRUE}
var_spec_view <- var_spec %>% filter(dataset == "ADSL")
var_spec_view <- var_spec %>%
filter(dataset == "ADSL")
DT::datatable(var_spec_view, options = list(
autoWidth = FALSE, scrollX = TRUE, pageLength = 5,
lengthMenu = c(5, 10, 15, 20)
))
```

<br>

# xportr_type()

<br>

In order to be compliant with transport v5 specifications an `xpt` file can only have two data types: character and numeric/dbl. Currently the `ADSL` data set has chr, dbl, time, factor and date.

```{r, eval = TRUE}
look_for(adsl, details = TRUE)
```{r, max.height='300px', attr.output='.numberLines', echo = FALSE}
str(adsl)
```

<br>

Using `xport_type` and the supplied specification file, we can *coerce* the variables in the `ADSL` set to be either numeric or character.
Using `xportr_type()` and the supplied specification file, we can *coerce* the variables in the `ADSL` set to be either numeric or character.

<br>

```{r, warning=FALSE, message=FALSE, echo = TRUE, results='hide'}
```{r, echo = TRUE}
adsl_type <- xportr_type(adsl, var_spec, domain = "ADSL", verbose = "message")
```

<br>

Now all appropriate types have been applied to the dataset as seen below.
<br>

```{r, eval = TRUE}
look_for(adsl_type, details = TRUE)

```{r, max.height='300px', attr.output='.numberLines', echo = FALSE}
str(adsl_type)
```
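To make the idea of spec-driven coercion concrete, here is a minimal base R sketch. It is not how `xportr_type()` is implemented: `coerce_to_spec()` is a hypothetical helper, and the toy data, the toy spec and its `"character"`/`"numeric"` type values are all assumptions made for the example.

```r
# Toy data and spec; both are invented for illustration.
toy <- data.frame(
  USUBJID = c("01-001", "01-002"),
  AGE = c("34", "51"),
  SEX = factor(c("F", "M")),
  stringsAsFactors = FALSE
)
toy_spec <- data.frame(
  variable = c("USUBJID", "AGE", "SEX"),
  type = c("character", "numeric", "character"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: coerce each column to the type named in the spec.
coerce_to_spec <- function(data, spec) {
  for (i in seq_len(nrow(spec))) {
    var <- spec$variable[i]
    if (!var %in% names(data)) next # skip spec rows with no matching column
    if (spec$type[i] == "numeric") {
      # go through character first so factors convert by label, not by code
      data[[var]] <- as.numeric(as.character(data[[var]]))
    } else {
      data[[var]] <- as.character(data[[var]])
    }
  }
  data
}

toy_typed <- coerce_to_spec(toy, toy_spec)
sapply(toy_typed, class)
```

After the call, every column is either character or numeric, which is the end state the transport v5 restriction described above requires.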

# xportr_length()

<br>

Next we can apply the lengths from a variable level specification file to the data frame. `xportr_length` will identify variables that are missing from your specification file. The function will also alert you to how many lengths have been applied successfully. Before we apply the lengths let's verify that no lengths have been applied to the original data frame.

<br>



```{r, echo = TRUE, eval = FALSE}
str(adsl)
```
Next we can apply the lengths from a variable level specification file to the data frame. `xportr_length()` will identify variables that are missing from your specification file. The function will also alert you to how many lengths have been applied successfully. Before we apply the lengths let's verify that no lengths have been applied to the original data frame.

```{r, max.height='300px', attr.output='.numberLines', echo = FALSE}
str(adsl)
```

<br>

No lengths have been applied to the variables as seen in the printout - the lengths would be in the `attr` part of each variable. Let's now use `xportr_length` to apply our lengths from the specification file.
<br>
No lengths have been applied to the variables as seen in the printout - the lengths would be in the `attr()` part of each variable. Let's now use `xportr_length()` to apply our lengths from the specification file.

```{r}
adsl_length <- adsl %>% xportr_length(var_spec, domain = "ADSL", "message")
```
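A length stored this way is an ordinary R attribute on the column. The base R sketch below shows the general idea on toy data; it is an assumption-laden illustration (invented data, invented spec, a simple loop) rather than the package's own code. The attribute name `width` is the one that shows up in the `str()` printout.

```r
# Toy data and spec, invented for illustration. EXTRA is deliberately
# left out of the spec.
toy <- data.frame(
  USUBJID = c("01-001", "01-002"),
  AGE = c(34, 51),
  EXTRA = c(1, 2),
  stringsAsFactors = FALSE
)
toy_spec <- data.frame(
  variable = c("USUBJID", "AGE"),
  length = c(20, 8),
  stringsAsFactors = FALSE
)

# Attach the spec length to each matching column as a "width" attribute.
for (i in seq_len(nrow(toy_spec))) {
  var <- toy_spec$variable[i]
  if (var %in% names(toy)) {
    attr(toy[[var]], "width") <- toy_spec$length[i]
  }
}

# Report columns that have no entry in the spec.
not_in_spec <- setdiff(names(toy), toy_spec$variable)
if (length(not_in_spec) > 0) {
  message("Variables not in spec: ", paste(not_in_spec, collapse = ", "))
}

str(toy)
```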

<br>

```{r, max.height='300px', attr.output='.numberLines', echo = TRUE}
str(adsl_length)
@@ -192,9 +158,9 @@ Note the additional `attr(*, "width")=` after each variable with the width. The

# xportr_order()

Please note that the order of the `ADSL` variables, see above, does not match specification file order column. We can quickly remedy this with a call to `xportr_order()`. Note that the variable `SITEID` has been moved as well as many others to match the specification file order column.
Please note that the order of the `ADSL` variables, see above, does not match the specification file `order` column. We can quickly remedy this with a call to `xportr_order()`. Note that the variable `SITEID` has been moved as well as many others to match the specification file order column. Variables not in the spec are moved to the end of the data and a message is written to the console.

```{r, warning=FALSE, message=FALSE, echo = TRUE, results='hide'}
```{r, echo = TRUE}
adsl_order <- xportr_order(adsl, var_spec, domain = "ADSL", verbose = "message")
```
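The reordering rule described above (spec order first, anything not in the spec moved to the end) can be sketched in base R as follows. The toy data, the toy spec and the wording of the message are assumptions for the example, not output from `xportr_order()`.

```r
# Toy data with columns out of order, plus one column (EXTRA) not in the spec.
toy <- data.frame(
  SITEID = c("701", "702"),
  EXTRA = c(1, 2),
  STUDYID = c("S1", "S1"),
  USUBJID = c("01-001", "01-002"),
  stringsAsFactors = FALSE
)
toy_spec <- data.frame(
  variable = c("STUDYID", "USUBJID", "SITEID"),
  order = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Spec variables sorted by their order value, restricted to columns present.
spec_vars <- toy_spec$variable[order(toy_spec$order)]
in_spec <- intersect(spec_vars, names(toy))

# Columns absent from the spec keep their relative order and go last.
not_in_spec <- setdiff(names(toy), spec_vars)
if (length(not_in_spec) > 0) {
  message("Moved to end (not in spec): ", paste(not_in_spec, collapse = ", "))
}

toy_ordered <- toy[, c(in_spec, not_in_spec), drop = FALSE]
names(toy_ordered)
```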

@@ -207,7 +173,7 @@ DT::datatable(adsl_order, options = list(

# xportr_format()

Now we apply formats to the dataset. These will typically be `DATE9.`, `DATETIME20` or `TIME5`, but many others can be used. Notice that 8 Date/Time variables are missing a format in our `ADSL` dataset. Here we just take a peek at a few `TRT` variables, which have a `NULL` format.
Now we apply formats to the dataset. These will typically be `DATE9.`, `DATETIME20` or `TIME5`, but many others can be used. Notice that the `ADSL` dataset has 8 Date/Time variables and that they are all missing formats. Here we just take a peek at a few `TRT` variables, which have a `NULL` format.

```{r}
attr(adsl$TRTSDT, "format.sas")
@@ -216,7 +182,7 @@ attr(adsl$TRTSDTM, "format.sas")
attr(adsl$TRTEDTM, "format.sas")
```
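The `format.sas` attribute queried above is a plain R attribute, so a `NULL` result simply means nothing has been attached yet. A small base R sketch on a toy date column (the data are invented; only the attribute name and the `DATE9.` format come from the text above):

```r
# Toy data: a single date column with no format attached.
toy <- data.frame(TRTSDT = as.Date(c("2014-01-02", "2012-08-05")))

# No attribute has been set yet, so this returns NULL.
attr(toy$TRTSDT, "format.sas")

# Attach a SAS format name by hand and read it back.
attr(toy$TRTSDT, "format.sas") <- "DATE9."
attr(toy$TRTSDT, "format.sas")
```

Setting the attribute does not change the stored dates; it only records which format is associated with the column.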

Using our `xportr_format()` we apply our formats.
Using our `xportr_format()` we can apply our formats to the dataset.

```{r}
adsl_fmt <- adsl %>% xportr_format(var_spec, domain = "ADSL", "message")
@@ -231,35 +197,27 @@ attr(adsl_fmt$TRTEDTM, "format.sas")

# xportr_label()

<br>

Please observe that our `ADSL` dataset is missing many variable labels. Sometimes these labels can be lost while using R's function. However, A CDISC compliant data set needs to have each variable with a variable label.
Please observe that our `ADSL` dataset is missing many variable labels. Sometimes these labels can be lost when using R functions. However, a CDISC-compliant data set needs to have a label on each variable.

```{r, eval = TRUE}
look_for(adsl, details = FALSE)
```{r, max.height='300px', attr.output='.numberLines', echo = TRUE}
str(adsl)
```

<br>

Using the `xportr_label()` function we can take the specification file and label all the variables available. `xportr_label()` will produce a warning message if a variable in the data set is not in the specification file.

<br>

```{r}
adsl_update <- adsl %>% xportr_label(var_spec, domain = "ADSL", "message")
adsl_lbl <- adsl %>% xportr_label(var_spec, domain = "ADSL", "message")
```

```{r}
look_for(adsl_update, details = FALSE)
```{r, max.height='300px', attr.output='.numberLines', echo = TRUE}
str(adsl_lbl)
```
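Variable labels are likewise stored as a `label` attribute on each column. The sketch below shows the labelling step in base R on toy data, including a warning for a column that has no spec entry. Everything in it (data, spec, warning text) is an assumption for illustration and does not reproduce the messages `xportr_label()` emits.

```r
# Toy data and spec, invented for illustration. EXTRA has no spec entry.
toy <- data.frame(
  USUBJID = c("01-001", "01-002"),
  AGE = c(34, 51),
  EXTRA = c(1, 2),
  stringsAsFactors = FALSE
)
toy_spec <- data.frame(
  variable = c("USUBJID", "AGE"),
  label = c("Unique Subject Identifier", "Age"),
  stringsAsFactors = FALSE
)

# Attach each spec label to the matching column.
for (i in seq_len(nrow(toy_spec))) {
  var <- toy_spec$variable[i]
  if (var %in% names(toy)) {
    attr(toy[[var]], "label") <- toy_spec$label[i]
  }
}

# Collect columns that could not be labelled because the spec lacks them.
unlabelled <- setdiff(names(toy), toy_spec$variable)
if (length(unlabelled) > 0) {
  warning("No label in spec for: ", paste(unlabelled, collapse = ", "))
}

attr(toy$AGE, "label")
```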

# xportr_write()

<br>
Finally, we arrive at exporting the R data frame object as an `xpt` file with `xportr_write()`. The `xpt` file will be written directly to your current working directory. To make it more interesting, we have put together all six functions with the magrittr pipe, `%>%`. A user can now apply types, lengths, variable labels, formats and a data set label, and write out their final `xpt` file in one pipe! Appropriate warnings and messages will be printed to the console for any potential issues before the file is sent to a standard clinical data set validator application or to data reviewers.

Finally, we arrive at exporting the R data frame object as a xpt file with the function `xportr_write()`. The xpt file will be written directly to your current working directory. To make it more interesting, we have put together all six functions with the magrittr pipe, `%>%`. A user can now apply types, length, variable labels, formats, data set label and write out their final xpt file in one pipe! Appropriate warnings and messages will be supplied to a user to the console for any potential issues before sending off to standard clinical data set validator application or data reviewers.

```{r, eval=FALSE}
```{r}
adsl %>%
xportr_type(var_spec, "ADSL", "message") %>%
xportr_length(var_spec, "ADSL", "message") %>%
@@ -269,7 +227,9 @@ adsl %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
```

That's it! We now have a xpt file created in R with all appropriate types, lengths, labels, ordering and formats from our specification file.
That's it! We now have an `xpt` file created in R with all appropriate types, lengths, labels, ordering and formats from our specification file. If you are interested in exploring more of the custom
warnings and error messages as well as more background on `xpt` generation be sure
to check out the [Deep Dive](deepdive.html) User Guide.

As always, we welcome your feedback. If you spot a bug, would like to
see a new feature, or if any documentation is unclear - submit an issue