
Call for Tables - Lab Shell #29

Pierre-Wallet opened this issue Dec 6, 2021 · 9 comments

@Pierre-Wallet

Hello all,
sharing here a template for lab output that, in my view, gathers the most common formatting difficulties: one page per parameter, column spanners, and cell merges.

[image: example of the lab shift-table shell]

I hope it helps!
Pierre.

@gmbecker commented Dec 6, 2021

Hi @Pierre-Wallet, this is great. One point of clarification: can you tell us what the various percentages are out of?

For context, in my work on rtables I've seen percentages that are out of the column total, out of the overall total, out of alternative column totals (i.e. column "totals" that do not reflect the data subsets corresponding to the actual columns), out of the row group total, etc.
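
As a rough sketch (made-up counts, nothing from this issue), the different denominators correspond to different margin choices in base R's proportions(); "alternative column totals" need an external denominator:

# Hypothetical 2-way counts array, just to show the different denominators
tbl <- matrix(c(10, 5, 20, 15), nrow = 2,
              dimnames = list(Baseline = c("Grade 0", "Grade 1"), Arm = c("A", "B")))

proportions(tbl)               # out of the overall total (whole table sums to 1)
proportions(tbl, margin = 2)   # out of each column (arm) total
proportions(tbl, margin = 1)   # out of each row (baseline grade) total
# "alternative column totals" (e.g. randomized N rather than observed n)
# need an external denominator:
sweep(tbl, 2, c(A = 50, B = 60), "/")   # hypothetical per-arm N values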

Thanks!

@Pierre-Wallet commented Dec 6, 2021

Hi @gmbecker,
the baseline percentage is based on N.
The percentage for the worst post-baseline value is based on the baseline n.
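
A quick worked sketch with made-up numbers to spell out the two denominators:

# Hypothetical counts, only to illustrate the denominators described above
N          <- 100   # patients in the treatment arm
n_baseline <- 20    # of those, patients with Grade 1 at baseline
n_worst    <- 3     # of those, patients whose worst post-baseline value is Grade 0

sprintf("%d (%.1f)", n_baseline, 100 * n_baseline / N)     # baseline cell: "20 (20.0)", % out of N
sprintf("%d (%.1f)", n_worst, 100 * n_worst / n_baseline)  # worst-grade cell: "3 (15.0)", % out of baseline n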

@elong0527 commented Jan 6, 2022

If the challenge is only about the table format, it would be great if you could share reprex R code that prepares the data so it is ready for output.

If we could all base our examples on the same data (e.g. https://cran.r-project.org/web/packages/safetyData/index.html), that would help us understand the challenge of formatting this table with different R packages.
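
For example, a minimal sketch of a shared starting point; the dataset name adam_adlbc and the ADaM column names used here are assumptions and would need to be checked against the safetyData package index:

# Sketch only: dataset/column names are assumed, not verified against safetyData
# install.packages("safetyData")
library(dplyr)

lab <- safetyData::adam_adlbc           # assumed name of the CDISC pilot ADaM lab data

shift_input <- lab %>%
  filter(PARAMCD == "ALT") %>%          # one lab parameter, chosen arbitrarily
  select(USUBJID, TRTA, PARAM, AVISIT, BASE, AVAL)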

@hughjonesd commented Jan 12, 2022

I took a shot at this with huxtable.

library(huxtable)

set.seed(123)
dat <- data.frame(
  Treatment        = rep(c("A", "B"), 100),
  `Baseline Grade` = sample(c(0:4, NA), 200, replace = TRUE),
  PostGrade        = sample(c(0:4, NA), 200, replace = TRUE),
  check.names      =  FALSE
)

tbl <- xtabs( ~ PostGrade + `Baseline Grade` + Treatment, data = dat, addNA = TRUE)

ptbl <- proportions(tbl)
tbl <- addmargins(tbl, 1:3)
ptbl <- addmargins(ptbl, 1:3)
ptbl <- ptbl*100

joined <- tbl
joined[] <- paste0(tbl, " (", ptbl, ")")

output <- ftable(joined, col.vars = "PostGrade", row.vars = c("Treatment", "Baseline Grade"))

output <- as_hux(output)

# put 'Sum' column early, remove one empty column
output <- output[c(1:2, 10, 4:9)]
# tweak some headers
output[2, 1] <- ""
output[2, 2] <- ""
output[1, 3] <- ""

output |> 
      insert_row("", "Baseline", "", "Worst post-baseline value", fill = "") |> 
      set_colspan(1, 4, 6) |> 
      set_bottom_border(1, 4:9) |> 
      set_colspan(1, 2, 2) |> 
      set_align(1, everywhere, "center") |> 
      set_bold(1, everywhere) |> 
      insert_row("Treatment", "", fill = "n (%)", after = 2) |> 
      set_bottom_border(3, everywhere) |> 
      set_number_format(NA) |>
      map_contents(by_regex("NA" = "Missing")) 

which gives

                                Baseline                          Worst post-baseline value                                
                                              ─────────────────────────────────────────────────────────────────────────
                                                0           1           2           3           4           Missing    
          Treatment                 n (%)       n (%)       n (%)       n (%)       n (%)       n (%)       n (%)      
        ───────────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                                                                       
          A           0             17 (8.5)    3 (1.5)     3 (1.5)     2 (1)       3 (1.5)     4 (2)       2 (1)      
                      1             16 (8)      3 (1.5)     3 (1.5)     5 (2.5)     2 (1)       3 (1.5)     0 (0)      
                      2             14 (7)      1 (0.5)     3 (1.5)     2 (1)       2 (1)       5 (2.5)     1 (0.5)    
                      3             16 (8)      3 (1.5)     0 (0)       3 (1.5)     5 (2.5)     3 (1.5)     2 (1)      
                      4             19 (9.5)    5 (2.5)     3 (1.5)     7 (3.5)     2 (1)       0 (0)       2 (1)      
                      Missing       18 (9)      3 (1.5)     2 (1)       1 (0.5)     6 (3)       4 (2)       2 (1)      
                      Sum           100 (50)    18 (9)      14 (7)      20 (10)     20 (10)     19 (9.5)    9 (4.5)    
          B           0             16 (8)      4 (2)       2 (1)       1 (0.5)     0 (0)       3 (1.5)     6 (3)      
                      1             15 (7.5)    5 (2.5)     3 (1.5)     2 (1)       0 (0)       3 (1.5)     2 (1)      
                      2             18 (9)      3 (1.5)     5 (2.5)     3 (1.5)     1 (0.5)     5 (2.5)     1 (0.5)    
                      3             12 (6)      6 (3)       3 (1.5)     0 (0)       1 (0.5)     1 (0.5)     1 (0.5)    
                      4             15 (7.5)    1 (0.5)     8 (4)       1 (0.5)     2 (1)       1 (0.5)     2 (1)      
                      Missing       24 (12)     3 (1.5)     6 (3)       1 (0.5)     5 (2.5)     6 (3)       3 (1.5)    
                      Sum           100 (50)    22 (11)     27 (13.5)   8 (4)       9 (4.5)     19 (9.5)    15 (7.5)   
          Sum         0             33 (16.5)   7 (3.5)     5 (2.5)     3 (1.5)     3 (1.5)     7 (3.5)     8 (4)      
                      1             31 (15.5)   8 (4)       6 (3)       7 (3.5)     2 (1)       6 (3)       2 (1)      
                      2             32 (16)     4 (2)       8 (4)       5 (2.5)     3 (1.5)     10 (5)      2 (1)      
                      3             28 (14)     9 (4.5)     3 (1.5)     3 (1.5)     6 (3)       4 (2)       3 (1.5)    
                      4             34 (17)     6 (3)       11 (5.5)    8 (4)       4 (2)       1 (0.5)     4 (2)      
                      Missing       42 (21)     6 (3)       8 (4)       2 (1)       11 (5.5)    10 (5)      5 (2.5)    
                      Sum           200 (100)   40 (20)     41 (20.5)   28 (14)     29 (14.5)   38 (19)     24 (12)    

Column names: V1, V2, V10, V4, V5, V6, V7, V8, V9   

Indeed, not easy.

  • Getting proportions and addmargins right was very hard. These two commands seem not to know about each other, and it's easy to end up with a "grand total" of e.g. 600%. A single command to handle both might be useful (see the sketch after this list).
  • The trick with setting the contents of joined[] is quite advanced R and most users shouldn't have to know about it. But without it there's no quick way to get a table of the right structure.
  • And huxtable is finicky... my bad. I wonder if tableone or similar might be useful here.
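
To make the first point concrete, a small sketch with made-up counts showing how the order of proportions() and addmargins() changes the "grand total":

# Hypothetical counts: 3 baseline grades x 2 arms
cnt <- matrix(c(10, 20, 30, 5, 15, 20), nrow = 3,
              dimnames = list(Grade = c("G0", "G1", "G2"), Arm = c("A", "B")))

# Pitfall: take per-column proportions first, then add margins over both
# dimensions -- the added "Sum" cells add up values that already total 100%
# per column, so the corner cell comes out at 200% here (600% with six columns):
addmargins(proportions(cnt, margin = 2) * 100)

# What the code above does instead: compute proportions from the unmargined
# counts, then add margins to the counts and the proportions separately
counts <- addmargins(cnt)
pcts   <- 100 * addmargins(proportions(cnt))   # grand total is 100%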

@elong0527

Here is an example using r2rtf that allows you to display one parameter per page. (I simply copied the same numbers to the different parameters for illustration purposes.)

  • I just use the default border type and font type in r2rtf; they can be changed if necessary (see the sketch after this list).
  • GitHub does not allow me to attach an RTF file, so I converted it to PDF.
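
A rough sketch of the kind of override meant in the first bullet (argument values here are illustrative, not what is used in the full example below):

library(dplyr)
library(r2rtf)

# Illustrative only: override the default border type and font size,
# then continue with rtf_encode() and write_rtf() as in the full example below
data.frame(x = 1:2, y = c("a", "b")) %>%
  rtf_body(
    border_left    = "double",   # border type override
    text_font_size = 8           # font size override
  )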

output file: tmp.pdf

library(dplyr)
library(tidyr)
library(r2rtf)

set.seed(123)
dat <- data.frame(
  Treatment        = rep(c("A", "B"), 100),
  `Baseline Grade` = sample(c(paste("Grade", 0:4), "Missing"), 200, replace = TRUE),
  PostGrade        = sample(c(paste("Grade", 0:4), "Missing"), 200, replace = TRUE),
  check.names      =  FALSE
)

tbl <- xtabs( ~ PostGrade + `Baseline Grade` + Treatment, data = dat, addNA = TRUE)
ptbl <- proportions(tbl)
tbl <- addmargins(tbl, 1:3)
ptbl <- addmargins(ptbl, 1:3)
ptbl <- ptbl*100 


t1 <- as.data.frame(tbl) %>%
  pivot_wider(id_cols = c("Treatment", "Baseline.Grade"), 
              names_from = "PostGrade", 
              values_from = "Freq")

t2 <- as.data.frame(ptbl) %>%
  mutate(Freq = paste0("(", formatC(Freq, digits = 1, width = 5, format = "f"), ")")) %>%
  pivot_wider(id_cols = c("Treatment", "Baseline.Grade"), 
              names_prefix = "pct_", 
              names_from = "PostGrade", 
              values_from = "Freq")

tbl0 <- merge(t1, t2) %>%
  rename(baseline = `Baseline.Grade`) %>%
  select(Treatment, baseline, ends_with("Sum"), ends_with(as.character(c(0:4))), ends_with("Missing"))

# Create multiple parameters
tbl1 <- tbl0   # use the merged data frame, not the xtabs array
tbl1$parameter <- "Parameter A"

tbl2 <- tbl0
tbl2$parameter <- "Parameter B"

tbl <- rbind(tbl1, tbl2)



tbl %>% 
  rtf_page(orientation = "landscape", 
           col_width = 10,
           nrow = 30) %>%
  rtf_title("Shift Table Example") %>%
  rtf_colheader(c("| | Baseline | Worst post-baseline value"), 
                col_rel_width = c(4, 2, 2, 10)) %>%
  rtf_colheader("| Total | Grade 0 | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Missing", 
                border_top = c("", "",rep("single", 6)), 
                col_rel_width = c(4, rep(2, 7))) %>% 
  rtf_colheader("Treatment | Grade | n | (%) | n | (%) | n | (%) | n | (%) | n | (%) | n | (%) | n | (%)",
                col_rel_width = c(2, 2, rep(1, 14)), 
                border_left = c("single","single", rep(c("single", ""), 7))) %>%
  rtf_body(col_rel_width = c(2, 2, rep(1, 14), 1), 
           text_justification = c("l","l", rep("c", 14), "l"), 
           border_left = c("single","single", rep(c("single", ""), 7), "single"), 
           group_by = "Treatment", 
           subline_by = "parameter") %>%
  rtf_encode() %>%
  write_rtf("tmp.rtf")

@ianmoran11 commented Jan 14, 2022

Hi,

I've been keeping an eye on this repo since noticing my package, mmtable2, was mentioned in your call for table examples.

Below is some code outlining how the table above would be constructed with mmtable2.

I'm still working on this package, so any suggestions would be more than welcome!

# Load packages ---------------------------------------------------------------------
library(tidyverse)
library(mmtable2)
set.seed(123)

# Create frequency data frame ----------------------------------------------------
freq_df <-
  crossing(
    treatment = c("A", "B"),
    baseline = c(paste("Grade", 0:4), "Missing"),
    postbaseline = c(paste("Grade", 0:4), "Missing", " ")
  ) %>%
  mutate(n = sample(0:20, nrow(.), replace = T)) %>%
  mutate(`(%)` = 100 * n / sum(n)) %>%
  mutate(prepost = ifelse(postbaseline == " ", "Baseline", "Worst post-baseline value"))

# Calculate totals within treatments --------------------------------------------
total_df <-
  freq_df %>%
  group_by(treatment, prepost, postbaseline) %>%
  summarise(n = sum(n), `(%)` = sum(`(%)`)) %>%
  mutate(baseline = "Total")

# Calculate sums across all subjects --------------------------------------------
sum_df <-
  bind_rows(freq_df, total_df) %>%
  group_by(baseline, postbaseline, prepost) %>%
  summarise(n = sum(n), `(%)` = sum(`(%)`)) %>%
  mutate(treatment = "Sum")

# Prepare data for table --------------------------------------------------------------
final_df <-
  bind_rows(freq_df,total_df,sum_df) %>%
  gather(unit, val, n, `(%)`) %>%
  mutate(unit = fct_relevel(unit, "n", "(%)")) %>%
  mutate(val = ifelse(unit == "(%)", sprintf("(%.1f)", val), val))

# Print dataframe used to construct main table -------------------------------
final_df
## # A tibble: 294 x 6
##    treatment baseline postbaseline prepost                   unit  val  
##    <chr>     <chr>    <chr>        <chr>                     <fct> <chr>
##  1 A         Grade 0  " "          Baseline                  n     14   
##  2 A         Grade 0  "Grade 0"    Worst post-baseline value n     18   
##  3 A         Grade 0  "Grade 1"    Worst post-baseline value n     13   
##  4 A         Grade 0  "Grade 2"    Worst post-baseline value n     2    
##  5 A         Grade 0  "Grade 3"    Worst post-baseline value n     9    
##  6 A         Grade 0  "Grade 4"    Worst post-baseline value n     17   
##  7 A         Grade 0  "Missing"    Worst post-baseline value n     10   
##  8 A         Grade 1  " "          Baseline                  n     4    
##  9 A         Grade 1  "Grade 0"    Worst post-baseline value n     19   
## 10 A         Grade 1  "Grade 1"    Worst post-baseline value n     13   
## # … with 284 more rows

# Create table ----------------------------------------------------------------------------

final_df %>%
  mmtable(cells = val) +
    header_top(unit) +
    header_top_left(postbaseline) +
    header_left(baseline) +
    header_left_top(treatment) +
    header_top_left(prepost) +
    header_merged_cols() +
    header_format(prepost,  list(cell_text(align = "center"))) +
    cells_format(cell_predicate = unit == "(%)", list(cell_text(align = "left")))

[image: resulting mmtable2 table]

@Pierre-Wallet commented Feb 4, 2022

Hi all,

@elong0527 @hughjonesd @gmbecker @ianmoran11 @davidgohel

Sorry for my late answer.
Thanks for the answers you provided; they actually helped me optimize my solution.

In the solutions above, the percentages are not correctly calculated. The baseline percentage (1st column) is out of the number of patients in each treatment arm, not out of the total number of patients.
It was not straightforward to get right, but arrays are quite handy for performing such calculations.

I did not focus on the sum of the 2 treatment arms, as it is optional in the display.

I tried my best to get it with flextable+officer:
reprex_report.docx

Here is my code (I added 2 parameters; the goal is to have one parameter per page):

library(haven)
library(tidyverse)
library(abind)
library(flextable) 
library(officer)
library(magrittr)

set.seed(123)
dat <- data.frame(
  Treatment        = rep(c("A", "B"), 100),
  Parameter        = sample(c("Param1", "Param2"), 200, replace = TRUE),
  `Baseline Grade` = sample(c(paste0("Grade ", 0:4) , NA), 200, replace = TRUE),
  PostGrade        = sample(c(paste0("Grade ", 0:4) , NA), 200, replace = TRUE),
  check.names      =  FALSE
)


dat1 <- dat %>%
  mutate(Baseline = factor(`Baseline Grade`, levels = c("Grade 0", "Grade 1", "Grade 2", "Grade 3", "Grade 4") ),
         Wpbaseline = factor(PostGrade, levels = c("Grade 0", "Grade 1", "Grade 2", "Grade 3", "Grade 4") ),
         TRT01AN = factor(Treatment, levels = c("A","B") ) )

# Calculation part ----

# Store frequencies in an array:
tbl <- xtabs( ~ Baseline + Wpbaseline + TRT01AN + Parameter, data = dat1, addNA = TRUE)

# add sum over first dimension (Baseline):
tbl2 <- addmargins(A = tbl, margin = c(1), FUN = sum)

# Get percentages in rows: freeze all dimensions except the second (Wpbaseline):
ptbl2 <- proportions(tbl2, margin = c(1,3,4))

# at this stage tbl2 and ptbl2 have the same dimensions
joined <- tbl2
joined[] <- paste0(tbl2, " (", round(100*ptbl2,digits = 1), ")")

# add sum over second dimension (Wpbaseline) to get the total for each grade at baseline:
tbl3 <- addmargins(A = tbl, margin = c(2), FUN = sum)

# Get total and percentages for Totals at Baseline:
ptbl3 <- proportions(tbl3, margin = c(2,3,4))

tbl4 <- addmargins(A = tbl3, margin = c(1), FUN = sum)
ptbl4 <- addmargins(A = ptbl3, margin = c(1), FUN = sum)

# Get total and percentages at Baseline together
joined2 <- tbl4
joined2[] <- paste0(tbl4, " (", sprintf("%1.1f",round(100*ptbl4,digits = 1)), ")")

# retrieve the info concerning the sum and percentages at baseline
baseline <- joined2[,7,,]
dim(baseline)<-c(7,1,2,2)

# concatenate the baseline information with the previous array thanks to abind package
joined3 <- abind(baseline, joined, along = 2)

# Cosmetic
dimnames(joined3)
names(dimnames(joined3))<-c('Baseline', 'Wpbaseline', 'Treatment', 'Parameter')

dimnames(joined3)$Baseline[which(dimnames(joined3)$Baseline == "sum")] <- "Total"
dimnames(joined3)$Baseline[is.na(dimnames(joined3)$Baseline)] <- "Missing"

dimnames(joined3)$Wpbaseline[which(dimnames(joined3)$Wpbaseline == "")] <- "Total"
dimnames(joined3)$Wpbaseline[is.na(dimnames(joined3)$Wpbaseline)] <- "Missing"

joined3[which(joined3=="0 (0)")]<-"0"
joined3[which(joined3=="0 (0.0)")]<-"0"
joined3[which(joined3=="0 (NaN)")]<-"0"

# Build back a data frame to do the reporting part
for (j in 1:dim(joined3)[4]) {
  for (i in 1:dim(joined3)[3]) {
    df_current <- 
      data.frame(joined3[,,i,j]) %>% 
      as_tibble() %>%
      # Adding the lab parameter, treatment and baseline grade to the newly created df
      add_column(Parameter = dimnames(joined3)[[4]][j],
                 TRT01AN = dimnames(joined3)[[3]][i],
                 Bgrade = rownames(joined3[,,i,j]),
                 .before = "Total") %>%
      remove_rownames()
    
    #Concatenation of df created at each step
    if (i==1 & j==1){ df_final<-df_current}
    else df_final %<>% bind_rows(df_current)
    
  }
}
  
# Reporting part----
doc <- read_docx() 
for(i in as.vector(unique(df_final$Parameter))){
  doc <- body_add_par(doc, "Table 14.3_3_3.4", style = "centered") %>%                               # Title
    body_add_par("Hematology shift table based on CTC grade by treatment", style = "centered") %>%    # Title
    body_add_par("Safety Set", style = "centered") %>%                                                # Title
    body_add_par(paste0("Parameter =", i), style = "Normal") %>%                                      # Line for laboratory parameter names
    body_add_par(" ") %>%                                                                             # Line break          
    body_add_flextable( flextable(df_final[df_final$Parameter==i,-1]) %>%  # remove parameter variable
                          # Change width of TRT01AN variable
                          width(j = ~ TRT01AN, width = 3) %>% 
                          # Merge cell of variable Parameter and TRT01AN vertically 
                          merge_v( j = ~   TRT01AN , part = "body") %>%
                          # Add header to specified Baseline and Worst post baseline value
                          add_header( TRT01AN = "Treatment", 
                                      Bgrade = "Baseline",
                                      Total = "Baseline",
                                      Grade.0 = "Worst post baseline value",
                                      Grade.1 = "Worst post baseline value",
                                      Grade.2 = "Worst post baseline value",
                                      Grade.3 = "Worst post baseline value",
                                      Grade.4 = "Worst post baseline value",
                                      Missing = "Worst post baseline value",
                                      top =TRUE) %>% 
                          # Rename variable
                          set_header_labels(TRT01AN = "Treatment",
                                            Bgrade = " ",
                                            Total = "n (%)",
                                            Grade.0 = "Grade 0",
                                            Grade.1 = "Grade 1",
                                            Grade.2 = "Grade 2",
                                            Grade.3 = "Grade 3",
                                            Grade.4 = "Grade 4",
                                            Missing = "Missing") %>%
                          # Add footer section
                          add_footer_lines(values = c(" - Baseline is defined as the last non-missing value prior to or the date of the first dose of study treatment \n - Baseline percentages based on N. Percentage for worst post-baseline values is based on Baseline n. \n - CTC grading version 4.03 is used")) %>%
                          # Merge cell horizontally of header
                          merge_h( i = 1, part = "header") %>% 
                          # Merge cell vertically of header
                          merge_v(j = ~ TRT01AN, part = "header") %>% 
                          # Apply vanilla theme
                          theme_vanilla() %>% 
                          # Add  white line between Baseline and Worst post baseline value of header to separate 
                          border( i = 2, j = 4, border.left = fp_border(color = "white", width = 10), part = "header") %>% 
                          # Remove black line of the top header
                          border( border.top = fp_border(color = "white"), part = "header") %>%
                          # vertical alignment for treatment
                          valign(j=1, valign="top", part="body") %>%
                          # Center text of column 3 to 9 of body 
                          align( j = 3:9, align = "center", part = "body") %>%
                          # Left text of column 1 to 3 of body and footer
                          align( j = 1:2, align = "left", part = "body") %>%
                          align( align = "left", part = "footer") %>%
                          # Center text of header 
                          align( align = "center", part = "header") %>% 
                          # Choose fontsize
                          fontsize(size = 8, part = "all") %>%
                          # Update rows height
                          height_all(height = 0.25, part="body") %>%
                          hrule(rule="exact") %>%
                          # add a border under Baseline and Wpbaseline
                          hline(i = 1, j = c(2,3,4), border = fp_border(color="black"), part = "header")
    ) %>% 
    body_end_section_landscape()    # orientation page : landscape 
  
}
print(doc, target = "~/reprex_report.docx")

@NNaikp commented Feb 5, 2022

Hi @Pierre-Wallet!

I just tried it using our table package NNtable.

    library(dplyr)
    library(tidyr)
    library(NNtable)
    
    set.seed(123)
    
    # Create frequency data frame ----------------------------------------------------
    freq_df <-
      crossing(
        TRTA = c("A", "B"),
        baseline = c(paste("Grade", 0:4), "Missing"),
        postbaseline = c(paste("Grade", 0:4), "Missing")
      ) %>%
      mutate(N = sample(0:20, nrow(.), replace = T)) 
    
    
    # Totals ----------------------------------------------------------
    tr_tot <- freq_df %>% 
      group_by(TRTA) %>% 
      summarise(N_tot = sum(N))
    
    base_tot <- freq_df %>% 
      group_by(TRTA,baseline) %>% 
      summarise(N_base = sum(N), .groups = "drop") %>% 
      left_join(tr_tot, by = "TRTA") %>% 
      mutate(P_base = 100*(N_base/N_tot),
             TRTANEW = paste0(TRTA," (N=",N_tot,")"),
             lab = "Baseline")
    
    # Update for outputting
    base_tot_f <- base_tot %>% 
      rename("N" = "N_base",
             "P" = "P_base") %>% 
      mutate(postbaseline = " ")
    
    # worst stats
    sum_stats <- freq_df %>% 
      left_join(base_tot, by = c("TRTA","baseline"), copy = FALSE) %>% 
      mutate(P = 100*(N/N_base),
             TRTANEW = paste0(TRTA," (N=",N_tot,")"),
             lab = "Worst post-baseline value") %>% 
      select(colnames(base_tot_f))
    
    all_out <- rbind(base_tot_f,sum_stats) 
    
    
    # Create table ----------------------------------------------------------
    nntable <- NNTable(all_out,"TRTANEW","lab"," " = "baseline","postbaseline",
                       "N","(%)" = "(P)") %>% 
      addTransWide(lab = list(postbaseline = c("N","(%)"))) %>% 
      addUnderScore() %>% 
      addGroupedColumns("TRTANEW") %>% 
      addFormat(format_data = c(N = "%.0f", P = "%.1f"))
      
    
    nntable

Which gives:

    ————————————————————————————————————————————————————————————————————————————————————————————————————
                        Baseline                      Worst post-baseline value                   
                       —————————  ————————————————————————————————————————————————————————————————
                                   Grade 0    Grade 1    Grade 2    Grade 3    Grade 4    Missing 
                       —————————  —————————  —————————  —————————  —————————  —————————  —————————
                        N   (%)    N   (%)    N   (%)    N   (%)    N   (%)    N   (%)    N   (%) 
    ————————————————————————————————————————————————————————————————————————————————————————————————————
                                                                                                  
    A (N=345) Grade 0  71 (20.6)   7 ( 9.9)   6 ( 8.5)  14 (19.7)  18 (25.4)  16 (22.5)  10 (14.1)
              Grade 1  40 (11.6)   6 (15.0)  18 (45.0)   2 ( 5.0)   5 (12.5)   6 (15.0)   3 ( 7.5)
              Grade 2  69 (20.0)   9 (13.0)  16 (23.2)  15 (21.7)   7 (10.1)  20 (29.0)   2 ( 2.9)
              Grade 3  66 (19.1)  11 (16.7)  20 (30.3)  13 (19.7)   4 ( 6.1)  13 (19.7)   5 ( 7.6)
              Grade 4  62 (18.0)   8 (12.9)   0 ( 0.0)  19 (30.6)  13 (21.0)  18 (29.0)   4 ( 6.5)
              Missing  37 (10.7)   1 ( 2.7)   0 ( 0.0)  19 (51.4)   3 ( 8.1)  10 (27.0)   4 (10.8)
                                                                                                  
    B (N=315) Grade 0  33 (10.5)   3 ( 9.1)  18 (54.5)   2 ( 6.1)   0 ( 0.0)   6 (18.2)   4 (12.1)
              Grade 1  60 (19.0)   8 (13.3)  16 (26.7)  20 (33.3)   4 ( 6.7)  11 (18.3)   1 ( 1.7)
              Grade 2  50 (15.9)  17 (34.0)   3 ( 6.0)   9 (18.0)  14 (28.0)   7 (14.0)   0 ( 0.0)
              Grade 3  61 (19.4)  10 (16.4)   4 ( 6.6)  10 (16.4)  14 (23.0)  14 (23.0)   9 (14.8)
              Grade 4  69 (21.9)  14 (20.3)  14 (20.3)  19 (27.5)  11 (15.9)   8 (11.6)   3 ( 4.3)
              Missing  42 (13.3)   2 ( 4.8)   6 (14.3)   3 ( 7.1)   7 (16.7)  20 (47.6)   4 ( 9.5)
    ————————————————————————————————————————————————————————————————————————————————————————————————————

@davidgohel

Hi!

I took time to do some work on flextable for this:

The package needs to be installed from GitHub for these new functions:

remotes::install_github("davidgohel/flextable")

A few format functions:

library(data.table)
library(flextable)

multi_fun <- function(x) {
  list(mean = mean(x),
       sd = sd(x))
}
myformat <- function(z){
  x <- sprintf("%.1f", z)
  x[is.na(z)] <- ""
  x
}
n_format <- function(z) {
  x <- sprintf("%.0f", z)
  x[is.na(z)] <- ""
  x
}

data prep

Aggregations need to be prepared before using flextable.

dat <- as.data.table(ggplot2::diamonds)
dat <- dat[cut %in% c("Fair", "Good", "Very Good")]
dat <- dat[clarity %in% c("I1", "SI1", "VS2")]

nstat <- dat[, list(n = .N),
             by = c("cut")]

dat <- dat[, unlist(lapply(.SD, multi_fun),
                    recursive = FALSE),
           .SDcols = c("z", "y"),
           by = c("cut", "color", "clarity")]
dat
#>           cut color clarity   z.mean      z.sd   y.mean      y.sd
#>  1: Very Good     H     SI1 3.817221 0.6664327 6.171371 1.0754836
#>  2:      Fair     E     VS2 3.529286 0.5245551 5.449524 0.7673392
#>  3:      Good     J     SI1 4.019773 0.7004727 6.430795 1.0808269
#>  4: Very Good     J     SI1 4.017088 0.6838668 6.510000 1.1041829
#>  5: Very Good     E     VS2 3.363797 0.5530067 5.466143 0.8988358
#>  6: Very Good     J     VS2 4.017500 0.6964981 6.513750 1.1436226
#>  7: Very Good     D     VS2 3.326278 0.5282660 5.382913 0.8421964
#>  8:      Good     H     SI1 3.736894 0.6377995 5.987319 1.0611947
#>  9:      Good     D     VS2 3.467596 0.5185969 5.538365 0.8182376
#> 10: Very Good     F     SI1 3.599481 0.5245868 5.827370 0.8484646
#> 11: Very Good     I     SI1 3.936257 0.6808834 6.356983 1.1074043
#> 12:      Good     I     SI1 3.853758 0.7443514 6.147697 1.2262727
#> 13: Very Good     G     SI1 3.560823 0.5933639 5.761730 0.9487383
#> 14:      Good     E     VS2 3.505000 0.5103667 5.677625 0.8019399
#> 15: Very Good     E     SI1 3.471629 0.5479509 5.619233 0.8996010
#> 16: Very Good     D     SI1 3.447955 0.5314568 5.594636 0.8764198
#> 17: Very Good     F     VS2 3.488906 0.5884464 5.661953 0.9465291
#> 18:      Fair     F     VS2 3.593774 0.4990191 5.626226 0.6970742
#> 19: Very Good     J      I1 4.433750 0.5231754 7.186250 0.8307473
#> 20:      Good     I     VS2 4.009364 0.6571069 6.468455 1.0810768
#> 21: Very Good     G     VS2 3.590731 0.6208030 5.814739 0.9953374
#> 22:      Good     G     VS2 3.622865 0.5710884 5.819115 0.8951742
#> 23:      Good     F     VS2 3.531576 0.5507295 5.663641 0.8695877
#> 24: Very Good     H     VS2 3.690452 0.6885978 5.972926 1.1102574
#> 25:      Fair     G     SI1 3.852319 0.4371907 5.922754 0.6362248
#> 26:      Fair     E      I1 4.008889 0.3876030 6.061111 0.4297221
#> 27:      Good     F     SI1 3.568425 0.5056869 5.720879 0.8231311
#> 28:      Good     E     SI1 3.486592 0.5638868 5.567803 0.9138209
#> 29:      Fair     I     SI1 4.102000 0.4462905 6.402667 0.6600676
#> 30:      Fair     G      I1 4.233962 0.7219374 6.434906 1.0849683
#> 31:      Fair     F      I1 4.000857 0.7043536 6.040571 1.0820350
#> 32:      Good     G     SI1 3.735024 0.5568696 5.952464 0.9221867
#> 33:      Fair     J     VS2 3.960000 0.6183629 6.163478 1.0095976
#> 34: Very Good     E      I1 3.955000 0.5511784 6.436818 0.8837549
#> 35:      Fair     J     SI1 4.135357 0.5361142 6.549643 0.8858454
#> 36:      Good     J     VS2 4.007444 0.6286289 6.446000 1.0575940
#> 37: Very Good     I     VS2 3.917372 0.6876389 6.368686 1.1444792
#> 38:      Fair     F     SI1 3.726506 0.5419464 5.871325 0.8438367
#> 39:      Good     I      I1 4.305556 0.5688390 6.978889 1.0471920
#> 40:      Good     H     VS2 3.690000 0.6633151 5.895725 1.0695806
#> 41:      Good     H      I1 4.187857 0.5881807 6.745714 0.7694425
#> 42: Very Good     H      I1 4.600833 0.5658374 7.437500 0.8752259
#> 43:      Fair     H     SI1 4.110533 0.5221341 6.355467 0.8221269
#> 44:      Fair     D     VS2 3.704800 0.4685449 5.886800 0.7054828
#> 45:      Good     D     SI1 3.450844 0.5407419 5.509409 0.8788101
#> 46:      Fair     E     SI1 3.739077 0.5040763 5.903692 0.8057694
#> 47:      Fair     H      I1 4.549231 0.7059102 6.858462 1.0348428
#> 48: Very Good     I      I1 4.628750 0.9238806 7.452500 1.4414749
#> 49:      Fair     I     VS2 3.851875 0.4773342 6.154375 0.7318379
#> 50:      Fair     I      I1 4.405294 0.5181508 6.653529 0.8454222
#> 51: Very Good     G      I1 4.050000 0.5254585 6.525000 0.8003749
#> 52:      Fair     J      I1 4.990435 0.9209850 7.457391 1.3570832
#> 53:      Good     E      I1 4.263913 0.5527760 6.898696 0.7825152
#> 54:      Good     G      I1 3.869474 1.1021720 6.554737 0.9806255
#> 55:      Fair     H     VS2 3.975122 0.5877462 6.161220 0.9045750
#> 56:      Good     F      I1 3.861053 0.5899708 6.120000 0.9517820
#> 57:      Fair     D     SI1 3.864828 0.4383792 5.957069 0.6556823
#> 58:      Fair     G     VS2 3.955556 0.4868617 6.137333 0.7873442
#> 59: Very Good     F      I1 4.125385 0.5468793 6.714615 0.9804218
#> 60: Very Good     D      I1 3.886000 0.2823650 6.242000 0.4232257
#> 61:      Good     D      I1 3.841250 0.6805552 6.293750 1.2610533
#> 62:      Good     J      I1 4.310000 0.5615455 6.982500 0.7329563
#> 63:      Fair     D      I1 4.905000 0.9796768 7.422500 1.4170009
#>           cut color clarity   z.mean      z.sd   y.mean      y.sd
setDF(dat)
setDF(nstat)

flextable creation

grey_txt <- fp_text_default(color = "gray")

cft_2 <- tabulator(
  x = dat, rows = c("cut", "color"),
  cols = "clarity",
  hidden_data = nstat,
  row_compose = list(
    cut = as_paragraph(as_chunk(cut), " (N=", as_chunk(n, formatter = n_format), ")")
  ),
  `z stats` = as_paragraph(
    as_chunk(z.mean, formatter = myformat)),
  `y stats` = as_paragraph(
    as_chunk(y.mean, formatter = myformat),
    as_chunk(" (\u00B1 ", props = grey_txt),
    as_chunk(y.sd, formatter = myformat, props = grey_txt),
    as_chunk(")", props = grey_txt)
  )
)
ft_2 <- as_flextable(cft_2)
ft_2 <- autofit(x = ft_2, add_w = .05)
ft_2 <- color(ft_2, i = ~ cut %in% "Very Good", color = "blue")
ft_2 <- add_header_lines(x = ft_2, "blah blah blah blah blah blah blah blah blah blah")
ft_2

[image: screenshot of the resulting flextable, 2022-02-11]
