Add regression tests for summary.simtrial_gs_wlr() #282

jdblischak · 2024-09-13T18:33:40Z

I am starting to convert summary.simtrial_gs_wlr() to data.table (https://github.com/Merck/simtrial/pull/268/files#r1727564684). In order to ensure I don't change the results, I need regression tests.

jdblischak · 2024-09-13T19:35:25Z

Unfortunately our coverage results were not uploaded to Codecov:

[1] "Rate limit reached. Please upload with the Codecov repository upload token to resolve issue. Expected time to availability: 2408s.

jdblischak · 2024-09-13T19:42:50Z

I ran covr locally

x <- covr::package_coverage()
covr::report(x)

I converted the examples to tests, but there are large gaps in the code coverage. Thus I can't know if my conversion to data.table is being done correctly. @LittleBeannie could you please provide example code that uses these code paths?

jdblischak · 2024-09-13T19:46:03Z

Also, in the example code, there are no rows where cross_upper = -z >= upper_bound is TRUE, and thus the entire pipe below evaluates to an empty table. Could you please provide an example where sim_upper_prob is not returned as NA?

simtrial/R/summary.R

Lines 103 to 113 in 5fe51b9

    
           ans2 <- object |> 
        
             dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
        
             dplyr::mutate(cross_upper = -z >= upper_bound) |> 
        
             dplyr::filter(cross_upper == TRUE) |> 
        
             dplyr::group_by(sim_id) |> 
        
             dplyr::filter(dplyr::row_number() == 1) |> 
        
             dplyr::ungroup() |> 
        
             dplyr::group_by(analysis) |> 
        
             dplyr::summarize(n_cross_upper = dplyr::n()) |> 
        
             dplyr::mutate(sim_upper_prob = cumsum(n_cross_upper) / n_sim) |> 
        
             dplyr::select(analysis, sim_upper_prob)

LittleBeannie · 2024-09-17T14:53:54Z

Hi @jdblischak, could you please confirm if the following example is something you are looking for?

library(gsDesign2)
library(simtrial)

# Parameters for enrollment
enroll_rampup_duration <- 4 # Duration for enrollment ramp up
enroll_duration <- 16 # Total enrollment duration
enroll_rate <- define_enroll_rate(
  duration = c(
    enroll_rampup_duration, enroll_duration - enroll_rampup_duration),
  rate = c(10, 30))

# Parameters for treatment effect
delay_effect_duration <- 3 # Delay treatment effect in months
median_ctrl <- 9 # Survival median of the control arm
median_exp <- c(9, 14) # Survival median of the experimental arm
dropout_rate <- 0.001
fail_rate <- define_fail_rate(
  duration = c(delay_effect_duration, 100),
  fail_rate = log(2) / median_ctrl,
  hr = median_ctrl / median_exp,
  dropout_rate = dropout_rate)

# Other related parameters
alpha <- 0.025 # Type I error
beta <- 0.1 # Type II error
ratio <- 1 # Randomization ratio (experimental:control)

# Build a one-sided group sequential design
design <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, alpha = alpha, beta = beta,
  analysis_time = c(12, 24, 36),
  upper = gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower = gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta))

# Define cuttings of 2 IAs and 1 FA
ia1_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[1]))
ia2_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[2]))
fa_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[3]))

# Run simulations
simulation <- sim_gs_n(
  n_sim = 3,
  sample_size = ceiling(design$analysis$n[3]),
  enroll_rate = design$enroll_rate,
  fail_rate = design$fail_rate,
  test = wlr,
  cut = list(ia1 = ia1_cut, ia2 = ia2_cut, fa = fa_cut),
  weight = fh(rho = 0, gamma = 0.5))

# Summarize simulations
simulation |> summary(bound = gsDesign::gsDesign(k = 3, test.type = 1, sfu = gsDesign::sfLDOF)$upper$bound)

# Summarize simulation and compare with the planned design
simulation |> summary(design = design)

jdblischak · 2024-09-19T19:56:30Z

could you please confirm if the following example is something you are looking for?

Yes, exactly! I ran {covr} locally, and this example uses the remaining code paths.

Comparing your example above to the current example in ?summary.simtrial_gs_wlr, it looks like the only difference is the arguments lower and lpar passed to gs_design_ahr():

# Build a one-sided group sequential design
design <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, alpha = alpha, beta = beta,
  analysis_time = c(12, 24, 36),
  upper = gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower = gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta))

simtrial/R/summary.R

Lines 58 to 66 in 506ce7d

    
           #' # Build a one-sided group sequential design 
        
           #' design <- gs_design_ahr( 
        
           #'   enroll_rate = enroll_rate, fail_rate = fail_rate, 
        
           #'   ratio = ratio, alpha = alpha, beta = beta, 
        
           #'   analysis_time = c(12, 24, 36), 
        
           #'   upper = gs_spending_bound, 
        
           #'   upar = list(sf = gsDesign::sfLDOF, total_spend = alpha), 
        
           #'   lower = gs_b, 
        
           #'   lpar = rep(-Inf, 3))

Is my understanding correct? Also, should the comment Build a one-sided group sequential design be updated to Build a two-sided group sequential design?

jdblischak · 2024-09-19T19:58:30Z

However, in that latest example, the object ans2 still has zero rows. How do we know that the code below is behaving properly?

simtrial/R/summary.R

Lines 103 to 113 in 5fe51b9

    
           ans2 <- object |> 
        
             dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
        
             dplyr::mutate(cross_upper = -z >= upper_bound) |> 
        
             dplyr::filter(cross_upper == TRUE) |> 
        
             dplyr::group_by(sim_id) |> 
        
             dplyr::filter(dplyr::row_number() == 1) |> 
        
             dplyr::ungroup() |> 
        
             dplyr::group_by(analysis) |> 
        
             dplyr::summarize(n_cross_upper = dplyr::n()) |> 
        
             dplyr::mutate(sim_upper_prob = cumsum(n_cross_upper) / n_sim) |> 
        
             dplyr::select(analysis, sim_upper_prob)

jdblischak · 2024-09-19T20:02:10Z

Unfortunately our coverage results were not uploaded to Codecov:

It looks like the last time our Codecov results were uploaded was 2 months ago

@nanxstats is it possible to define CODECOV_TOKEN to fix this? (r-lib/actions#834, Merck/r2rtf#221). If not, then I think we should delete the coverage workflow.

LittleBeannie · 2024-09-19T20:02:20Z

Is my understanding correct?

Yes, you are right that the difference lies in the lower and lpar.

Also, should the comment Build a one-sided group sequential design be updated to Build a two-sided group sequential design?

No. The example has lower = gs_b and lpar=rep(-Inf, 3), which gives you a one-sided design. Only when the futility is not all -inf, we call it two-sided design.

LittleBeannie · 2024-09-19T20:04:38Z

However, in that latest example, the object ans2 still has zero rows. How do we know that the code below is behaving properly?

simtrial/R/summary.R

Lines 103 to 113 in 5fe51b9

ans2 <- object |>

dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |>

dplyr::mutate(cross_upper = -z >= upper_bound) |>

dplyr::filter(cross_upper == TRUE) |>

dplyr::group_by(sim_id) |>

dplyr::filter(dplyr::row_number() == 1) |>

dplyr::ungroup() |>

dplyr::group_by(analysis) |>

dplyr::summarize(n_cross_upper = dplyr::n()) |>

dplyr::mutate(sim_upper_prob = cumsum(n_cross_upper) / n_sim) |>

dplyr::select(analysis, sim_upper_prob)

Is the "lastest example" one-sided or two-sided?

jdblischak · 2024-09-19T20:11:18Z

Is the "lastest example" one-sided or two-sided?

By "latest example" I mean the one you shared 2 days ago above in #282 (comment). I'm fairly confident it is an example of a two-sided design, but the code comment states it is one-sided

LittleBeannie · 2024-09-19T20:17:05Z

Is the "lastest example" one-sided or two-sided?

By "latest example" I mean the one you shared 2 days ago above in #282 (comment). I'm fairly confident it is an example of a two-sided design, but the code comment states it is one-sided

You're correct that it's a two-sided example. I was copying and pasting, and I was too fixated on the code, so I overlooked the comments...

nanxstats · 2024-09-19T20:57:23Z

It looks like we will need an organization token for the authenticated upload to work. If that's the case, reaching out to the open source committee and see if they can help would a possible next move.

Our R SDLC recommends showing a coverage badge so it might be useful to keep it. I've had anecdotal observations that the code coverage workflow can still upload successfully once a while without a codecov token but not sure if that's helpful.

Unfortunately our coverage results were not uploaded to Codecov:

It looks like the last time our Codecov results were uploaded was 2 months ago

@nanxstats is it possible to define CODECOV_TOKEN to fix this? (r-lib/actions#834, Merck/r2rtf#221). If not, then I think we should delete the coverage workflow.

jdblischak · 2024-09-20T16:01:13Z

I've had anecdotal observations that the code coverage workflow can still upload successfully once a while without a codecov token but not sure if that's helpful.

Let's leave the workflow for now then. The last coverage update was 2 months ago, so it is still at least somewhat accurate. However, if we get to 6+ months of no coverage update, then I would encourage us to remove the workflow and the README badge, since from my perspective an inaccurate coverage badge is worse than no coverage badge.

Our R SDLC recommends showing a coverage badge so it might be useful to keep it.

The SDLC is for clinical submissions, no? And looking through our internal docs, it looks like the recommended coverage badge links to a vignette with the saved coverage info (since presumably you don't want to upload these to Codecov).

It looks like we will need an organization token for the authenticated upload to work. If that's the case, reaching out to the open source committee and see if they can help would a possible next move.

I'll leave that decision to you. Having the coverage reports easily accessible, especially for PRs, is very convenient. But at the end of the day, I can run it locally myself.

jdblischak · 2024-09-23T19:07:07Z

@LittleBeannie I ran sim_gs_n() with n_sim = 10000 (and set.seed(1)), but I still never managed to find any rows with -z >= upper_bound. Could you please investigate and see if you can generate an example that has rows where cross_upper == TRUE?

 ans2 <- object |> 
   dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
   dplyr::mutate(cross_upper = -z >= upper_bound) |> 
   dplyr::filter(cross_upper == TRUE) |>

LittleBeannie · 2024-09-23T19:12:57Z

@LittleBeannie I ran sim_gs_n() with n_sim = 10000 (and set.seed(1)), but I still never managed to find any rows with -z >= upper_bound. Could you please investigate and see if you can generate an example that has rows where cross_upper == TRUE?
 ans2 <- object |> 
   dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
   dplyr::mutate(cross_upper = -z >= upper_bound) |> 
   dplyr::filter(cross_upper == TRUE) |> 

Hi @jdblischak , could you provide me the sim_gs_n code you have run?

jdblischak · 2024-09-23T21:20:26Z

# Parameters for enrollment
enroll_rampup_duration <- 4 # Duration for enrollment ramp up
enroll_duration <- 16 # Total enrollment duration
enroll_rate <- gsDesign2::define_enroll_rate(
  duration = c(
    enroll_rampup_duration, enroll_duration - enroll_rampup_duration),
  rate = c(10, 30))

# Parameters for treatment effect
delay_effect_duration <- 3 # Delay treatment effect in months
median_ctrl <- 9 # Survival median of the control arm
median_exp <- c(9, 14) # Survival median of the experimental arm
dropout_rate <- 0.001
fail_rate <- gsDesign2::define_fail_rate(
  duration = c(delay_effect_duration, 100),
  fail_rate = log(2) / median_ctrl,
  hr = median_ctrl / median_exp,
  dropout_rate = dropout_rate)

# Other related parameters
alpha <- 0.025 # Type I error
beta <- 0.1 # Type II error
ratio <- 1 # Randomization ratio (experimental:control)

# Build a two-sided group sequential design
design <- gsDesign2::gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, alpha = alpha, beta = beta,
  analysis_time = c(12, 24, 36),
  upper = gsDesign2::gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower = gsDesign2::gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta))

# Define cuttings of 2 IAs and 1 FA
ia1_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[1]))
ia2_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[2]))
fa_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[3]))

# Run simulations
set.seed(1)
simulation <- sim_gs_n(
  n_sim = 10000,
  sample_size = ceiling(design$analysis$n[3]),
  enroll_rate = design$enroll_rate,
  fail_rate = design$fail_rate,
  test = wlr,
  cut = list(ia1 = ia1_cut, ia2 = ia2_cut, fa = fa_cut),
  weight = fh(rho = 0, gamma = 0.5))

# Summarize simulations
observed <- simulation |>
  summary(bound = gsDesign::gsDesign(k = 3, test.type = 1, sfu = gsDesign::sfLDOF)$upper$bound)

LittleBeannie · 2024-09-23T21:27:46Z

Hello @jdblischak , the issue is related to the sign of z. Previously, the z from fh(rho = 0, gamma = 0) was always a negative number. After the pull request at #272, the sign of z has changed to positive numbers. If you modify the code by removing the - before `z, I guess the issue will be resolved.

ans2 <- object |> 
   dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
   dplyr::mutate(cross_upper = z >= upper_bound) |> 
   dplyr::filter(cross_upper == TRUE) |>

Merck#282 (comment) Merck#272 Merck#271

jdblischak · 2024-09-24T15:37:41Z

the issue is related to the sign of z. Previously, the z from fh(rho = 0, gamma = 0) was always a negative number. After the pull request at #272, the sign of z has changed to positive numbers.

@LittleBeannie Nice investigation! I'm glad we were able to find the root of the problem

If you modify the code by removing the - before `z, I guess the issue will be resolved.

Done in c4e7b66. I bumped the build number since this update changed the values returned for sim_lower_prob.

Once the CI passes, please review and merge. Thanks!

LittleBeannie

Thanks, @jdblischak!

Add regression tests for summary.simtrial_gs_wlr()

e5d0838

jdblischak requested a review from LittleBeannie September 13, 2024 18:33

jdblischak self-assigned this Sep 13, 2024

jdblischak added 2 commits September 13, 2024 15:11

Return a data frame instead of a tibble

db59fcf

Remove NOTE about wide line truncated in PDF manual

23bf583

Add regression test for summary.simtrial_gs_wlr() with two-sided design

d520c7f

Fix sign of Z for summary.simtrial_gs_wlr()

c4e7b66

Merck#282 (comment) Merck#272 Merck#271

LittleBeannie approved these changes Sep 24, 2024

View reviewed changes

LittleBeannie merged commit 58db5df into Merck:main Sep 24, 2024
7 checks passed

jdblischak deleted the test-summary branch September 24, 2024 15:44

This was referenced Sep 24, 2024

Fix sign of z when passing explicit design to summary.simtrial_gs_wlr() #286

Merged

Convert summary.simtrial_gs_wlr() to data.table #289

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add regression tests for summary.simtrial_gs_wlr() #282

Add regression tests for summary.simtrial_gs_wlr() #282

jdblischak commented Sep 13, 2024

jdblischak commented Sep 13, 2024

jdblischak commented Sep 13, 2024

jdblischak commented Sep 13, 2024

LittleBeannie commented Sep 17, 2024

jdblischak commented Sep 19, 2024

jdblischak commented Sep 19, 2024

jdblischak commented Sep 19, 2024

LittleBeannie commented Sep 19, 2024

LittleBeannie commented Sep 19, 2024

jdblischak commented Sep 19, 2024

LittleBeannie commented Sep 19, 2024

nanxstats commented Sep 19, 2024

jdblischak commented Sep 20, 2024

jdblischak commented Sep 23, 2024

LittleBeannie commented Sep 23, 2024

jdblischak commented Sep 23, 2024

LittleBeannie commented Sep 23, 2024

jdblischak commented Sep 24, 2024

LittleBeannie left a comment

Add regression tests for summary.simtrial_gs_wlr() #282

Add regression tests for summary.simtrial_gs_wlr() #282

Conversation

jdblischak commented Sep 13, 2024

jdblischak commented Sep 13, 2024

jdblischak commented Sep 13, 2024

jdblischak commented Sep 13, 2024

LittleBeannie commented Sep 17, 2024

jdblischak commented Sep 19, 2024

jdblischak commented Sep 19, 2024

jdblischak commented Sep 19, 2024

LittleBeannie commented Sep 19, 2024

LittleBeannie commented Sep 19, 2024

jdblischak commented Sep 19, 2024

LittleBeannie commented Sep 19, 2024

nanxstats commented Sep 19, 2024

jdblischak commented Sep 20, 2024

jdblischak commented Sep 23, 2024

LittleBeannie commented Sep 23, 2024

jdblischak commented Sep 23, 2024

LittleBeannie commented Sep 23, 2024

jdblischak commented Sep 24, 2024

LittleBeannie left a comment

Choose a reason for hiding this comment