Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add regression tests for summary.simtrial_gs_wlr() #282

Merged
merged 5 commits into from
Sep 24, 2024

Conversation

jdblischak
Copy link
Collaborator

I am starting to convert summary.simtrial_gs_wlr() to data.table (https://github.com/Merck/simtrial/pull/268/files#r1727564684). In order to ensure I don't change the results, I need regression tests.

@jdblischak jdblischak self-assigned this Sep 13, 2024
@jdblischak
Copy link
Collaborator Author

Unfortunately our coverage results were not uploaded to Codecov:

[1] "Rate limit reached. Please upload with the Codecov repository upload token to resolve issue. Expected time to availability: 2408s.

@jdblischak
Copy link
Collaborator Author

I ran covr locally

x <- covr::package_coverage()
covr::report(x)

I converted the examples to tests, but there are large gaps in the code coverage. Thus I can't know if my conversion to data.table is being done correctly. @LittleBeannie could you please provide example code that uses these code paths?

image

image

image

image

@jdblischak
Copy link
Collaborator Author

Also, in the example code, there are no rows where cross_upper = -z >= upper_bound is TRUE, and thus the entire pipe below evaluates to an empty table. Could you please provide an example where sim_upper_prob is not returned as NA?

simtrial/R/summary.R

Lines 103 to 113 in 5fe51b9

ans2 <- object |>
dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |>
dplyr::mutate(cross_upper = -z >= upper_bound) |>
dplyr::filter(cross_upper == TRUE) |>
dplyr::group_by(sim_id) |>
dplyr::filter(dplyr::row_number() == 1) |>
dplyr::ungroup() |>
dplyr::group_by(analysis) |>
dplyr::summarize(n_cross_upper = dplyr::n()) |>
dplyr::mutate(sim_upper_prob = cumsum(n_cross_upper) / n_sim) |>
dplyr::select(analysis, sim_upper_prob)

@LittleBeannie
Copy link
Collaborator

Hi @jdblischak, could you please confirm if the following example is something you are looking for?

library(gsDesign2)
library(simtrial)

# Parameters for enrollment
enroll_rampup_duration <- 4 # Duration for enrollment ramp up
enroll_duration <- 16 # Total enrollment duration
enroll_rate <- define_enroll_rate(
  duration = c(
    enroll_rampup_duration, enroll_duration - enroll_rampup_duration),
  rate = c(10, 30))

# Parameters for treatment effect
delay_effect_duration <- 3 # Delay treatment effect in months
median_ctrl <- 9 # Survival median of the control arm
median_exp <- c(9, 14) # Survival median of the experimental arm
dropout_rate <- 0.001
fail_rate <- define_fail_rate(
  duration = c(delay_effect_duration, 100),
  fail_rate = log(2) / median_ctrl,
  hr = median_ctrl / median_exp,
  dropout_rate = dropout_rate)

# Other related parameters
alpha <- 0.025 # Type I error
beta <- 0.1 # Type II error
ratio <- 1 # Randomization ratio (experimental:control)

# Build a one-sided group sequential design
design <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, alpha = alpha, beta = beta,
  analysis_time = c(12, 24, 36),
  upper = gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower = gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta))

# Define cuttings of 2 IAs and 1 FA
ia1_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[1]))
ia2_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[2]))
fa_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[3]))

# Run simulations
simulation <- sim_gs_n(
  n_sim = 3,
  sample_size = ceiling(design$analysis$n[3]),
  enroll_rate = design$enroll_rate,
  fail_rate = design$fail_rate,
  test = wlr,
  cut = list(ia1 = ia1_cut, ia2 = ia2_cut, fa = fa_cut),
  weight = fh(rho = 0, gamma = 0.5))

# Summarize simulations
simulation |> summary(bound = gsDesign::gsDesign(k = 3, test.type = 1, sfu = gsDesign::sfLDOF)$upper$bound)

# Summarize simulation and compare with the planned design
simulation |> summary(design = design)

@jdblischak
Copy link
Collaborator Author

could you please confirm if the following example is something you are looking for?

Yes, exactly! I ran {covr} locally, and this example uses the remaining code paths.

Comparing your example above to the current example in ?summary.simtrial_gs_wlr, it looks like the only difference is the arguments lower and lpar passed to gs_design_ahr():

# Build a one-sided group sequential design
design <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, alpha = alpha, beta = beta,
  analysis_time = c(12, 24, 36),
  upper = gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower = gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta))

simtrial/R/summary.R

Lines 58 to 66 in 506ce7d

#' # Build a one-sided group sequential design
#' design <- gs_design_ahr(
#' enroll_rate = enroll_rate, fail_rate = fail_rate,
#' ratio = ratio, alpha = alpha, beta = beta,
#' analysis_time = c(12, 24, 36),
#' upper = gs_spending_bound,
#' upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
#' lower = gs_b,
#' lpar = rep(-Inf, 3))

Is my understanding correct? Also, should the comment Build a one-sided group sequential design be updated to Build a two-sided group sequential design?

@jdblischak
Copy link
Collaborator Author

However, in that latest example, the object ans2 still has zero rows. How do we know that the code below is behaving properly?

simtrial/R/summary.R

Lines 103 to 113 in 5fe51b9

ans2 <- object |>
dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |>
dplyr::mutate(cross_upper = -z >= upper_bound) |>
dplyr::filter(cross_upper == TRUE) |>
dplyr::group_by(sim_id) |>
dplyr::filter(dplyr::row_number() == 1) |>
dplyr::ungroup() |>
dplyr::group_by(analysis) |>
dplyr::summarize(n_cross_upper = dplyr::n()) |>
dplyr::mutate(sim_upper_prob = cumsum(n_cross_upper) / n_sim) |>
dplyr::select(analysis, sim_upper_prob)

@jdblischak
Copy link
Collaborator Author

Unfortunately our coverage results were not uploaded to Codecov:

It looks like the last time our Codecov results were uploaded was 2 months ago

@nanxstats is it possible to define CODECOV_TOKEN to fix this? (r-lib/actions#834, Merck/r2rtf#221). If not, then I think we should delete the coverage workflow.

@LittleBeannie
Copy link
Collaborator

Is my understanding correct?

Yes, you are right that the difference lies in the lower and lpar.

Also, should the comment Build a one-sided group sequential design be updated to Build a two-sided group sequential design?

No. The example has lower = gs_b and lpar=rep(-Inf, 3), which gives you a one-sided design. Only when the futility is not all -inf, we call it two-sided design.

@LittleBeannie
Copy link
Collaborator

However, in that latest example, the object ans2 still has zero rows. How do we know that the code below is behaving properly?

simtrial/R/summary.R

Lines 103 to 113 in 5fe51b9

ans2 <- object |>
dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |>
dplyr::mutate(cross_upper = -z >= upper_bound) |>
dplyr::filter(cross_upper == TRUE) |>
dplyr::group_by(sim_id) |>
dplyr::filter(dplyr::row_number() == 1) |>
dplyr::ungroup() |>
dplyr::group_by(analysis) |>
dplyr::summarize(n_cross_upper = dplyr::n()) |>
dplyr::mutate(sim_upper_prob = cumsum(n_cross_upper) / n_sim) |>
dplyr::select(analysis, sim_upper_prob)

Is the "lastest example" one-sided or two-sided?

@jdblischak
Copy link
Collaborator Author

Is the "lastest example" one-sided or two-sided?

By "latest example" I mean the one you shared 2 days ago above in #282 (comment). I'm fairly confident it is an example of a two-sided design, but the code comment states it is one-sided

@LittleBeannie
Copy link
Collaborator

Is the "lastest example" one-sided or two-sided?

By "latest example" I mean the one you shared 2 days ago above in #282 (comment). I'm fairly confident it is an example of a two-sided design, but the code comment states it is one-sided

You're correct that it's a two-sided example. I was copying and pasting, and I was too fixated on the code, so I overlooked the comments...

@nanxstats
Copy link
Collaborator

It looks like we will need an organization token for the authenticated upload to work. If that's the case, reaching out to the open source committee and see if they can help would a possible next move.

Our R SDLC recommends showing a coverage badge so it might be useful to keep it. I've had anecdotal observations that the code coverage workflow can still upload successfully once a while without a codecov token but not sure if that's helpful.

Unfortunately our coverage results were not uploaded to Codecov:

It looks like the last time our Codecov results were uploaded was 2 months ago

@nanxstats is it possible to define CODECOV_TOKEN to fix this? (r-lib/actions#834, Merck/r2rtf#221). If not, then I think we should delete the coverage workflow.

@jdblischak
Copy link
Collaborator Author

I've had anecdotal observations that the code coverage workflow can still upload successfully once a while without a codecov token but not sure if that's helpful.

Let's leave the workflow for now then. The last coverage update was 2 months ago, so it is still at least somewhat accurate. However, if we get to 6+ months of no coverage update, then I would encourage us to remove the workflow and the README badge, since from my perspective an inaccurate coverage badge is worse than no coverage badge.

Our R SDLC recommends showing a coverage badge so it might be useful to keep it.

The SDLC is for clinical submissions, no? And looking through our internal docs, it looks like the recommended coverage badge links to a vignette with the saved coverage info (since presumably you don't want to upload these to Codecov).

It looks like we will need an organization token for the authenticated upload to work. If that's the case, reaching out to the open source committee and see if they can help would a possible next move.

I'll leave that decision to you. Having the coverage reports easily accessible, especially for PRs, is very convenient. But at the end of the day, I can run it locally myself.

@jdblischak
Copy link
Collaborator Author

@LittleBeannie I ran sim_gs_n() with n_sim = 10000 (and set.seed(1)), but I still never managed to find any rows with -z >= upper_bound. Could you please investigate and see if you can generate an example that has rows where cross_upper == TRUE?

 ans2 <- object |> 
   dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
   dplyr::mutate(cross_upper = -z >= upper_bound) |> 
   dplyr::filter(cross_upper == TRUE) |> 

@LittleBeannie
Copy link
Collaborator

@LittleBeannie I ran sim_gs_n() with n_sim = 10000 (and set.seed(1)), but I still never managed to find any rows with -z >= upper_bound. Could you please investigate and see if you can generate an example that has rows where cross_upper == TRUE?

 ans2 <- object |> 
   dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
   dplyr::mutate(cross_upper = -z >= upper_bound) |> 
   dplyr::filter(cross_upper == TRUE) |> 

Hi @jdblischak , could you provide me the sim_gs_n code you have run?

@jdblischak
Copy link
Collaborator Author

# Parameters for enrollment
enroll_rampup_duration <- 4 # Duration for enrollment ramp up
enroll_duration <- 16 # Total enrollment duration
enroll_rate <- gsDesign2::define_enroll_rate(
  duration = c(
    enroll_rampup_duration, enroll_duration - enroll_rampup_duration),
  rate = c(10, 30))

# Parameters for treatment effect
delay_effect_duration <- 3 # Delay treatment effect in months
median_ctrl <- 9 # Survival median of the control arm
median_exp <- c(9, 14) # Survival median of the experimental arm
dropout_rate <- 0.001
fail_rate <- gsDesign2::define_fail_rate(
  duration = c(delay_effect_duration, 100),
  fail_rate = log(2) / median_ctrl,
  hr = median_ctrl / median_exp,
  dropout_rate = dropout_rate)

# Other related parameters
alpha <- 0.025 # Type I error
beta <- 0.1 # Type II error
ratio <- 1 # Randomization ratio (experimental:control)

# Build a two-sided group sequential design
design <- gsDesign2::gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, alpha = alpha, beta = beta,
  analysis_time = c(12, 24, 36),
  upper = gsDesign2::gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower = gsDesign2::gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta))

# Define cuttings of 2 IAs and 1 FA
ia1_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[1]))
ia2_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[2]))
fa_cut <- create_cut(target_event_overall = ceiling(design$analysis$event[3]))

# Run simulations
set.seed(1)
simulation <- sim_gs_n(
  n_sim = 10000,
  sample_size = ceiling(design$analysis$n[3]),
  enroll_rate = design$enroll_rate,
  fail_rate = design$fail_rate,
  test = wlr,
  cut = list(ia1 = ia1_cut, ia2 = ia2_cut, fa = fa_cut),
  weight = fh(rho = 0, gamma = 0.5))

# Summarize simulations
observed <- simulation |>
  summary(bound = gsDesign::gsDesign(k = 3, test.type = 1, sfu = gsDesign::sfLDOF)$upper$bound)

@LittleBeannie
Copy link
Collaborator

Hello @jdblischak , the issue is related to the sign of z. Previously, the z from fh(rho = 0, gamma = 0) was always a negative number. After the pull request at #272, the sign of z has changed to positive numbers. If you modify the code by removing the - before `z, I guess the issue will be resolved.

ans2 <- object |> 
   dplyr::left_join(data.frame(analysis = 1:n_analysis, upper_bound = bound)) |> 
   dplyr::mutate(cross_upper = z >= upper_bound) |> 
   dplyr::filter(cross_upper == TRUE) |> 

@jdblischak
Copy link
Collaborator Author

the issue is related to the sign of z. Previously, the z from fh(rho = 0, gamma = 0) was always a negative number. After the pull request at #272, the sign of z has changed to positive numbers.

@LittleBeannie Nice investigation! I'm glad we were able to find the root of the problem

If you modify the code by removing the - before `z, I guess the issue will be resolved.

Done in c4e7b66. I bumped the build number since this update changed the values returned for sim_lower_prob.

Once the CI passes, please review and merge. Thanks!

Copy link
Collaborator

@LittleBeannie LittleBeannie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @jdblischak!

@LittleBeannie LittleBeannie merged commit 58db5df into Merck:main Sep 24, 2024
7 checks passed
@jdblischak jdblischak deleted the test-summary branch September 24, 2024 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants