Update script to generate MBdelayed.rda #121

Merged (2 commits) on Nov 3, 2023


jdblischak
Collaborator

I updated data-raw/DATASET.R to use the new snake_case function names and lowercase column names. However, I was unable to reproduce the existing object data/MBdelayed.rda: my updated version has no events.

Does anyone have any ideas on how to fix the script so that it can reproduce the existing object?

Here are some diagnostics:

# current MBdelayed
summary(MBdelayed)
##       tte               event         Stratum           Treatment        
##  Min.   : 0.09171   Min.   :0.000   Length:200         Length:200        
##  1st Qu.: 5.85919   1st Qu.:0.000   Class :character   Class :character  
##  Median :14.49925   Median :1.000   Mode  :character   Mode  :character  
##  Mean   :16.79899   Mean   :0.705                                        
##  3rd Qu.:27.37013   3rd Qu.:1.000                                        
##  Max.   :35.42239   Max.   :1.000           
str(MBdelayed)
## gropd_df [200 × 4] (S3: grouped_df/tbl_df/tbl/data.frame)
##  $ tte      : num [1:200] 4.68 6.2 35.42 32.74 5.38 ...
##  $ event    : num [1:200] 1 1 1 1 1 1 1 1 1 1 ...
##  $ Stratum  : chr [1:200] "All" "All" "All" "All" ...
##  $ Treatment: chr [1:200] "Experimental" "Control" "Control" "Experimental" ...
##  - attr(*, "groups")= tibble [2 × 3] (S3: tbl_df/tbl/data.frame)
##   ..$ Stratum  : chr [1:2] "All" "All"
##   ..$ Treatment: chr [1:2] "Control" "Experimental"
##   ..$ .rows    :List of 2
##   .. ..$ : int [1:100] 2 3 7 8 9 10 14 15 17 18 ...
##   .. ..$ : int [1:100] 1 4 5 6 11 12 13 16 19 20 ...
##   ..- attr(*, ".drop")= logi TRUE
table(MBdelayed[, c("event", "Stratum", "Treatment")])
## , , Treatment = Control
## 
##      Stratum
## event All
##     0  20
##     1  80
## 
## , , Treatment = Experimental
## 
##      Stratum
## event All
##     0  39
##     1  61

# latest MBdelayed
summary(MBdelayed)
##       tte            event     stratum           treatment        
##  Min.   :24.00   Min.   :0   Length:200         Length:200        
##  1st Qu.:27.11   1st Qu.:0   Class :character   Class :character  
##  Median :29.76   Median :0   Mode  :character   Mode  :character  
##  Mean   :29.88   Mean   :0                                        
##  3rd Qu.:32.71   3rd Qu.:0                                        
##  Max.   :36.30   Max.   :0                   
str(MBdelayed)
## 'data.frame':	200 obs. of  4 variables:
##  $ tte      : num  36.3 36.3 36.1 36.1 36 ...
##  $ event    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ stratum  : chr  "All" "All" "All" "All" ...
##  $ treatment: chr  "experimental" "control" "control" "experimental" ...
table(MBdelayed[, c("event", "stratum", "treatment")])
## , , treatment = control
## 
##      stratum
## event All
##     0 100
## 
## , , treatment = experimental
## 
##      stratum
## event All
##     0 100
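
The two sets of diagnostics above differ in column-name case and in object class (grouped tibble vs. plain data frame), which makes a side-by-side comparison awkward. A minimal sketch of a comparison helper follows; `compare_versions()` is a hypothetical name, not part of simtrial, and the toy data frames only stand in for the two MBdelayed objects.

```r
# Hypothetical helper (not part of simtrial): summarize how two versions of a
# time-to-event data set differ, ignoring column-name case.
compare_versions <- function(old, new) {
  # Coerce to plain data frames so only the data itself is compared
  old <- as.data.frame(old)
  new <- as.data.frame(new)
  names(old) <- tolower(names(old))
  names(new) <- tolower(names(new))
  list(
    same_columns = identical(sort(names(old)), sort(names(new))),
    events_old = sum(old$event),
    events_new = sum(new$event),
    tte_range_old = range(old$tte),
    tte_range_new = range(new$tte)
  )
}

# Toy inputs standing in for the existing and regenerated objects
old <- data.frame(
  tte = c(4.7, 6.2, 35.4), event = c(1, 1, 0),
  Stratum = "All", Treatment = c("Experimental", "Control", "Control")
)
new <- data.frame(
  tte = c(36.3, 36.1, 24.0), event = c(0, 0, 0),
  stratum = "All", treatment = c("experimental", "control", "control")
)
res <- compare_versions(old, new)
```

With the real objects, `res$events_new == 0` would flag the missing events immediately.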

@nanxstats
Collaborator

I can get proper events using an older GitHub release version of simtrial:

remotes::install_github("Merck/simtrial@v0.2.2")

then run data-raw/DATASET.R. This might help triage the specific commit(s) that introduced the changed behavior. I guess it won't be too difficult to figure out; it is possibly related to the earlier refactoring effort this spring.

@jdblischak
Collaborator Author

jdblischak commented Nov 3, 2023

Confirmed. I was able to run the code from the tag v0.2.2:

table(MBdelayed[, c("event", "Stratum", "Treatment")])
, , Treatment = Control

     Stratum
event All
    0  23
    1  77

, , Treatment = Experimental

     Stratum
event All
    0  38
    1  62

However, the object data/MBdelayed.rda was still modified, and the numbers don't match the current data set. I suspect the seed wasn't set the last time it was updated.
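
To illustrate the seed point, here is a minimal sketch of a data-generating step that fixes the RNG state before drawing. `simulate_toy()` and the seed value are hypothetical and use base R draws only; the real script calls simtrial functions instead.

```r
# Toy stand-in for the simulation step in a data-raw script (an assumption,
# not the actual contents of data-raw/DATASET.R).
simulate_toy <- function(n = 200, seed = 6671) {
  set.seed(seed)  # fix the RNG state so reruns produce identical draws
  data.frame(
    tte = rexp(n, rate = 0.05),
    event = rbinom(n, size = 1, prob = 0.7)
  )
}

run1 <- simulate_toy()
run2 <- simulate_toy()
same <- identical(run1, run2)  # both runs start from the same RNG state
```

Without the `set.seed()` call, each run of the script would write a different object, so the stored .rda could not be regenerated exactly.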

@jdblischak jdblischak marked this pull request as ready for review November 3, 2023 17:48
@jdblischak
Collaborator Author

I also added code to fail early if the input to sim_pw_surv() has mismatched treatment names.
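
A fail-early check of this kind could look roughly like the sketch below. This is not the code added in the PR: `check_treatment_names()` is a hypothetical helper, and the argument names `block` and `fail_rate` (with a `treatment` column) are assumptions made for illustration.

```r
# Hypothetical validation helper; the actual check in simtrial may differ.
check_treatment_names <- function(block, fail_rate) {
  # Treatment labels used for assignment vs. labels that have a failure rate
  in_block <- unique(block)
  in_rates <- unique(fail_rate$treatment)
  missing <- setdiff(in_block, in_rates)
  if (length(missing) > 0) {
    stop(
      "Treatment names in `block` without a match in `fail_rate`: ",
      paste(missing, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

fail_rate <- data.frame(
  treatment = c("control", "experimental"),
  rate = c(0.05, 0.03)
)

# Matching names pass silently
ok <- check_treatment_names(c("control", "experimental"), fail_rate)

# Names that differ only in case are caught before any simulation runs
err <- tryCatch(
  check_treatment_names(c("Control", "Experimental"), fail_rate),
  error = function(e) conditionMessage(e)
)
```

Stopping with an explicit message is preferable to silently returning a data set with no events, which is harder to trace back to a naming mismatch.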

Collaborator

@nanxstats nanxstats left a comment


Very nice - great to have this dataset updated and the functions safeguarded.

@nanxstats nanxstats merged commit e9f2f71 into Merck:main Nov 3, 2023
7 checks passed
@jdblischak jdblischak deleted the MBdelayed branch November 3, 2023 20:46
3 participants