Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes #142 issue_142_updated to account for DT, DTM, TM variables #145

Merged
merged 18 commits into from
Jun 15, 2023
10 changes: 8 additions & 2 deletions R/type.R
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ xportr_type <- function(.df, metacore = NULL, domain = NULL,
type_name <- getOption("xportr.type_name")
characterTypes <- c(getOption("xportr.character_types"), "_character")
numericTypes <- c(getOption("xportr.numeric_types"), "_numeric")
formats <- metacore[['var_spec']]
bms63 marked this conversation as resolved.
Show resolved Hide resolved

## Common section to detect domain from argument or pipes

Expand Down Expand Up @@ -103,13 +104,18 @@ xportr_type <- function(.df, metacore = NULL, domain = NULL,
function(x, i, is_correct) {
if (!is_correct[i]) {
orig_attributes <- attributes(.df[[i]])
orig_attributes$class <- NULL
if (correct_type[i] %in% characterTypes) {
.df[[i]] <<- as.character(.df[[i]])
} else {
.df[[i]] <<- as.numeric(.df[[i]])
.df[[i]] <<- as.numeric(.df[[i]])
}

if (grepl('DT$|DTM$|TM$', colnames(df[i])) &
!is.na(formats[as.vector(formats$variable) == colnames(df[i]), 'format'][['format']])) {
attributes(.df[[i]]) <<- orig_attributes
} else {
attributes(.df[[i]]) <- NULL
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data are still converted to numeric R class. The attribute should numeric but the R class should stay date/dttm/time. The numeric date variables should not go through ".df[[i]] <<- as.numeric(.df[[i]])"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your updates. Numeric dates are still coerced to dbl, it should not be the case.

@cpiraux can you give an example dataframe where this happens and what the expected result should be?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it is actually not coerced but the message is still present. I used df and metadata in test-type for the review:
image
image
image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed now @cpiraux

@elimillera can you take a look too? I've implemented your suggestion in today's meeting. Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't use a metacore object but a dataframe for the metadata

metadata <- data.frame(
  dataset = c("adsl", "adsl", "adsl", "adsl"),
  variable = c("USUBJID", "DMDTC", "RFICDT", "RFICDTM"),
  type = c("text", "date", "integer", "integer"),
  format = c(NA, NA, "date9.", "datetime15.")
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jump in the conversation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through and everything looked ok from the code side. I would rely on others to make sure it behaves as expected. I'll be sure to update this in the type.R documentation

I updated the testthat for this function and I see a new issue where in line 205 of test-type.R, the original df and xpt file have different timezones.

── Failure (test-type.R:208:3): xportr_type: date variables are not converted to numeric ──
df$RFICDTM (actual) not equal to df_xpt$RFICDTM (expected).

actual: 1490824800
expected: 1490832000

image

I checked the documentation and it comes from this parameter in the write_xpt function of haven:

image

Is this something to be concerned about? @cpiraux @elimillera @kaz462

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this! Timezone should be ignored. As the default value in haven for adjust_tz = TRUE, I think it is okay. I open dfdates.xpt in SAS and dates are okay. Is it possible to ignore the timezone when you compare the dates in testthat?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpiraux test updated to expect that the as_character() of each datetime are equal so that timezone is ignored.

}
}, is_correct
)
Expand Down