-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Excel precision #308
Comments
Well, this quickly became more complex than I'd hoped 😦 > library(openxlsx2)
> # this is what I want to get
> chr <- "433256.529740090366"
> chr
[1] "433256.529740090366"
> # from this number
> num <- 433256.529740090366
> num
[1] 433256.5 What I had assumedWe write this tiny data frame and the value is just this. We open the file in Excel. We save it. The value magically changes. Solution: Excel is to blame. Unfortunately this is not entirely correct. What actually happens> wb <- wb_workbook()$add_worksheet()$add_data(x = num)
> wb$worksheets[[1]]$sheet_data$cc$v
[1] "433256.52974009" This is a bit strange, why does this happen? Because we push the number through Line 234 in f6677bf
This is completely unaffected by What to do?Hence, the least thing we should do, is document this behaviour in the package. Unless whoever writes to Excel, converts their values to strings with a specific precision, there is no way it will survive. Another possible way would be to provide a "nsmall" option and use What happens on the Excel side of things?Luckily, Excel doesn't do any better. If I simply copy & paste "433256.52974009002" So there is some truncation going on in Excel as well (as stated by me in the initial SO). And I'm unable to return the final > wb$worksheets[[1]]$sheet_data$cc
r row_r c_r c_s c_t c_cm c_ph c_vm v f f_t f_ref f_ca f_si is
1 A1 1 A 433256.52974009002
2 B1 1 B 1 4.3325652974009E+16 A1*100000000000 Now, when converting this value into a number, the following happens in R: > options(scipen = 200)
> as.numeric("433256.52974009002") * 1e12
[1] 433256529740090048 Ergo we are unable to recreate Excels precision and we should document this as well. |
You know it's complicated when there's an entire Wikipedia article on it: Numeric precision in Microsoft Excel. For data going into Excel, can we just supply how R interprets the values and allow Excel to perform its own truncation? If we're writing these as strings, maybe leveraging that For data coming out of Excel, maybe we can take from |
Well reading is not affected by this. If all you want to do is adding another sheet, there is no conversion to numeric. The conversion only occurs, if we convert the cell character to numeric. So truncation would only occur, if a cell is replaced with a numeric. I tend to not change reading from xlsx and converting to R data frame. The part where we write numbers can use format, that should be simple to integrate. I can open a pull request soon™ |
|
I've pushed a pull request that uses formats, though I'm not convinced, that it is the right solution. As can be seen, we blow up our floating point data quite a bit, even with shorter digits. The underlying issue ofc is that with floating points it's always the question how many digits are required and from what point on is it all just fictional. After all we need to make sure that the more digits we write actually serve a benefit and that we are not only blowing file size up to pad us on the back for our great precision, while opening the door for potential other corner cases. > options(digits = 22)
> #' Convert number to character
> #' @param x input integer or numeric
> #' @noRd
> as_character <- function(x) {
+ is_na <- is.na(x) & !is.nan(x)
+ z <- format(x, trim = TRUE, digits = options()$digits, format = "g")
+ if (any(is_na)) z[is_na] <- NA_character_
+ z
+ }
> # want a bunch of different digits?
> sprintf("%.20f", as.numeric(openxlsx2:::as_character(3123231.32455432123)))
[1] "3123231.32455432135611772537"
> openxlsx2:::as_character(3123231.32455432123)
[1] "3123231.324554320424795" |
See this SO
The text was updated successfully, but these errors were encountered: