Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

most recent version of Microsoft Excel for Mac does export CSV files with "\r" as the line ending #77

Closed
cjyetman opened this issue Feb 2, 2024 · 5 comments · Fixed by #78

Comments

@cjyetman
Copy link
Member

cjyetman commented Feb 2, 2024

On Microsoft Excel for Mac v16.81 (presumably the most recent version on my work laptop), when exporting a CSV using the "Macintosh Comma Separated (.csv)" option to export, it creates a file with only "\r" as the line ending (interestingly, in this particular case, it does not add any line ending or newline to the last line).

Similar behavior has been reported over the years (2013-2020)
https://nicercode.github.io/blog/2013-04-30-excel-and-line-endings/
https://answers.microsoft.com/en-us/msoffice/forum/all/excel-office-365-mac-not-saving-utf-8-csv-files/ba44fced-7da7-418c-b230-f24726b8ecf7
https://developmentality.wordpress.com/2010/12/06/excel-2008-for-macs-csv-bug/

Exporting from Excel in this way is something we have advised COP members to do in the recent past.

This suggests to me that getting CSV files that have a line ending with only "\r" at the end is a very real possibility.

Because of this, I think we need to revert #65 and part of #58.

@AlexAxthelm @jdhoffa

@jdhoffa
Copy link
Member

jdhoffa commented Feb 2, 2024

This is very much not my area of expertise! haha

@AlexAxthelm
Copy link
Collaborator

Every time I think things are okay, Excel provides new ways of making life difficult.

this is fine, dog on fire

@jdhoffa
Copy link
Member

jdhoffa commented Feb 2, 2024

For posterity, @cjyetman's comment warned of exactly this: #65 (review)

@AlexAxthelm
Copy link
Collaborator

@cjyetman When I try to run a CSV generated with excel (Attached Book1.csv), I get warnings, but it still returns a valid portfolio data.frame.

The warnings appear to be coming from the new logic in guess_numerical_mark() implemented in #75 to deal with one-line files.

only_one_line <- length(readLines(con = filepath, n = 2L)) < 2

it's the same "readLines() expects a trailing \n" issue that prompted the newline check in the first place. Not sure what the best way to deal with this is, but I think file_has_newline_at_end() returning FALSE for this file is the correct behavior.

[ins] R4.3.2> library(pacta.portfolio.import)

[ins] R4.3.2> read_portfolio_csv("~/Downloads/Book1.csv")
# A tibble: 1 × 3
  isin         market_value currency
  <chr>               <dbl> <chr>
1 XS1088274672        1000. CHF
Warning messages:
1: In readLines(con = filepath, n = 2L) :
  incomplete final line found on '/Users/aaxthelm/Downloads/Book1.csv'
2: In readLines(con = filepath, n = 2L) :
  incomplete final line found on '/Users/aaxthelm/Downloads/Book1.csv'

[ins] R4.3.2> options(warn = 2, error = utils::recover)

[ins] R4.3.2> read_portfolio_csv("~/Downloads/Book1.csv")
Error in readLines(con = filepath, n = 2L) :
  (converted from warning) incomplete final line found on '/Users/aaxthelm/Downloads/Book1.csv'

Enter a frame number, or 0 to exit

 1: read_portfolio_csv("~/Downloads/Book1.csv")
 2: vapply(X = filepaths, FUN = function(filepath) {
    pb$tick(tokens = list(what = basename(filepath)))
    if (!is_text_file(filepat
 3: FUN(X[[i]], ...)
 4: guess_decimal_mark(filepath)
 5: guess_numerical_mark(filepaths, type = "decimal")
 6: vapply(X = filepaths, FUN = function(filepath) {
    if (!is_file_accessible(filepath) || !is_text_file(filepath)) {
        return(
 7: FUN(X[[i]], ...)
 8: readLines(con = filepath, n = 2)
 9: .signalSimpleWarning("incomplete final line found on '/Users/aaxthelm/Downloads/Book1.csv'", base::quote(readLines(con = filepath,
10: withRestarts({
    .Internal(.signalCondition(simpleWarning(msg, call), msg, call))
    .Internal(.dfltWarn(msg, call))
}, muffleWarn
11: withOneRestart(expr, restarts[[1]])
12: doWithOneRestart(return(expr), restart)


[ins] Selection: 0
[ins] R4.3.2> pacta.portfolio.import:::file_has_newline_at_end("~/Downloads/Book1.csv")
[1] FALSE

[ins] R4.3.2>

@cjyetman
Copy link
Member Author

cjyetman commented Feb 2, 2024

I'm aware of the warning, but it's a small price to pay to maintain previous behavior, especially with an impending COP project, and fairly certain memory of having seen files like this in the wild. I'll open an issue to squelch the warning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants