[write] skip blanks (#1111)
* [write] skip entirely blank cells

* [tests] update tests

* update NEWS
JanMarvin committed Sep 11, 2024
1 parent 25232fd commit 5777f4b
Showing 4 changed files with 46 additions and 5 deletions.
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,5 +1,9 @@
# openxlsx2 (development version)

+## New features
+
+* When writing a file with `na.strings = NULL`, the file will not contain any reference to the blank cell. Depending on the number of missings in a data set, this can reduce the file size significantly. [1111](https://github.com/JanMarvin/openxlsx2/pull/1111)
+
## Fixes

* The integration of the shared formula feature in the previous release broke the silent extension of dims, if a single cell `dims` was provided for an `x` that was larger than a single cell in `wb_add_formula()`. [1131](https://github.com/JanMarvin/openxlsx2/pull/1131)
23 changes: 20 additions & 3 deletions src/write_file.cpp
@@ -96,6 +96,26 @@ pugi::xml_document xml_sheet_data(Rcpp::DataFrame row_attr, Rcpp::DataFrame cc)
}
}

+// update lastrow
+lastrow = thisrow;
+
+if ( // skip blank cells entirely
+  cc_c_s[i] == "" &&
+  cc_c_t[i] == "" &&
+  cc_c_cm[i] == "" &&
+  cc_c_ph[i] == "" &&
+  cc_c_vm[i] == "" &&
+  cc_v[i] == "" &&
+  cc_f[i] == "" &&
+  cc_f_t[i] == "" &&
+  cc_f_ref[i] == "" &&
+  cc_f_ca[i] == "" &&
+  cc_f_si[i] == "" &&
+  cc_is[i] == ""
+) {
+  continue;
+}
+
// create node <c>
pugi::xml_node cell = row.append_child("c");

@@ -170,9 +190,6 @@ pugi::xml_document xml_sheet_data(Rcpp::DataFrame row_attr, Rcpp::DataFrame cc)
cell.append_copy(is_node.first_child());
}
}

-// update lastrow
-lastrow = thisrow;
}

return doc;
2 changes: 1 addition & 1 deletion tests/testthat/test-save.R
@@ -281,7 +281,7 @@ test_that("write cells without data", {
expect_equal(exp, got)

sheet <- paste0(tmp_dir, "/xl/worksheets/sheet1.xml")
-exp <- "<sheetData><row r=\"2\"><c r=\"B2\"/><c r=\"C2\"/></row><row r=\"3\"><c r=\"B3\"/><c r=\"C3\"/></row></sheetData>"
+exp <- "<sheetData><row r=\"2\"/><row r=\"3\"/></sheetData>"
got <- xml_node(sheet, "worksheet", "sheetData")
expect_equal(exp, got)

22 changes: 21 additions & 1 deletion tests/testthat/test-write.R
@@ -809,7 +809,7 @@ test_that("writing na.strings = NULL works", {
write_xlsx(matrix(NA, 2, 2), tmp, na.strings = NULL)
wb <- wb_load(tmp)

-exp <- ""
+exp <- NA_character_
got <- unique(wb$worksheets[[1]]$sheet_data$cc$v[3:6])
expect_equal(exp, got)

@@ -1239,3 +1239,23 @@ test_that("sheet is a valid argument in write_xlsx", {
wb2 <- write_xlsx(x = mtcars, sheet = "data")
expect_equal(wb1$get_sheet_names(), wb2$get_sheet_names())
})

+test_that("skipping entirely blank cells works", {
+
+  tp <- temp_xlsx()
+
+  mm <- matrix(1:9, 3, 3)
+  mm[diag(mm)] <- NA
+
+  wb1 <- write_xlsx(x = mm, file = tp, col_names = FALSE, na.strings = NULL)
+  cc1 <- wb1$worksheets[[1]]$sheet_data$cc
+  got1 <- cc1[cc1$r %in% c("A1", "B2", "C3"), ]
+
+  wb2 <- wb_load(tp)
+  cc2 <- wb2$worksheets[[1]]$sheet_data$cc
+  got2 <- cc2[cc2$r %in% c("A1", "B2", "C3"), ]
+
+  expect_equal(3, nrow(got1))
+  expect_equal(0, nrow(got2))
+
+})
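
For readers who want to see the effect of the new check in `src/write_file.cpp` in isolation, here is a minimal standalone sketch. It is not the package's code: it uses no Rcpp or pugixml, builds the XML by string concatenation, and models only four of the twelve fields the real condition tests. The names `Cell`, `is_blank` and `render_row` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for one record of the `cc` data frame.
// The real check in the commit also covers cm, ph, vm, f, f_t, f_ref,
// f_ca, f_si and is; they are left out here to keep the sketch short.
struct Cell {
  std::string r;  // cell reference, e.g. "B2"
  std::string s;  // style index (c_s in the commit)
  std::string t;  // cell type (c_t in the commit)
  std::string v;  // cell value
};

// A cell counts as entirely blank when it has no style, no type and no value.
bool is_blank(const Cell& c) {
  return c.s.empty() && c.t.empty() && c.v.empty();
}

// Serialize one <row>, leaving out the <c> node of every blank cell.
// A row whose cells are all blank collapses to a self-closing <row/>.
std::string render_row(int row, const std::vector<Cell>& cells) {
  assert(row > 0);  // spreadsheet rows are 1-based

  std::string body;
  for (const Cell& c : cells) {
    if (is_blank(c)) continue;  // same idea as the `continue` in the diff

    body += "<c r=\"" + c.r + "\"";
    if (!c.s.empty()) body += " s=\"" + c.s + "\"";
    if (!c.t.empty()) body += " t=\"" + c.t + "\"";
    if (c.v.empty()) {
      body += "/>";  // styled or typed cell without a value
    } else {
      body += "><v>" + c.v + "</v></c>";
    }
  }

  const std::string open = "<row r=\"" + std::to_string(row) + "\"";
  return body.empty() ? open + "/>" : open + ">" + body + "</row>";
}
```

Under these assumptions, a row holding only the blank cells `B2` and `C2` renders as `<row r="2"/>`, which is the shape the updated expectation in `tests/testthat/test-save.R` checks for. A cell that carries only a style index is not blank and is still written.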
