[write] table column names have to be characters #1071

JanMarvin · 2024-07-02T20:07:07Z

The error is triggered by factor variables. Our logic tries to convert factors to strings, but it goes a step further and also tries to write column headers as numbers. This breaks tables, because table headers require character column names.

JanMarvin · 2024-07-02T21:27:17Z

src/helper_functions.cpp

@@ -524,8 +526,13 @@ void wide_to_long(
      if (ref_str.compare("0") == 0)
      ref_str = col + row;

-      // factors can be numeric or string or both
-      if (vtyp == factor) string_nums = true;


Previously, once a factor column had set string_nums, the flag was never reset.
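To illustrate the kind of leak described here, a minimal standalone sketch follows. It is not the code from src/helper_functions.cpp: `ColType`, `Cell`, `looks_numeric`, and both `numeric_flags_*` functions are hypothetical names, and the assumption that cells are visited column by column with the header first is mine. It only shows how a flag that is set inside a loop and never reset can affect later cells, and how deciding per cell avoids that.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the internal column type codes.
enum class ColType { character, numeric, factor };

// One cell to be written: its type and its text representation.
struct Cell {
  ColType type;
  std::string text;
};

// True if the text consists only of digits, e.g. "2011".
bool looks_numeric(const std::string& s) {
  if (s.empty()) return false;
  for (char c : s) {
    if (c < '0' || c > '9') return false;
  }
  return true;
}

// Sticky variant: the first factor cell flips string_nums for the rest
// of the loop, so later character cells (such as a header "2011") are
// flagged for conversion to numbers as well.
std::vector<bool> numeric_flags_sticky(const std::vector<Cell>& cells,
                                       bool string_nums) {
  std::vector<bool> out;
  for (const Cell& c : cells) {
    if (c.type == ColType::factor) string_nums = true;  // never reset
    out.push_back(string_nums && looks_numeric(c.text));
  }
  return out;
}

// Scoped variant: the decision is made per cell and the caller's
// option is left untouched, so a factor cannot affect later cells.
std::vector<bool> numeric_flags_scoped(const std::vector<Cell>& cells,
                                       bool string_nums) {
  std::vector<bool> out;
  for (const Cell& c : cells) {
    const bool convert = string_nums || c.type == ColType::factor;
    out.push_back(convert && looks_numeric(c.text));
  }
  return out;
}
```

With the cells `CATEGORIE` (character), `25 - 35 JAAR` (factor), `2011` (character) and `string_nums = false`, the sticky variant flags the `2011` header for numeric conversion, while the scoped variant leaves it as text.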

JanMarvin · 2024-07-02T21:37:37Z

Example from this Stack Overflow question:

library(openxlsx2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# workbook for output
wb2 <- wb_workbook()$add_worksheet("PERS35")

# fl <- "https://duo.nl/open_onderwijsdata/images/02.-onderwijspersoneel-po-in-fte-2011-2023.xlsx"
fl <- "~/Downloads/02.-onderwijspersoneel-po-in-fte-2011-2023.xlsx"

# maybe this sheet?
df_fte <- wb_to_df(fl, sheet = "owtype-best-instelling-functie") %>% 
  mutate(across(where(is.character), ~ na_if(., "*")))

# some data wrangling. Throws a warning ... ???
df_fte <- df_fte %>% 
  mutate_at(vars(2:274), as.numeric)%>% 
  select(1:222) %>% 
  tidyr::pivot_longer(cols = matches("\\d{4}$"), names_to = "JAAR_", values_to = "AANTAL") %>%
  mutate(CATEGORIE = stringr::str_sub(JAAR_, end=-5)) %>% 
  mutate(JAAR = stringr::str_sub(JAAR_, start=-4)) %>% 
  group_by(ONDERWIJSTYPE, CATEGORIE, JAAR) %>% 
  summarise(AANTAL = sum(AANTAL, na.rm=TRUE))
#> Warning: There were 2 warnings in `mutate()`.
#> The first warning was:
#> ℹ In argument: `INSTELLINGSCODE = .Primitive("as.double")(INSTELLINGSCODE)`.
#> Caused by warning:
#> ! NAs introduced by coercion
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
#> `summarise()` has grouped output by 'ONDERWIJSTYPE', 'CATEGORIE'. You can
#> override using the `.groups` argument.

# some more data wrangling
df_pers35 <- 
  df_fte %>% 
  mutate(CATEGORIE=
           factor(
             case_when(
               (CATEGORIE=="FTE'S PERSONEN JONGER DAN 15 JAAR "| 
                  CATEGORIE=="FTE'S PERSONEN 15 - 25 JAAR ")           ~ "JONGER DAN 25 JAAR",
               CATEGORIE=="FTE'S PERSONEN 25 - 35 JAAR "            ~ "25 - 35 JAAR",
               CATEGORIE=="FTE'S PERSONEN 35 - 45 JAAR "            ~ "35 - 45 JAAR",
               CATEGORIE=="FTE'S PERSONEN 45 - 55 JAAR "            ~ "45 - 55 JAAR",
               CATEGORIE=="FTE'S PERSONEN 55 - 65 JAAR "            ~ "55 - 65 JAAR",
               CATEGORIE=="FTE'S PERSONEN 65 JAAR EN OUDER "        ~ "65 JAAR EN OUDER"), 
             levels=c("JONGER DAN 25 JAAR", "25 - 35 JAAR", "35 - 45 JAAR", "45 - 55 JAAR", "55 - 65 JAAR", "65 JAAR EN OUDER")
           )
  ) %>% 
  filter(CATEGORIE=="JONGER DAN 25 JAAR"|
           CATEGORIE=="25 - 35 JAAR"|
           CATEGORIE=="35 - 45 JAAR"|
           CATEGORIE=="45 - 55 JAAR"|
           CATEGORIE=="55 - 65 JAAR"|
           CATEGORIE=="65 JAAR EN OUDER") %>%
  group_by(CATEGORIE, JAAR) %>% 
  summarise(aantal=round(sum(AANTAL, na.rm=TRUE),1), .groups = 'drop') %>% 
  tidyr::spread(JAAR, aantal)

if (is.null(wb_to_df(wb2, sheet = "PERS35", dims = wb_dims(cols = "B", rows = 2)))){
  wb2 <- wb_add_data_table(
    wb = wb2,
    x = df_pers35,
    dims = "B2",
    banded_rows = TRUE,
    table_style = "TableStyleLight16"
  ) %>% 
    wb_add_fill(sheet = "PERS35", dims = "B2:B8", color = wb_color("green"))%>% 
    wb_add_fill(sheet = "PERS35", dims = "K2:O8", color = wb_color("green"))
}
#> sheet found, but contains no data

if (interactive()) wb2$open()

JanMarvin · 2024-07-02T21:42:11Z

I'll merge this PR, but if you want to run tests before the release, the handling of factors and the string_nums option might be worth toying around with, @olivroy. I tend to avoid factors wherever I can, which is probably why this one slipped through.

JanMarvin force-pushed the fix_table_colname branch 2 times, most recently from e7b2c74 to 6ed4340 Compare July 2, 2024 21:25

[write] table column names have to be characters

de88436

JanMarvin force-pushed the fix_table_colname branch from 6ed4340 to de88436 Compare July 2, 2024 21:28

JanMarvin merged commit e639220 into main Jul 2, 2024
9 checks passed

JanMarvin deleted the fix_table_colname branch July 2, 2024 21:43
