Merge pull request #8 from rostools/switch-to-list-rbind

Switch to list rbind
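
For context, the core change in this merge replaces `purrr::map_dfr(..., .id = )` with `purrr::map()` followed by `purrr::list_rbind(names_to = )`. Below is a minimal sketch of that pattern, assuming purrr 1.0.0 or later is installed. The file paths and the `import_function()` body are hypothetical stand-ins, not code from this repository.

```r
library(purrr)

# Hypothetical stand-in for the output of `fs::dir_ls()`, which returns
# a character vector of paths that is named by those same paths.
data_files <- c("user_1/a.csv", "user_2/a.csv")
names(data_files) <- data_files

# Hypothetical import function; a real one would read the file contents.
import_function <- function(path) {
  data.frame(n_chars = nchar(path))
}

# `map()` returns a named list of data frames; `list_rbind()` stacks them
# and stores each element's name in the column given by `names_to`.
combined_data <- map(data_files, import_function) %>%
  list_rbind(names_to = "file_path_id")
```

The `names_to` argument plays the role that `.id` played in `map_dfr()`, so downstream code that relies on the `file_path_id` column should keep working, provided the input vector is named.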
rostools · Jun 13, 2023 · 6d875be
2 parents 3527710 + 02ed859
commit 6d875be
Showing 7 changed files with 195 additions and 172 deletions.
diff --git a/_variables.yml b/_variables.yml
@@ -1,7 +1,8 @@
+# Automatically created by `r3admin::copy_common_file('_variables.yml')` on 2023-06-07.
 keybind:
   palette: '{{< kbd linux=Ctrl-Shift-P mac=Cmd-Shift-P win=Ctrl-Shift-P >}}'
   git: '{{< kbd linux=Ctrl-Shift-M mac=Cmd-Shift-M win=Ctrl-Shift-M >}} or with the Palette ({{< var keybind.palette >}}, then type "commit")'
-  chunk: '{{< kbd linux=Ctrl-Shift-I mac=Cmd-Shift-I win=Ctrl-Shift-I >}} or with the Palette ({{< var keybind.palette >}}, then type "new chunk")'
+  chunk: '{{< kbd linux=Ctrl-Shift-I mac=Cmd-Option-I win=Ctrl-Shift-I >}} or with the Palette ({{< var keybind.palette >}}, then type "new chunk")'
   restart-r: '{{< kbd linux=Ctrl-Shift-F10 mac=Cmd-Shift-F10 win=Ctrl-Shift-F10 >}} or with the Palette ({{< var keybind.palette >}}, then type "restart")'
   source: '{{< kbd linux=Ctrl-Shift-S mac=Cmd-Shift-S win=Ctrl-Shift-S >}} or with the Palette ({{< var keybind.palette >}}, then type "source")'
   render: '{{< kbd linux=Ctrl-Shift-K mac=Cmd-Shift-K win=Ctrl-Shift-K >}} or with the Palette ({{< var keybind.palette >}}, then type "render")'

diff --git a/sessions/dplyr-joins.qmd b/sessions/dplyr-joins.qmd
@@ -42,7 +42,7 @@ wonderful package to use for working with character data is called
 `{stringr}`, which we'll use to extract the user ID from the
 `file_path_id` column.
 
-The main driver behind the functions in stringr are [regular
+The main driver behind the functions in `{stringr}` are [regular
 expressions](https://en.wikipedia.org/wiki/Regular_expression) (or regex
 for short). These expressions are powerful, very concise ways of finding
 patterns in text. Because they are so concise, though, they are also
@@ -121,25 +121,34 @@ another number from 0 to 9.
 
 Now that we've identified a possible regex to use to extract the user
 ID, let's test it out on the `user_info_df` data. Once it works, we will
-convert it into a function and move it into the `R/functions.R` file.
+convert it into a function and move (**cut and paste**) it into the
+`R/functions.R` file.
 
 Since we will create a new column for the user ID, we will use the
 `mutate()` function from the `{dplyr}` package. We'll use the
 `str_extract()` function from the `{stringr}` package to "extract a
 string" by using the regex `user_[1-9][0-9]?` that we discussed from the
-exercise. We're also using an argument to `mutate()` you might not have
-seen previously, called `.before`. This will insert the new `user_id`
-column before the column we use and we do this entirely for visual
-reasons, since it is easier to see the newly created column when we run
-the code. In your `doc/learning.qmd` file, create a new header called
+exercise. Since we're going to use `{stringr}`, so let's add it as a
+package dependency:
+
+```{r}
+#| eval: false
+usethis::use_package("stringr")
+```
+
+We're also using an argument to `mutate()` you might not have seen
+previously, called `.before`. This will insert the new `user_id` column
+before the column we use and we do this entirely for visual reasons,
+since it is easier to see the newly created column when we run the code.
+In your `doc/learning.qmd` file, create a new header called
 `## Using regex for user ID` at the bottom of the document, and create a
 new code chunk below that.
 
 ::: {.callout-note appearance="minimal" collapse="true"}
 ## Instructor note
 
 Walk through writing this code, briefly explain/remind how to use
-mutate, and about the stringr function.
+mutate, and about the `{stringr}` function.
 :::
 
 ```{r extract-user-id}
@@ -194,11 +203,12 @@ and extracts the user ID from it. **First step**: While in the
 same process you've done previously.
 
 1.  Call the new function `extract_user_id` and add one argument called
-    `imported_data`. - Remember to output the code into an object and
-    `return()` it at the end of the function. - Include Roxygen
-    documentation.
-2.  After writing it and testing that the function works, move the
-    function into `R/functions.R`.
+    `imported_data`.
+    -   Remember to output the code into an object and `return()` it at
+        the end of the function.
+    -   Include Roxygen documentation.
+2.  After writing it and testing that the function works, move (**cut**
+    and paste) the function into `R/functions.R`.
 3.  Run `{styler}` while in the `R/functions.R` file with
     {{< var keybind.styler >}}.
 4.  Replace the code in the `doc/learning.qmd` file with the function
@@ -272,8 +282,8 @@ file path variable, we need to actually use it within our processing
 pipeline. Since we want this function to work on all the datasets that
 we will import, we need to add it to the `import_multiple_files()`
 function. We'll go to the `import_multiple_files()` function in
-`R/functions.R` and use the `%>%` to add it after using the `map_dfr()`
-function. The code should look something like:
+`R/functions.R` and use the `%>%` to add it after using the
+`list_rbind()` function. The code should look something like:
 
 ```{r add-extract-user-to-import}
 import_multiple_files <- function(file_pattern, import_function) {
@@ -282,9 +292,8 @@ import_multiple_files <- function(file_pattern, import_function) {
     recurse = TRUE
   )
 
-  combined_data <- purrr::map_dfr(data_files, import_function,
-    .id = "file_path_id"
-  ) %>%
+  combined_data <- purrr::map(data_files, import_function) %>%
+    purrr::list_rbind(names_to = "file_path_id") %>%
     extract_user_id() # Add the function here.
   return(combined_data)
 }
@@ -310,18 +319,18 @@ to use `user_id` instead of `file_path_id`:
 summarised_rr_df <- rr_df %>%
   group_by(user_id, day) %>%
   summarise(across(ibi_s, list(
-    mean = ~ mean(.x, na.rm = TRUE), 
+    mean = ~ mean(.x, na.rm = TRUE),
     sd = ~ sd(.x, na.rm = TRUE)
-  ))) %>% 
+  ))) %>%
   ungroup()
 
 summarised_actigraph_df <- actigraph_df %>%
   group_by(user_id, day) %>%
   # These statistics will probably be different for you
   summarise(across(hr, list(
-    mean = ~ mean(.x, na.rm = TRUE), 
+    mean = ~ mean(.x, na.rm = TRUE),
     sd = ~ sd(.x, na.rm = TRUE)
-  ))) %>% 
+  ))) %>%
   ungroup()
 ```
 
@@ -439,16 +448,27 @@ columns and so can't be a vector itself), we need to combine the
 datasets together in a `list()` and reduce them with `full_join()`.
 :::
 
+Let's code this together, using `reduce()`, `full_join()`, and `list()`
+while in the `doc/learning.qmd` file.
+
 ```{r}
-combined_data <- reduce(list(user_info_df, saliva_df), full_join)
-combined_data
+list(
+  user_info_df,
+  saliva_df
+) %>%
+  reduce(full_join)
 ```
 
 We now have the data in a form that would make sense to join it with the
 other datasets. So lets try it:
 
 ```{r}
-reduce(list(user_info_df, saliva_df, summarised_rr_df), full_join)
+list(
+  user_info_df,
+  saliva_df,
+  summarised_rr_df
+) %>%
+  reduce(full_join)
 ```
 
 Hmm, but wait, we now have four rows of each user, when we should have
@@ -568,7 +588,13 @@ saliva_with_day_df
 ...Now, let's use the `reduce()` with `full_join()` again:
 
 ```{r}
-reduce(list(user_info_df, saliva_with_day_df, summarised_rr_df), full_join)
+list(
+  user_info_df,
+  saliva_df,
+  summarised_rr_df,
+  summarised_actigraph_df
+) %>%
+  reduce(full_join)
 ```
 
 We now have two rows per participant! Let's add and commit the changes
@@ -581,14 +607,13 @@ to bring it all together and put it into the `data-raw/mmash.R` script
 so we can create a final working dataset.
 
 Open up the `data-raw/mmash.R` file and the top of the file, add the
-`{vroom}` package to the end of the list of other packages. Move the
-code `library(fs)` to go with the other packages as well. It should look
-something like this now:
+`{tidyverse}` package to the end of the list of other packages if it
+isn't there already. Move the code `library(fs)` to go with the other
+packages as well. It should look something like this now:
 
 ```{r}
 library(here)
 library(tidyverse)
-library(vroom)
 library(fs)
 ```
 
@@ -631,15 +656,13 @@ saliva_with_day_df <- saliva_df %>%
     TRUE ~ NA_real_
   ))
 
-mmash <- reduce(
-  list(
-    user_info_df,
-    saliva_with_day_df,
-    summarised_rr_df,
-    summarised_actigraph_df
-  ),
-  full_join
-)
+mmash <- list(
+  user_info_df,
+  saliva_df,
+  summarised_rr_df,
+  summarised_actigraph_df
+) %>%
+  reduce(full_join)
 ```
 
 Lastly, we have to save this final dataset into the `data/` folder.