datacarpentry · juanfung · Nov 29, 2024 · Nov 28, 2024
diff --git a/episodes/03-dplyr.Rmd b/episodes/03-dplyr.Rmd
@@ -147,6 +147,10 @@ dataframe to adhere to (e.g. village name is Chirodzo):
 filter(interviews, village == "Chirodzo")
 ```
 
+You may also have noticed that the output from these calls doesn't run off the
+screen anymore. This is one of the advantages of `tbl_df` (also called a tibble),
+the central data class in the tidyverse, over base R data frames.
+
 We can also specify multiple conditions within the `filter()` function. We can
 combine conditions using either "and" or "or" statements. In an "and"
 statement, an observation (row) must meet **every** criteria to be included
@@ -365,9 +369,6 @@ interviews %>%
     summarize(mean_no_membrs = mean(no_membrs))
 ```
 
-You may also have noticed that the output from these calls doesn't run off the
-screen anymore. It's one of the advantages of `tbl_df` over dataframe.
-
 You can also group by multiple columns:
 
 ```{r, purl=FALSE}
@@ -376,7 +377,9 @@ interviews %>%
     summarize(mean_no_membrs = mean(no_membrs))
 ```
 
-Note that the output is a grouped tibble. To obtain an ungrouped tibble, use the
+Note that the output is a grouped tibble of nine rows by three columns,
+as indicated by the first two lines of the output, which start with `#`.
+To obtain an ungrouped tibble, use the
 `ungroup` function:
 
 ```{r, purl=FALSE}
@@ -386,6 +389,8 @@ interviews %>%
     ungroup()
 ```
 
+Notice that the second line starting with `#`, which previously indicated the
+grouping, has disappeared: we now have an ungrouped 9x3 tibble.
 When grouping both by `village` and `membr_assoc`, we see rows in our table for
 respondents who did not specify whether they were a member of an irrigation
 association. We can exclude those data from our table using a filter step.