From 6681fe04dd1d792dff36522ba348f34c2cf63e1b Mon Sep 17 00:00:00 2001
From: Maximilian Frank <48761677+XAM12@users.noreply.github.com>
Date: Fri, 29 Nov 2024 00:55:58 +0100
Subject: [PATCH] Update 03-dplyr.Rmd

proposed solution to: https://github.com/datacarpentry/r-socialsci/issues/527

Moved the line about the advantage of tbl_df to the example where the first tibble-based output is created, as it is not related to the summarize function.
Added an explanation of how to identify grouped vs. ungrouped tibbles in the output.
---
 episodes/03-dplyr.Rmd | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/episodes/03-dplyr.Rmd b/episodes/03-dplyr.Rmd
index bd9394e8..02c448d2 100644
--- a/episodes/03-dplyr.Rmd
+++ b/episodes/03-dplyr.Rmd
@@ -147,6 +147,10 @@ dataframe to adhere to (e.g. village name is Chirodzo):
 filter(interviews, village == "Chirodzo")
 ```
 
+You may also have noticed that the output from these calls doesn't run off the
+screen anymore. This is one of the advantages of a `tbl_df` (also called a tibble),
+the central data class in the tidyverse, over normal dataframes in R.
+
 We can also specify multiple conditions within the `filter()` function. We can
 combine conditions using either "and" or "or" statements. In an "and"
 statement, an observation (row) must meet **every** criteria to be included
@@ -365,9 +369,6 @@ interviews %>%
     summarize(mean_no_membrs = mean(no_membrs))
 ```
 
-You may also have noticed that the output from these calls doesn't run off the
-screen anymore. It's one of the advantages of `tbl_df` over dataframe.
-
 You can also group by multiple columns:
 
 ```{r, purl=FALSE}
@@ -376,7 +377,9 @@ interviews %>%
     summarize(mean_no_membrs = mean(no_membrs))
 ```
 
-Note that the output is a grouped tibble. To obtain an ungrouped tibble, use the
+Note that the output is a grouped tibble of nine rows by three columns,
+as indicated by the first two lines of the output, which start with `#`.
+To obtain an ungrouped tibble, use the
 `ungroup` function:
 
 ```{r, purl=FALSE}
@@ -386,6 +389,8 @@ interviews %>%
     ungroup()
 ```
 
+The second `#` line, which indicated the grouping, is gone: this is an ungrouped 9x3 tibble.
+
 When grouping both by `village` and `membr_assoc`, we see rows in our table for
 respondents who did not specify whether they were a member of an irrigation
 association. We can exclude those data from our table using a filter step.