From b496c68896bfd54a5f43fb48c0420b179b457495 Mon Sep 17 00:00:00 2001
From: Jason Williams
Date: Wed, 10 Apr 2024 12:17:45 -0400
Subject: [PATCH 1/5] added challenges and explanations on claases and modes for #55

---
 episodes/01-r-basics.Rmd | 81 +++++++++++++++++++++--
 episodes/03-basics-factors-dataframes.Rmd | 49 ++++++++++++--
 2 files changed, 117 insertions(+), 13 deletions(-)

diff --git a/episodes/01-r-basics.Rmd b/episodes/01-r-basics.Rmd
index 73fb850d..3336219f 100644
--- a/episodes/01-r-basics.Rmd
+++ b/episodes/01-r-basics.Rmd
@@ -252,16 +252,42 @@ longer exists.
 Error: object 'gene_name' not found
 ```

-## Understanding object data types (modes)
+## Understanding object data types (classes and modes)

-In R, **every object has two properties**:
+In R, **every objects have several properties**:

 - **Length**: How many distinct values are held in that object
 - **Mode**: What is the classification (type) of that object.
+- **Class**: A property assigned to an object that determines how a function
+ will operate on it.

 We will get to the "length" property later in the lesson. The **"mode" property**
-**corresponds to the type of data an object represents**. The most common modes
-you will encounter in R are:
+**corresponds to the type of data an object represents**. and the **"class" property determines how functions will work with that object.**
+
+
+::::::::::::::::::::::::::::::::::::::::: callout
+
+## Tip: Classess vs. modes
+
+The difference between modes and classess is a bit **confusing** and the subject of
+several [online discussions](https://stackoverflow.com/questions/35445112/what-is-the-difference-between-mode-and-class-in-r).
+Often, these terms are used interchangeably. Do you really need to know
+the difference?
+
+Well, perhaps. This section is important for you to have a better understanding
+of how R works and how to write usable code.
However, you might not come across
+a situation where the difference is crucial while you are taking your first steps
+in learning R. However, the overarching concept—**that objects in R have these properties and that you can use functions to check or change them**—is very important!
+
+In this lesson we will mostly stick to **mode** but we will throw in a few
+examples of the `class()` and `typeof()` so you can see some examples of where
+it may make a difference.
+
+::::::::::::::::::::::::::::::::::::::::::::::::::
+
+
+
+The most common modes you will encounter in R are:

 | Mode (abbreviation) | Type of data |
 | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
@@ -276,9 +302,9 @@ Data types are familiar in many programming languages, but also in natural
 language where we refer to them as the parts of speech, e.g. nouns, verbs,
 adverbs, etc. Once you know if a word - perhaps an unfamiliar one - is a noun,
 you can probably guess you can count it and make it plural if there is more than
-one (e.g. 1 [Tuatara](https://en.wikipedia.org/wiki/Tuatara), or 2 Tuataras). If
+one (e.g, 1 [Tuatara](https://en.wikipedia.org/wiki/Tuatara), or 2 Tuataras). If
 something is a adjective, you can usually change it into an adverb by adding
-"-ly" (e.g. [jejune](https://www.merriam-webster.com/dictionary/jejune) vs.
+"-ly" (e.g., [jejune](https://www.merriam-webster.com/dictionary/jejune) vs.
 jejunely). Depending on the context, you may need to decide if a word is in one
 category or another (e.g "cut" may be a noun when it's on your finger, or a verb
 when you are preparing vegetables).
These concepts have important analogies when
@@ -325,6 +351,44 @@ mode(pilot)

 ::::::::::::::::::::::::::::::::::::::::::::::::::

+::::::::::::::::::::::::::::::::::::::: challenge
+
+
+## Exercise: Create objects and check their class using "class"
+
+Using the objects created in the previous challenge, use the `class()` function
+to check their classes.
+
+::::::::::::::: solution
+
+## Solution
+
+```{r, echo=FALSE, purl=FALSE}
+chromosome_name <- 'chr02'
+od_600_value <- 0.47
+chr_position <- '1001701'
+spock <- TRUE
+
+```
+
+
+```{r, purl=FALSE}
+class(chromosome_name)
+class(od_600_value)
+class(chr_position)
+class(spock)
+```
+
+```{r, purl=FALSE}
+class(pilot)
+```
+
+:::::::::::::::::::::::::
+
+::::::::::::::::::::::::::::::::::::::::::::::::::
+
+Notice that in the two challenges, `mode()` and `class()` return the same results. This time.
+
 Notice from the solution that even if a series of numbers is given as a value
 R will consider them to be in the "character" mode if they are enclosed as
 single or double quotes. Also, notice that you cannot take a string of alphanumeric
@@ -340,6 +404,11 @@ pilot <- "Earhart"
 mode(pilot)
 ```

+```{r, purl=FALSE}
+pilot <- "Earhart"
+typeof(pilot)
+```
+
 ## Mathematical and functional operations on objects

 Once an object exists (which by definition also means it has a mode), R can
diff --git a/episodes/03-basics-factors-dataframes.Rmd b/episodes/03-basics-factors-dataframes.Rmd
index 77d5de90..fea4cec7 100644
--- a/episodes/03-basics-factors-dataframes.Rmd
+++ b/episodes/03-basics-factors-dataframes.Rmd
@@ -146,11 +146,11 @@ frequently, you may be surprised at what you find they can do.

 ## Tip: Why does ?read.csv open the documentations to read.table?

-The reason for this is because `read.csv` is actually a short cut
-for `read.table("file.csv", sep = ",")`.
You can see in the help
-documentation that there are several additional variations of
-`read.table`, such as `read.csv2` to read tables separated by `;`
-and `read.delim` to read in tables separated by `\t` (tabs). If you know how your table is separated, you can use one of the provided short cuts,
+The reason for this is because `read.csv` is actually a short cut
+for `read.table("file.csv", sep = ",")`. You can see in the help
+documentation that there are several additional variations of
+`read.table`, such as `read.csv2` to read tables separated by `;`
+and `read.delim` to read in tables separated by `\t` (tabs). If you know how your table is separated, you can use one of the provided short cuts,
 but case you run into an unconventional separator you are now equipt with the knowledge to define it in the
 `sep = ` arugument of `read.table`!

@@ -234,6 +234,43 @@ Ok, thats a lot up unpack! Some things to notice.
 by the object mode (e.g. chr, int, etc.). Notice that before each
 variable name there is a `$` - this will be important later.

+
+
+
+ ::::::::::::::::::::::::::::::::::::::: challenge
+
+ ## Exercise: Revisiting modes and classess
+
+ Remeber when we said mode and class are sometimes different? If you do, here
+ is a chance to check. What happens when you try the following?
+
+ 1. `mode(variants)`
+ 2. `class(variants)`
+
+ ::::::::::::::: solution
+
+ ## Solution
+
+
+
+ ```{r, purl=FALSE}
+ mode(variants)
+ ```
+
+
+
+ ```{r, purl=FALSE}
+ class(variants)
+ ```
+
+ This result makes sense because mode (which deals with how an object is stored)
+ is treated as a **list** in R. A data frame is in some sense a "fancy" list.
+ However, data fames do have some specific properties so they have their own
+ class (**data.frame**) which is useful for functions (and programmers) to know.
+ :::::::::::::::::::::::::
+
+ ::::::::::::::::::::::::::::::::::::::::::::::::::
+
+
 ## Introducing Factors

 Factors are the final major data structure we will introduce in our R genomics
@@ -860,5 +897,3 @@ write.csv(Ecoli_metadata, file = "exercise_solution.csv")
 - Base R has many useful functions for manipulating your data, but all of R's capabilities are greatly enhanced by software packages developed by the community

 ::::::::::::::::::::::::::::::::::::::::::::::::::
-
-
From 08dffbccec4905dc835070043c180b83cce0fd29 Mon Sep 17 00:00:00 2001
From: Naupaka Zimmerman
Date: Wed, 10 Apr 2024 09:27:19 -0700
Subject: [PATCH 2/5] Update 01-r-basics.Rmd

Fix typos
---
 episodes/01-r-basics.Rmd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/episodes/01-r-basics.Rmd b/episodes/01-r-basics.Rmd
index e0e2905c..b5976195 100644
--- a/episodes/01-r-basics.Rmd
+++ b/episodes/01-r-basics.Rmd
@@ -254,7 +254,7 @@ Error: object 'gene_name' not found

 ## Understanding object data types (classes and modes)

-In R, **every objects have several properties**:
+In R, **every object has several properties**:

 - **Length**: How many distinct values are held in that object
 - **Mode**: What is the classification (type) of that object.
@@ -262,7 +262,7 @@ In R, **every objects have several properties**:
  will operate on it.

 We will get to the "length" property later in the lesson. The **"mode" property**
-**corresponds to the type of data an object represents**.
and the **"class" property determines how functions will work with that object.**
+**corresponds to the type of data an object represents** and the **"class" property determines how functions will work with that object.**


 ::::::::::::::::::::::::::::::::::::::::: callout
From 7dd494fb0cfdcaaf6bf87b43f863926608aa0530 Mon Sep 17 00:00:00 2001
From: Naupaka Zimmerman
Date: Wed, 10 Apr 2024 09:31:14 -0700
Subject: [PATCH 3/5] Update 01-r-basics.Rmd

Fix typos
---
 episodes/01-r-basics.Rmd | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/episodes/01-r-basics.Rmd b/episodes/01-r-basics.Rmd
index b5976195..1f7c9378 100644
--- a/episodes/01-r-basics.Rmd
+++ b/episodes/01-r-basics.Rmd
@@ -269,7 +269,7 @@ We will get to the "length" property later in the lesson. The **"mode" property*

 ## Tip: Classess vs. modes

-The difference between modes and classess is a bit **confusing** and the subject of
+The difference between modes and classes is a bit **confusing** and the subject of
 several [online discussions](https://stackoverflow.com/questions/35445112/what-is-the-difference-between-mode-and-class-in-r).
 Often, these terms are used interchangeably. Do you really need to know
 the difference?
@@ -302,7 +302,7 @@ Data types are familiar in many programming languages, but also in natural
 language where we refer to them as the parts of speech, e.g. nouns, verbs,
 adverbs, etc. Once you know if a word - perhaps an unfamiliar one - is a noun,
 you can probably guess you can count it and make it plural if there is more than
-one (e.g, 1 [Tuatara](https://en.wikipedia.org/wiki/Tuatara), or 2 Tuataras). If
+one (e.g., 1 [Tuatara](https://en.wikipedia.org/wiki/Tuatara), or 2 Tuataras). If
 something is a adjective, you can usually change it into an adverb by adding
 "-ly" (e.g., [jejune](https://www.merriam-webster.com/dictionary/jejune) vs.
 jejunely).
Depending on the context, you may need to decide if a word is in one
@@ -387,7 +387,7 @@ class(pilot)

 ::::::::::::::::::::::::::::::::::::::::::::::::::

-Notice that in the two challenges, `mode()` and `class()` return the same results. This time.
+Notice that in the two challenges, `mode()` and `class()` return the same results. This time...

 Notice from the solution that even if a series of numbers is given as a value
 R will consider them to be in the "character" mode if they are enclosed as
From 3092eefea4ab4ddbd915a236e0ed55edcd29e6bd Mon Sep 17 00:00:00 2001
From: Naupaka Zimmerman
Date: Wed, 10 Apr 2024 09:36:12 -0700
Subject: [PATCH 4/5] Update 03-basics-factors-dataframes.Rmd

Adjust text to clarify meaning.
---
 episodes/03-basics-factors-dataframes.Rmd | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/episodes/03-basics-factors-dataframes.Rmd b/episodes/03-basics-factors-dataframes.Rmd
index 84f0ec56..3a54d1d0 100644
--- a/episodes/03-basics-factors-dataframes.Rmd
+++ b/episodes/03-basics-factors-dataframes.Rmd
@@ -262,10 +262,10 @@ Ok, thats a lot up unpack! Some things to notice.
  class(variants)
  ```

- This result makes sense because mode (which deals with how an object is stored)
- is treated as a **list** in R. A data frame is in some sense a "fancy" list.
- However, data fames do have some specific properties so they have their own
- class (**data.frame**) which is useful for functions (and programmers) to know.
+ This result makes sense because `mode()` (which deals with how an object is stored)
+ tells us that `variants` treated as a **list** in R. A data frame is in some sense a "fancy" list.
+ However, data fames do have some specific properties beyond that of a basic list, so they have their own
+ class (**data.frame**), which is important for functions (and programmers) to know.
  :::::::::::::::::::::::::

  ::::::::::::::::::::::::::::::::::::::::::::::::::
From 503ab554c613ab6f8cf3091f9ff96dc76671b81f Mon Sep 17 00:00:00 2001
From: Naupaka Zimmerman
Date: Wed, 10 Apr 2024 09:36:43 -0700
Subject: [PATCH 5/5] Update 03-basics-factors-dataframes.Rmd

Fix typo
---
 episodes/03-basics-factors-dataframes.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/episodes/03-basics-factors-dataframes.Rmd b/episodes/03-basics-factors-dataframes.Rmd
index 3a54d1d0..dff912a6 100644
--- a/episodes/03-basics-factors-dataframes.Rmd
+++ b/episodes/03-basics-factors-dataframes.Rmd
@@ -263,7 +263,7 @@ Ok, thats a lot up unpack! Some things to notice.
  ```

  This result makes sense because `mode()` (which deals with how an object is stored)
- tells us that `variants` treated as a **list** in R. A data frame is in some sense a "fancy" list.
+ tells us that `variants` is treated as a **list** in R. A data frame is in some sense a "fancy" list.
  However, data fames do have some specific properties beyond that of a basic list, so they have their own
  class (**data.frame**), which is important for functions (and programmers) to know.
  :::::::::::::::::::::::::
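
The mode/class distinction that the lesson text above describes can be sketched in a few lines of R. This is a minimal illustration, not part of the patches: `variants_demo` is a hypothetical two-row stand-in for the lesson's `variants` data frame (its columns and values are made up), assumed here only so the snippet runs on its own.

```r
# Objects reused from the lesson's challenge
chr_position <- "1001701"   # digits in quotes are stored as text
od_600_value <- 0.47

# Hypothetical stand-in for the lesson's `variants` data frame (values invented)
variants_demo <- data.frame(
  sample_id = c("sample_a", "sample_b"),
  pos       = c(1521L, 1612L)
)

# For a quoted value all three functions agree
mode(chr_position)     # "character"
class(chr_position)    # "character"
typeof(chr_position)   # "character"

# For a number, mode and class agree but typeof is more specific
mode(od_600_value)     # "numeric"
class(od_600_value)    # "numeric"
typeof(od_600_value)   # "double"

# For a data frame, mode reports the underlying storage and class the behaviour
mode(variants_demo)    # "list"
class(variants_demo)   # "data.frame"
typeof(variants_demo)  # "list"
```

The data frame is the case where the difference matters: functions such as `summary()` or `print()` dispatch on the class, while `mode()` and `typeof()` only report that the object is stored as a list.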