---
title: "Tidyverse - Dplyr"
author: "Joshua Hummell"
date: "4/8/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Tidyverse Vignette
```{r message=FALSE}
library(dplyr)
```
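#### Hands down, my favorite R package in the tidyverse is dplyr
dplyr makes data manipulation easy with a small set of verbs (`select()`, `filter()`, `group_by()`, `summarise()`, `mutate()`, and the join functions), which makes it highly useful for everyday work! The examples below use FiveThirtyEight's 2015 murder data.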
```{r}
murders <- read.csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/murder_2016/murder_2015_final.csv')
```
Select data columns with ease
```{r}
murders %>% select(state)
```
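`select()` is not limited to a single column. Here is a minimal sketch of a few other ways to pick columns; the `toy` data frame and its values are made up for illustration and are not taken from the murders data.

```{r}
library(dplyr)

# Hypothetical toy data, invented for illustration
toy <- data.frame(
  city = c("A", "B"),
  state = c("X", "Y"),
  murders_2014 = c(10, 20),
  murders_2015 = c(12, 18),
  stringsAsFactors = FALSE
)

# Pick several columns by name
by_name <- toy %>% select(city, state)

# Pick every column whose name starts with a shared prefix
by_prefix <- toy %>% select(starts_with("murders"))

# Drop a column with a minus sign and keep the rest
without_state <- toy %>% select(-state)
```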
Easily filter rows by a condition
```{r}
murders %>%
  filter(city == 'Baltimore')
```
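`filter()` can also take more than one condition. The sketch below uses a made-up `toy` data frame (its cities, states, and numbers are hypothetical) to show two common patterns.

```{r}
library(dplyr)

# Hypothetical toy data, invented for illustration
toy <- data.frame(
  city = c("A", "B", "C"),
  state = c("X", "X", "Y"),
  change = c(5, -2, 7),
  stringsAsFactors = FALSE
)

# Conditions separated by commas are combined with AND
rising_in_x <- toy %>% filter(state == "X", change > 0)

# %in% keeps rows whose value matches any element of a set
a_or_c <- toy %>% filter(city %in% c("A", "C"))
```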
Easily aggregate data
```{r}
state <- murders %>%
  select(state, change) %>%
  group_by(state) %>%
  summarise(state_totals = sum(change)) %>%
  arrange(desc(state_totals))
state
```
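One thing to watch with `sum()`: if any value in a group is missing, the total for that group comes back as `NA` unless you pass `na.rm = TRUE`. Whether the murders data has missing values is not checked here, so this is only a sketch on made-up `toy` data. It also shows that `summarise()` can compute several summaries at once.

```{r}
library(dplyr)

# Hypothetical toy data with one missing value, invented for illustration
toy <- data.frame(
  state = c("X", "X", "Y", "Y"),
  change = c(5, -2, 7, NA),
  stringsAsFactors = FALSE
)

totals <- toy %>%
  group_by(state) %>%
  summarise(
    cities = n(),                              # number of rows per state
    state_totals = sum(change, na.rm = TRUE)   # skip missing values
  ) %>%
  arrange(desc(state_totals))
```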
And even join data from another source
```{r}
states_pop <- read.csv('https://raw.githubusercontent.com/jhumms/DATA607/main/state_populations.csv')
colnames(states_pop) <- tolower(colnames(states_pop))
murders_state <- left_join(state, states_pop, by='state')
murders_state
```
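`left_join()` keeps every row of the left table, and rows with no match on the right get `NA` in the new columns. A quick way to see which rows failed to match is `anti_join()`. The two small data frames below are hypothetical stand-ins, not the real state tables.

```{r}
library(dplyr)

# Hypothetical toy tables, invented for illustration
totals <- data.frame(
  state = c("X", "Y", "Z"),
  state_totals = c(3, 7, 1),
  stringsAsFactors = FALSE
)
pops <- data.frame(
  state = c("X", "Y"),
  population = c(1000, 2000),
  stringsAsFactors = FALSE
)

# All three rows of `totals` survive; "Z" gets NA for population
joined <- left_join(totals, pops, by = "state")

# Rows of `totals` that have no partner in `pops`
unmatched <- anti_join(totals, pops, by = "state")
```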
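And, if that weren't enough, you can even compute new columns from existing ones very easily! Note that `state_totals` is the summed `change` column, so the result below is the change in murders as a percentage of population rather than an overall murder rate.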
```{r}
murders_state$population <- as.numeric(murders_state$population)
murders_state %>%
  mutate(murder_rate_by_pop = (state_totals / population) * 100) %>%
  arrange(desc(murder_rate_by_pop))
```
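A caveat on the `as.numeric()` step: if a population column is stored as text with thousands separators (for example `"1,000"`), `as.numeric()` returns `NA` with a warning. I have not verified how the population file formats its numbers, so treat this as a sketch on made-up `toy` data. It strips the commas and does the conversion inside `mutate()`, where later expressions can use columns created earlier in the same call.

```{r}
library(dplyr)

# Hypothetical toy data with comma-formatted text, invented for illustration
toy <- data.frame(
  state = c("X", "Y"),
  state_totals = c(3, 7),
  population = c("1,000", "2,000"),
  stringsAsFactors = FALSE
)

rates <- toy %>%
  mutate(
    population = as.numeric(gsub(",", "", population)),  # remove separators, then convert
    change_pct_of_pop = state_totals / population * 100  # uses the converted column
  ) %>%
  arrange(desc(change_pct_of_pop))
```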