tidyverse_create_covidpolls.Rmd

---
title: "Tidyverse Extend Assignment"
author: "Claire Meyer"
date: "3/28/2021"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Summary

This document takes aggregated COVID-19 polling data from FiveThirtyEight, filters it with dplyr, reshapes it with tidyr, and does some lightweight plotting with ggplot2.

Details on the data can be found [here](https://github.com/fivethirtyeight/covid-19-polls).

### Loading Data and Filtering with dplyr

To start, we'll load the tidyverse library, download the polling data into `polling_data`, and do some clean-up. I'll use dplyr's `filter()` to keep only polls of registered voters (`population == 'rv'`).

```{r load}
library(tidyverse)
polling_data <- read.csv("http://raw.githubusercontent.com/fivethirtyeight/covid-19-polls/master/covid_approval_polls.csv", header=TRUE) %>%
  filter(population == 'rv')
```
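To make the filtering step concrete, here is a minimal sketch on a small made-up tibble. The `toy_polls` data and its values are hypothetical and are not taken from the FiveThirtyEight file. Only the `population` column name and the `'rv'` code match the real data.

```{r sketch-filter}
library(dplyr)

# Hypothetical toy data: three polls with different sampled populations
toy_polls <- tibble(
  pollster   = c("A", "B", "C"),
  population = c("rv", "a", "rv"),
  approve    = c(45, 50, 40)
)

# filter() keeps only the rows where the condition is TRUE
toy_rv <- toy_polls %>%
  filter(population == "rv")

toy_rv
```

Rows whose `population` is anything other than `'rv'` are dropped, which is the same behavior applied to `polling_data` above.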

### Changing the Shape of the Data with tidyr

I want to use tidyr to make this data wider. Right now it has at least 4 rows per poll: one for each party (R, D, I) plus an aggregated 'all' category. After pivoting, each poll should occupy a single row, with separate `approve`, `disapprove`, and `sample_size` columns for each party.

```{r make-wider}
polling_data_wide <- polling_data %>%
  pivot_wider(
    names_from = 'party',
    values_from = c('approve', 'disapprove', 'sample_size')
  )
```
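The effect of `pivot_wider()` is easier to see on a tiny example. The sketch below uses a hypothetical single poll (`toy_long`, with invented numbers) to show how the column names are built when several `values_from` columns are given: each new column is named `<value>_<party>`, which is where `approve_R` comes from.

```{r sketch-pivot-wider}
library(tidyr)
library(tibble)

# Hypothetical toy data: one poll, four rows (one per party category)
toy_long <- tibble(
  pollster   = "A",
  subject    = "Trump",
  party      = c("all", "R", "D", "I"),
  approve    = c(45, 85, 10, 40),
  disapprove = c(50, 12, 88, 55)
)

# Spread the party rows into columns; pollster and subject identify the poll
toy_wide <- pivot_wider(
  toy_long,
  names_from  = party,
  values_from = c(approve, disapprove)
)

names(toy_wide)
```

The four input rows collapse into one row because `pollster` and `subject` are the only remaining identifying columns. In the real data, every column not listed in `names_from` or `values_from` plays that identifying role.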

### Plotting with ggplot

I want to make a quick plot of approval ratings of Trump and Biden among Republican respondents (party 'R'), using the `approve_R` column created by the pivot.

```{r plotting}
# Histogram of Republican approval, with one facet row per subject
ggplot(polling_data_wide, aes(x = approve_R)) +
  geom_histogram() +
  facet_grid(subject ~ .) +
  labs(title = "Approval Ratings for Biden and Trump")
```

### Summary of Polls

Count the number of records per pollster in the dataset where the subject is Trump.

```{r echo=TRUE, eval=TRUE}
# Re-read the full, unfiltered dataset (all populations)
poll_data_all <- read_csv("http://raw.githubusercontent.com/fivethirtyeight/covid-19-polls/master/covid_approval_polls.csv")

# Keep only the columns needed for the summary
selected_polls <- select(poll_data_all, pollster, sponsor, party, subject)

# Keep only the rows about Trump
trump_filtered_polls <- filter(selected_polls, subject == "Trump")

# Count rows per pollster, most frequent first
selected_poll_count <- trump_filtered_polls %>%
  count(pollster, name = "Trump_Count", sort = TRUE)
head(selected_poll_count)
```
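As a rough illustration of what `count()` does here, the sketch below runs the same filter-then-count steps on a hypothetical set of records (`toy_records`, with invented values). Note that it counts rows, so a poll that appears once per party category contributes several records.

```{r sketch-count}
library(dplyr)

# Hypothetical toy data: six records across three pollsters
toy_records <- tibble(
  pollster = c("A", "A", "B", "A", "B", "C"),
  subject  = c("Trump", "Trump", "Trump", "Biden", "Trump", "Biden")
)

# Keep the Trump rows, then count rows per pollster, largest count first
toy_counts <- toy_records %>%
  filter(subject == "Trump") %>%
  count(pollster, name = "Trump_Count", sort = TRUE)

toy_counts
```

Pollsters with no Trump records, like "C" in this toy data, do not appear in the result at all, because they are removed by the filter before counting.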