---
title: 'Statistics 692, exercises #2'
author: "Douglas Bates"
date: "2014-10-14"
output:
  pdf_document:
    keep_tex: true
    fig_caption: true
---
```{r initial,echo=FALSE,print=FALSE,cache=FALSE}
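library(knitr)
library(ggplot2)
# Read the classroom data from the web
classroom <- read.csv("http://www-personal.umich.edu/~bwest/classroom.csv")
# Cache the results of all later chunks (this chunk itself is not cached)
opts_chunk$set(cache = TRUE)
options(width = 76, show.signif.stars = FALSE, str = strOptions(strict.width = "cut"))
# Fix the random seed so results are reproducible
set.seed(123454321)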
```
Your answers should be submitted as an _R Markdown_ (`Rmd`) document providing both the code you use and a brief description of each step. The description should be written as if intended for the "client".
Submit only the `.Rmd` file. The file will be run through `knitr::knit` to verify that it produces the desired results. Use `pdf_document` as the output format. Allow figures to float to the top or the bottom of the page, and provide a caption for each figure.
Load the `classroom` data frame that you saved in the previous set of exercises and attach the `ggplot2` package.
1. Provide a histogram of the `mathkind` (mathematics score in kindergarten) variable. For this and all other plots in this assignment, provide informative axis labels where appropriate. An example of this plot is shown in Figure (#fig:hist1).
```{r hist1,echo=FALSE,fig.cap="Histogram of the kindergarten mathematics score",fig.height=3,fig.width=6.5,fig.pos="tb"}
# Base plot object, reused for the histogram and the density plot
p <- ggplot(classroom, aes(x = mathkind)) + xlab("Kindergarten Mathematics Score")
p + geom_histogram()
```
2. Provide an empirical density plot of `mathkind`, as shown in Figure (#fig:dens1).
```{r dens1,echo=FALSE,fig.cap="Empirical density plot of the kindergarten mathematics score",fig.height=2.5,fig.width=6.5,fig.pos="tb"}
p + geom_density()
```
3. Provide an empirical density plot of `mathkind` by `sex`. Do this in two ways: as a single panel with two lines, and as two separate panels.
4. Provide an empirical density plot of `mathkind` by `minority`.
5. Provide a scatterplot of `mathgain` versus the kindergarten score, `mathkind`. Include a scatterplot smoother curve with its band of pointwise confidence intervals.
    a. The negative correlation between `mathgain` and `mathkind` is not surprising. Explain.
    b. Does it make sense to include `mathkind` as a covariate in a model for `mathgain`? Explain.
6. Repeat this plot with a simple linear regression line instead of a scatterplot smoother.
    a. Include the estimated coefficients of the regression line in the caption for the plot.
7. Create a new variable called `mathone`, which is the sum of `mathkind` and `mathgain`. Repeat the plot with the scatterplot smoother for `mathone` versus `mathkind`. Because both variables are measured on the same scale, use a unit aspect ratio for the plot.
8. Create a dotplot of `mathgain` by classroom for students in school 11 only.
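
As a hint for item 7, the following is a minimal sketch of one way to build a derived variable and request a unit aspect ratio in `ggplot2`. It is not the expected solution. The data frame `toy` and its values are made up purely for illustration (they are not the `classroom` data), and the object name `p7` is likewise hypothetical. The sketch assumes that `coord_fixed()`, whose default ratio is 1, is an acceptable way to obtain the unit aspect ratio.

```r
library(ggplot2)

# Toy stand-in for the classroom data: these values are invented for
# illustration only and are NOT taken from classroom.csv.
toy <- data.frame(mathkind = seq(400, 600, by = 10))
toy$mathgain <- round(250 - 0.4 * toy$mathkind + 10 * sin(seq_along(toy$mathkind)))

# Derived variable: kindergarten score plus the gain.
toy$mathone <- toy$mathkind + toy$mathgain

# Scatterplot with a loess smoother (pointwise confidence band is drawn
# by default) and a unit aspect ratio.
p7 <- ggplot(toy, aes(x = mathkind, y = mathone)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x) +
  coord_fixed() +  # one unit on x spans the same length as one unit on y
  xlab("Kindergarten mathematics score") +
  ylab("Kindergarten score plus gain")
```

Printing `p7` draws the plot. With the real data, the same calls would be applied to `classroom` in place of `toy`.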