-
Notifications
You must be signed in to change notification settings - Fork 1
/
Rcode_lesson22.Rmd
147 lines (112 loc) · 4.46 KB
/
Rcode_lesson22.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
---
title: "Lesson 22 - Repeated Measures ANOVA"
author: "Melinda K. Higgins, PhD."
date: "November 12, 2017"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(message = TRUE)
knitr::opts_chunk$set(warning = TRUE)
```
## Load data and create subset
Using `tidyverse` and `haven` load the dataset and create a new subset for this lesson
```{r}
library(tidyverse)
library(haven)
helpdat <- haven::read_spss("helpmkh.sav")
h1 <- helpdat %>%
select(id, pcs, pcs1, pcs2, pcs3, pcs4)
```
## Running RM-ANOVA in R
There are a few helpful websites with information and examples on running repeated meausures analysis of variance (RM-ANOVA) in R. See these 2 websites:
* [https://gribblelab.wordpress.com/2009/03/09/repeated-measures-anova-using-r/](https://gribblelab.wordpress.com/2009/03/09/repeated-measures-anova-using-r/)
* [https://biostats.w.uib.no/test-for-sphericity-mauchly-test/](https://biostats.w.uib.no/test-for-sphericity-mauchly-test/)
The code I show below is based on these 2 examples - these use the `car` package.
```{r}
# extract only the 5 pcs columns
# for times 1-5
# and convert it to matrix format which is needed for lm()
h2 <- h1[,2:6]
h2 <- as.matrix(h2)
# use lm to get a model of 5 pcs columns
m1 <- lm(h2 ~ 1)
m1
# create a time factor for the design
tfactor <- factor(c("t0","t1","t2","t3","t4"))
library(car)
m1.aov <- car::Anova(m1,
idata = data.frame(tfactor),
idesign = ~tfactor,
type="III")
#summary(m1.aov, multivariate=FALSE)
# print complete results of the RM-ANOVA
summary(m1.aov, multivariate=TRUE)
```
## Plots of Means Over Time with Confidence Intervals
This next set of code comes from an example at the Cookbook for R website:
* see [http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/](http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/)
The code below creates a new function called `summarySE` for plotting the means +/- the SE (standard error)
```{r}
# restructure into long format
h1long <- h1 %>%
gather(key=item,
value=value,
-c(id))
names(h1long) <- c("id","pcsitem","pcsvalue")
# add a time variable to long format
h1long <- h1long %>%
mutate(time=c(rep(0,453),
rep(1,453),
rep(2,453),
rep(3,453),
rep(4,453)))
# from the cookbook for R
## Gives count, mean, standard deviation, standard error of the mean, and confidence interval (default 95%).
## data: a data frame.
## measurevar: the name of a column that contains the variable to be summariezed
## groupvars: a vector containing names of columns that contain grouping variables
## na.rm: a boolean that indicates whether to ignore NA's
## conf.interval: the percent range of the confidence interval (default is 95%)
summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
conf.interval=.95, .drop=TRUE) {
library(plyr)
# New version of length which can handle NA's: if na.rm==T, don't count them
length2 <- function (x, na.rm=FALSE) {
if (na.rm) sum(!is.na(x))
else length(x)
}
# This does the summary. For each group's data frame, return a vector with
# N, mean, and sd
datac <- ddply(data, groupvars, .drop=.drop,
.fun = function(xx, col) {
c(N = length2(xx[[col]], na.rm=na.rm),
mean = mean (xx[[col]], na.rm=na.rm),
sd = sd (xx[[col]], na.rm=na.rm)
)
},
measurevar
)
# Rename the "mean" column
datac <- rename(datac, c("mean" = measurevar))
datac$se <- datac$sd / sqrt(datac$N) # Calculate standard error of the mean
# Confidence interval multiplier for standard error
# Calculate t-statistic for confidence interval:
# e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
ciMult <- qt(conf.interval/2 + .5, datac$N-1)
datac$ci <- datac$se * ciMult
return(datac)
}
h1long_nomiss <- na.omit(h1long)
table(h1long_nomiss$time)
h1se <- summarySE(h1long_nomiss,
measurevar="pcsvalue",
groupvars=c("time"))
ggplot(h1se, aes(x=time, y=pcsvalue)) +
geom_errorbar(aes(ymin=pcsvalue-se, ymax=pcsvalue+se), width=.1) +
geom_line() +
geom_point() +
xlab("Time Points") +
ylab("Physical Component Score (SF-36 PCS)") +
ggtitle("PCS Means and CI's Over Time")
```