-
Notifications
You must be signed in to change notification settings - Fork 0
/
Homework 6(1).Rmd
104 lines (66 loc) · 5.27 KB
/
Homework 6(1).Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
---
title: "Homework 6"
output:
word_document: default
html_document: default
---
#Biostatistics Assignment for Biostatistics , Week 6
Download and open the “Blood Over Time” data from the class webpage. Answer the following questions referring to this dataset.
Same set-up as last week:
A scientist was worried that the quality of the blood they were using for ongoing experiments was degrading with longer storage times. They obtained blood from 40 HIV + volunteers and divided each blood sample up into four parts. They then measured the percent of CD4 and CD8 cells that were expressing CD127 from each sample at 4 hours, 24 hours, 48 hours and 72 hours.
For the following, we will only use 4 and 72 hour CD8 data.
1. Last time, we tested for a difference between 4 and 72 hour CD8 data by creating a new variable that summarized the difference for each person. You can do the same test using the “paired=T” choice, without creating this variable. Compare the 4 hour and 72 hour CD8 using a paired design.
```{r}
library(readr)
blood_CD8 <- read_csv("~/Papers/Biostatistics JHU 2021/blood_CD8.csv")
t.test(blood_CD8$`4`, blood_CD8$`72`, paired = T)
```
For the following, we will only use 4 hour CD4 data, but will also need to know IL-2 status.
2. Some of the individuals in the sample had taken IL-2 in the recent past when their blood was drawn; others had not. Do a t-test to answer the question as to whether there is a difference in the CD4% between people who have and have not taken IL-2, based on the 4 hour data. Write a one-sentence conclusion (including the p-value) of the results of the test, the way you might include it in a science paper. Do the test:
a. Assuming equal variances.
```{r}
library(readr)
Blood <- read_csv("~/Papers/Biostatistics JHU 2021/Blood over time(1).csv")
t.test(Blood$CD4.xCD127[Blood$Hours.post.draw==4]~Blood$IL2[Blood$Hours.post.draw==4], var.equal=T)
```
b. Assuming unequal variances.
```{r}
t.test(Blood$CD4.xCD127[Blood$Hours.post.draw==4]~Blood$IL2[Blood$Hours.post.draw==4])
```
*c. (FYI only, not due yet) The researcher comes back to you a few weeks later and wants to design two new trials to address the same questions, with higher statistical power. How many subjects to answer the following question: Are there differences between people with/without a history of IL2? Assume that CD4% in both groups will be normally distributed with a standard deviation of 8%. How many people are needed PER GROUP in order to have 80% power to detect a difference of 5% or greater? (on your own)*
```{r}
```
Chapter 11, Problem 5
A crossover study was conducted to investigate whether oat bran cereal helps to lower cholesterol levels in hypercholesterolemic males. Fourteen such individuals were randomly placed on a diet that included either oat bran or corn flakes; after two weeks, their low-density lipoprotein (LDL) cholesterol levels were recorded. Each man was then switched to the alternative diet. After a second two-week period, the LDL cholesterol level of each individual was again recorded. The data from this study are shown below:
| subject | corn flakes LDL | oat bran LDL | Difference |
|--------|----------------|---------------|------------|
| 1 | 4.61 | 3.84 | |
| 2 | 6.42 | 5.57 | |
| 3 | 5.40 | 5.85 | |
| 4 | 4.54 | 4.80 | |
| 5 | 3.98 | 3.68 | |
| 6 | 3.82 | 2.96 | |
| 7 | 5.01 | 4.41 | |
| 8 | 4.34 | 3.72 | |
| 9 | 3.80 | 3.49 | |
| 10 | 4.56 | 3.84 | |
| 11 | 5.35 | 5.26 | |
| 12 | 3.89 | 3.73 | |
| 13 | 2.25 | 1.84 | |
| 14 | 4.24 | 4.14 | |
a. Are the two samples paired or independent?
*The data is paired because it was conducted on the same 14 people for each cereal.*
b. What are the appropriate null and alternative hypotheses for a two-sided test
*H0:μd=0 against H1:μd≠0 (two-tailed)*
c. Conduct the test at the 0.05 level of significance. What is the p-value?
*The p-value is .0053, so we must reject the null hypothesis.*
d. What do you conclude?
*Overall, between the 2 tests, the Oat Bran cereal seemed to do better than corn flakes. LDL was lower in the oat bran cereal than it was in the corn flakes cereal.*
##Chapter 11, Problem 8
A study was conducted to determine whether an expectant mother's cigarette smoking has any effect on the bone mineral content of her otherwise healthy child. A sample of 77 newborns whose mothers smoked during pregnancy has mean bone mineral content $\bar x_1 = 0.098 g/cm$ and standard deviation $s_1 = 0.026 g/cm$; a sample of 161 infants whose mothers did not smoke has mean $\bar x_2 = 0.095 g/cm$ and standard deviation $s_2 = 0.025 g/cm$. Assume that the underlying population variances are equal.
a. Are the two samples paired or independent?
*The two samples are independent. Independent because the mothers who smoked were not specifically matched to mothers who did not smoke during pregnancy.*
b. State the null and alternative hypotheses of the two-sided test.
*The null and Alternative hypothesis of the two sided test are H0:μ1=μ2 against Ha:μ1≠μ2.*
c. Conduct the test at the 0.05 level of significance. What do you conclude?
*The p-value is 0.393 which is greater than the significance level of α=0.05, we fail to reject the null hypothesis.*