\documentclass[11pt]{article}
\newcommand{\HRule}{\rule{\linewidth}{0.5mm}}
\author{Daniel Chevalier}
\title{STAT 230 - Study Notes}
\begin{document}
\begin{titlepage}
\begin{center}
\textsc{\LARGE STAT230 - Study Notes}\\
{\Large An open source note project}\\
\HRule\\
{\large \bf Original Author:\\ Daniel Chevalier (AKA: TopGunCoder)}
\HRule
This is an open source set of study notes made to help with understanding STAT230 content.\\
{\large \bf To access the repository for this study note:}\\
https://github.com/TopGunCoder/STAT230-StudyAide\\
The more people contribute and pass this on, the more you and future STAT230 students will benefit.\\
Pass it forward! :)\\
\vfill
NOTE: These notes are taken directly from: \\
\emph{STAT 220/230 NOTES (2011-12 Edition)\\
originally by Chris Springer\\
edited by Jerry Lawless and Don McLeish}
\end{center}
\end{titlepage}
\newpage
%VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
\section*{Key Terms}
\subsection*{8.2: Joint Probability Function:}
$f(x_1,...,x_k)=\frac{n!}{x_1!...x_k!}p_1^{x_1}p_2^{x_2} ...p_k^{x_k}$, where $x_1+...+x_k=n$ and $p_1+...+p_k=1$ (the multinomial distribution).
\subsection*{8.3: The distribution of $X_n$}
Suppose that the chain is started by randomly choosing a state for $X_0$ with distribution $P[X_0=i]=q_i, i=1,2,...,N$. Then the distribution of $X_1$ is given by:\\
$P(X_1=j)=\sum_{i=1}^{N}P(X_1=j,X_0=i)$\\
$=\sum^N_{i=1}P(X_1=j|X_0=i)P(X_0=i)$\\
$=\sum_{i=1}^N P_{ij}q_i$
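{\bf Example} (illustrative two-state chain): suppose $N=2$, $P_{11}=0.9$, $P_{12}=0.1$, $P_{21}=0.2$, $P_{22}=0.8$, and $q_1=q_2=0.5$. Then\\
$P(X_1=1)=P_{11}q_1+P_{21}q_2=(0.9)(0.5)+(0.2)(0.5)=0.55$\\
and $P(X_1=2)=(0.1)(0.5)+(0.8)(0.5)=0.45$.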
\subsection*{7.2: Linearity Properties of Expectation}
\begin{enumerate}
\item For constants $a$ and $b$,\\
$E[ag(X)+b]=aE[g(X)]+b$
\item Similarly for constants $a$ and $b$ and two functions $g_1$ and $g_2$, it is also easy to show:\\
$E[ag_1(X)+bg_2(X)]=aE[g_1(X)]+bE[g_2(X)]$
\end{enumerate}
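{\bf Example:} if $E(X)=3$ then, taking $g(X)=X$, $E[2X+5]=2E(X)+5=11$; similarly, by the second property, $E[4X+3X^2]=4E(X)+3E(X^2)$.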
\subsection*{7.4: Properties of Mean and Variance}
If $a$ and $b$ are constants and $Y = aX+b$, then\\
$\mu_Y=a\mu_X + b$ and $\sigma^2_Y=a^2 \sigma^2_X$\\
(where $\mu_X$ and $\sigma^2_X$ are the mean and variance of $X$ and $\mu_Y$ and $\sigma^2_Y $ are the mean and variance of $Y$).
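{\bf Example:} if $\mu_X=2$, $\sigma^2_X=3$, and $Y=2X+1$, then $\mu_Y=2(2)+1=5$ and $\sigma^2_Y=2^2(3)=12$; note the additive constant $b$ shifts the mean but does not affect the variance.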
\subsection*{8.4: Property of Multivariate Expectation}
It is easily proved (make sure you can do this) that\\
$E[ag_1(X,Y)+bg_2(X,Y)]=aE[g_1(X,Y)]+bE[g_2(X,Y)]$\\
This can be extended beyond two functions $g_1$ and $g_2$, and beyond two variables $X$ and $Y$.
\subsection*{8.5: Results for Means}
\begin{enumerate}
\item $E(aX+bY)=aE(X)+bE(Y) = a\mu_X+b\mu_Y$, when $a$ and $b$ are constants. (This follows from the definition of expected value.) In particular, $E(X+Y) = \mu_X +\mu_Y$ and $E(X-Y)=\mu_X-\mu_Y$.
\item Let $a_i$ be constants (real numbers) and $E(X_i) = \mu_i$. Then $E(\sum a_iX_i)=\sum a_i\mu_i$. In particular, $E(\sum X_i)=\sum E(X_i)$.
\item Let $X_1,X_2,...,X_n$ be random variables which have mean $\mu$. (You can imagine these being some sample results from an experiment such as recording the number of occupants in cars travelling over a toll bridge.) The sample mean is $\overline{X}=\frac{\sum^n_{i=1}X_i}{n}$. Then $E(\overline{X}) = \mu$, as shown below.
\end{enumerate}
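To see the last result, apply result 2.\ with $a_i=\frac{1}{n}$:\\
$E(\overline{X})=E\left(\frac{1}{n}\sum^n_{i=1}X_i\right)=\frac{1}{n}\sum^n_{i=1}E(X_i)=\frac{1}{n}(n\mu)=\mu$.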
\subsection*{8.5: Results for Covariance}
\begin{enumerate}
\item $Cov(X,X) = E[(X-\mu_X)(X-\mu_X)]=E[(X-\mu_X)^2]=Var(X)$
\item $Cov(aX+bY,cU+dV)=acCov(X,U)+adCov(X,V)+bcCov(Y,U)+bdCov(Y,V)$ where $a,b,c,$ and $d$ are constants.
\end{enumerate}
\subsection*{8.5: Results for Variance}
\begin{enumerate}
\item {\bf Variance of linear combination:}\\
$Var(aX+bY)=a^2Var(X)+b^2Var(Y)+2abCov(X,Y)$
\item {\bf Variance of a sum of independent random variables}\\
Let $X$ and $Y$ be independent. Since $Cov(X,Y)=0$, result 1.\ gives\\
$Var(X+Y)=\sigma^2_X+\sigma^2_Y$ and $Var(X-Y)=\sigma^2_X+\sigma^2_Y$;\\
i.e., for independent variables, the variance of a sum and the variance of a difference are both the \underline{sum} of the variances.\\
\item {\bf Variance of a general linear combination:}\\
Let $a_i$ be constants and $Var(X_i)=\sigma^2_i$. Then \\
$Var(\sum a_i X_i)=\sum a_i^2 \sigma_i^2 +2\sum_{i<j} a_ia_j Cov(X_i,X_j) $.\\
This is a generalization of result 1. and can be proved using either of the methods used for 1.\\
\item {\bf Variance of a linear combination of independent random variables:}\\
Special cases of result 3. are:
\begin{itemize}
\item a) If $X_1,X_2,...,X_n$ are independent then $Cov(X_i,X_j)=0$, so that\\
$Var(\sum a_iX_i)=\sum a_i^2 \sigma_i^2$ (see the example below).
\item b) If $X_1,X_2,...,X_n$ are independent and all have the same variance $\sigma^2$, then\\
$Var(\overline{X})=\frac{\sigma^2}{n}$.
\end{itemize}
\end{enumerate}
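{\bf Example:} if $X$ and $Y$ are independent with $Var(X)=2$ and $Var(Y)=3$, then by result 1.\ with $Cov(X,Y)=0$,\\
$Var(2X-Y)=2^2(2)+(-1)^2(3)=11$.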
\subsection*{8.5: Indicator Variables}
The results for linear combinations of random variables provide a way of breaking up more complicated problems, involving mean and variance, into simpler pieces using indicator variables; an indicator variable is just a binary variable (0 or 1) that indicates whether or not some event occurs.
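{\bf Example:} if $X\sim Bi(n,p)$, write $X=\sum^n_{i=1}X_i$, where $X_i=1$ if trial $i$ is a success and $X_i=0$ otherwise. Then $E(X_i)=p$ and $Var(X_i)=E(X_i^2)-p^2=p-p^2=p(1-p)$, so the results for linear combinations give $E(X)=np$ and, by independence of the trials, $Var(X)=np(1-p)$.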
\subsection*{9.1: Continuous random variables}
{\bf Continuous random variables} have a range (set of possible values) that is an interval (or a collection of intervals) on the real number line. They have to be treated a little differently than discrete random variables because $P(X=x)$ is zero for each $x$.
\subsection*{9.1: Cumulative Distribution Function}
For discrete random variables we defined the c.d.f.\ $F(x) = P(X \leq x)$; the same definition is used for continuous random variables.
\subsection*{9.1: Properties of a probability density function}
\begin{enumerate}
\item $P(a \leq X \leq b) = F(b)-F(a)= \int_a^b \! f(x)\; \mathrm{d}x$. (This follows from the definition of $f(x)$.)
\item $f(x) \geq 0$. (Since $F(x)$ is non-decreasing, its derivative is non-negative.)
\item $\int_{-\infty}^{\infty} \!f(x) \; \mathrm{d}x=\int_{all\;x} \! f(x) \; \mathrm{d}x = 1$. (This is because $P(-\infty \leq X\leq \infty)=1$.)
\item $F(x) = \int_{-\infty}^{x} \! f(u) \; \mathrm{d}u$. (This is just property 1 with $a=-\infty$ and $b=x$.)
\end{enumerate}
\subsection*{9.1: Defined Variables or Change of Variable}
When we know the p.d.f or c.d.f. for a continuous random variable $X$ we sometimes want to find the p.d.f. or c.d.f for some other random variable $Y$ which is a function of $X$. The procedure for doing this is summarized below. It is based on the fact that the c.d.f $F_Y(y)$ for $Y$ equals $P(Y\leq y)$, and this can be rewritten in terms of $X$ since $Y$ is a function of $X$. Thus:\\
\begin{enumerate}
\item Write the c.d.f. of $Y$ as a function of $X$.
\item Use $F_X(x)$ to find $F_Y(y)$. Then if you want the p.d.f. $f_Y(y)$, you can differentiate the expression for $F_Y(y)$.
\item Find the range of values of $y$.
\end{enumerate}
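{\bf Example:} let $X$ have p.d.f.\ $f_X(x)=1$ for $0\leq x \leq 1$ (uniform) and let $Y=X^2$. For $0\leq y \leq 1$,\\
$F_Y(y)=P(X^2\leq y)=P(X\leq \sqrt{y})=F_X(\sqrt{y})=\sqrt{y}$,\\
so differentiating gives $f_Y(y)=\frac{1}{2\sqrt{y}}$ for $0<y\leq 1$.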
\subsection*{9.2: The probability density function and the cumulative distribution function}
Since all points are equally likely (more precisely, intervals contained in $[a,b]$ of a given length, say 0.01, all have the same probability), the probability density function must be a constant, $f(x)=k$ for $a\leq x \leq b$, for some constant $k$. To make $\int_a^b \! f(x) \; \mathrm{d}x = 1$, we require $k = \frac{1}{b-a}$.\\
Therefore $f(x)= \frac{1}{b-a}$ for $a \leq x \leq b$\\
$F(x) = \left\{
\begin{array}{lr}
0 & : for\; x< a\\
\int_a^x \! \frac{1}{b-a} \; \mathrm{d}u = \frac{x-a}{b-a} & : for\; a \leq x \leq b \\
1 & : for\; x > b
\end{array}
\right.$
\subsection*{9.2: Mean and Variance}
$\mu = \frac{b+a}{2}$\\
$E(X^2)=\frac{b^2+ab+a^2}{3}$\\
$\sigma^2 = \frac{(b-a)^2}{12}$
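{\bf Example:} for $X$ uniform on $[0,2]$: $\mu=\frac{2+0}{2}=1$, $E(X^2)=\frac{4+0+0}{3}=\frac{4}{3}$, and $\sigma^2=\frac{(2-0)^2}{12}=\frac{1}{3}$; as a check, $E(X^2)-\mu^2=\frac{4}{3}-1=\frac{1}{3}$.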
\subsection*{9.3: Exponential Distribution}
The continuous random variable $X$ is said to have an {\bf exponential distribution} if its p.d.f. is of the form\\
$f(x) = \lambda e^{-\lambda x} \; \;\;\;\;\; x> 0$
\subsection*{9.3: p.d.f and c.d.f of Exponential Distribution}
$F(x)=P(X\leq x)=1-e^{-\lambda x}$ for $x>0$ (obtained as one minus the Poisson probability of zero events, $1-\frac{\mu^0 e^{-\mu}}{0!} = 1-e^{-\mu}$, with $\mu=\lambda x$).\\
$f(x)=\frac{\mathrm{d}}{\mathrm{d}x} F(x) = \lambda e^{-\lambda x}$; for $x>0$
\subsection*{9.3: The Memoryless Property of the Exponential Distribution}
$P(X>a+b|X>b)=P(X>a)$
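This follows directly from the c.d.f., since $P(X>x)=e^{-\lambda x}$:\\
$P(X>a+b|X>b)=\frac{P(X>a+b)}{P(X>b)}=\frac{e^{-\lambda(a+b)}}{e^{-\lambda b}}=e^{-\lambda a}=P(X>a)$.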
\subsection*{9.5: Normal Distribution p.d.f}
$f(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}\; \; \; \; \; \; \; $ $-\infty < x < \infty$\\
$E(X) = \mu$\\
$Var(X) = \sigma^2$\\
and we write\\
$X \sim N(\mu, \sigma^2)$
\subsection*{9.5: The Cumulative Distribution Function of the Normal Distribution}
The c.d.f. of the normal distribution $N(\mu,\sigma^2)$ is\\
$F(x)=\int^x_{-\infty} \! \frac{1}{\sqrt{2 \pi }\sigma}e^{-\frac{1}{2}(\frac{y-\mu}{\sigma})^2} \; \mathrm{d}y$
\subsection*{9.5: Mean, Variance, and Moment Generating Function of a Normal Distribution}
$E(X)=\mu$\\
$Var(X)=\sigma^2$\\
$M_X(t) = E(e^{Xt}) = e^{\mu t+\sigma^2t^2/2}$
\subsection*{9.5: Gaussian Distribution}
The normal distribution is also known as the Gaussian distribution. The notation $X\sim G(\mu,\sigma)$ means that $X$ has a Gaussian (normal) distribution with mean $\mu$ and standard deviation $\sigma$. So, for example, if $X\sim N(1,4)$ then we could also write $X\sim G(1,2)$.
\subsection*{9.5 Linear Combinations of Independent Normal Random Variables}
\begin{enumerate}
\item Let $X\sim N(\mu,\sigma^2)$ and $Y = aX+b$, where $a$ and $b$ are constant real numbers. Then $Y\sim N(a\mu + b,a^2\sigma^2)$
\item Let $X\sim N(\mu_1,\sigma^2_1)$ and $Y\sim N(\mu_2,\sigma^2_2)$ be independent, and let $a$ and $b$ be constants. \\
Then $aX+bY \sim N(a\mu_1+b\mu_2,a^2\sigma_1^2+b^2\sigma_2^2)$.\\
In general if $X_i \sim N(\mu_i,\sigma^2_i)$ are independent and $a_i$ are constants,\\
then $\sum a_iX_i \sim
N(\sum a_i\mu_i,\sum a_i^2\sigma_i^2)$.
\item Let $X_1,...,X_n$ be independent $N(\mu,\sigma^2)$ random variables.\\
Then $\sum X_i\sim N(n \mu,n\sigma^2)$ and $\overline{X}\sim N(\mu,\frac{\sigma^2}{n})$.
\end{enumerate}
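{\bf Example:} if $X\sim N(1,4)$ and $Y\sim N(2,9)$ are independent, then by result 2.,\\
$X+Y\sim N(3,13)$ and $X-Y\sim N(-1,13)$.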
%VVVVVVVVVVVVVVV THEOREMS VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
\section*{Theorems}
\subsection*{7.2: Theorem 16}
Suppose the random variable $X$ has probability function $f(x)$. Then the {\bf expected value} of some function $g(X)$ of $X$ is given by\\
$E[g(X)]=\sum_{all\;x}g(x)f(x)$
\subsection*{7.5: Theorem 20}
Let the random variable $X$ have m.g.f.\ $M(t)$. Then\\
$ E(X^r)=M^{(r)}(0)$ $\;\;\;r=1,2,...$\\
where $M^{(r)}(0)$ stands for $d^rM(t)/dt^r$ evaluated at $t=0$.\\
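{\bf Example:} for a Poisson random variable $X$ with mean $\mu$, $M(t)=e^{\mu(e^t-1)}$, so $M'(t)=\mu e^t M(t)$ and $E(X)=M'(0)=\mu$.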
\subsection*{8.4: Theorem 27}
If $X$ and $Y$ are independent, then\\
$Cov(X,Y)=0$\\
\subsection*{8.4: Theorem 28}
Suppose random variables $X$ and $Y$ are independent. Then, if $g_1(X)$ and $g_2(Y)$ are any two functions,\\
$E[g_1(X)g_2(Y)] = E[g_1(X)]E[g_2(Y)]$
\subsection*{8.6: Theorem 31}
The {\bf moment generating function of the sum of independent random variables} is the product of the individual moment generating functions.
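{\bf Example:} if $X\sim Poisson(\mu_1)$ and $Y\sim Poisson(\mu_2)$ are independent, then\\
$M_{X+Y}(t)=M_X(t)M_Y(t)=e^{\mu_1(e^t-1)}e^{\mu_2(e^t-1)}=e^{(\mu_1+\mu_2)(e^t-1)}$,\\
which is the m.g.f.\ of a $Poisson(\mu_1+\mu_2)$ random variable, so $X+Y\sim Poisson(\mu_1+\mu_2)$.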
\subsection*{9.4: Theorem 35}
If $F$ is an arbitrary c.d.f. and $U$ is uniform on $[0,1]$ then the random variable defined by $X=F^{-1}(U)$ has c.d.f. $F(x)$.
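{\bf Example:} for the exponential distribution, $F(x)=1-e^{-\lambda x}$, so $F^{-1}(u)=-\frac{1}{\lambda}\ln(1-u)$. Thus if $U$ is uniform on $[0,1]$, then $X=-\frac{1}{\lambda}\ln(1-U)$ has an exponential distribution; this is a standard way to simulate exponential random variables.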
\subsection*{9.5: Theorem 36}
Let $X \sim N(\mu,\sigma^2)$ and define $Z=\frac{(X-\mu)}{\sigma}$. Then $Z\sim N(0,1)$ and\\
$F_X(x)=P(X\leq x) = F_Z(\frac{x-\mu}{\sigma})$.
\subsection*{9.6: Theorem 37}
If $X_1,X_2,...,X_n$ are independent random variables all having the same distribution, with mean $\mu$ and variance $\sigma^2$, then as $n \rightarrow \infty$, the c.d.f. of the random variable \\
$\frac{\sum X_i-n\mu}{\sigma\sqrt{n}}$\\
approaches the $N(0,1)$ c.d.f. Similarly, the c.d.f. of\\
$\frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}}$\\
approaches the standard normal c.d.f.
\subsection*{9.6: Theorem 38}
Let $X$ have a binomial distribution, $Bi(n,p)$. Then for $n$ large, the random variable\\
$W=\frac{X-np}{\sqrt{np(1-p)}}$\\
is approximately $N(0,1)$.
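{\bf Example:} if $X\sim Bi(100,0.5)$, then $np=50$ and $\sqrt{np(1-p)}=5$, so\\
$P(X\leq 60)\approx P\left(Z\leq \frac{60-50}{5}\right)=P(Z\leq 2)\approx 0.977$, where $Z\sim N(0,1)$.\\
(A continuity correction, using $60.5$ in place of $60$, improves the approximation.)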
%VVVVVVVVVVVVVVVVVVVVV DEFINITIONS VVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
\section*{Definitions}
\subsection*{7.1: Definition 13}
The {\bf median} of a sample is a value such that half the results are below it and half above it, when the results are arranged in numerical order.
\subsection*{7.1: Definition 14}
The {\bf mode} of the sample is the value which occurs most often. There is no guarantee there will be only a single mode.
\subsection*{7.2: Definition 15}
The {\bf expected value} (also called the mean or the expectation) of a discrete random variable $X$ with probability function $f(x)$ is \\
$E(X)=\sum_{all\;x}xf(x)$
\subsection*{7.4: Definition 17}
The {\bf variance} of a r.v.\ $X$ is $E[(X-\mu)^2]$, and is denoted by $\sigma^2$ or by $Var(X)$.
\subsection*{7.4: Definition 18}
The {\bf standard deviation} of a random variable $X$ is $\sigma = \sqrt{E[(X-\mu)^2]}$. Two useful computational results:
\begin{enumerate}
\item $\sigma^2=E(X^2)-\mu^2$
\item $\sigma^2=E[X(X-1)]+\mu-\mu^2$
\end{enumerate}
\subsection*{7.5: Definition 19}
Consider a discrete random variable $X$ with probability function $f(x)$. The {\bf moment generating function (m.g.f)} of $X$ is defined as\\
$ M(t)=E(e^{tX})=\sum_xe^{tx}f(x)$\\
We will assume that the moment generating function is defined and finite for values of $t$ in an interval around 0 (i.e.\ for some $a>0$, $\sum_xe^{tx}f(x)<\infty$ for all $t \in [-a,a]$).
\subsection*{8.1: Definition 21}
$X$ and $Y$ are {\bf independent} random variables iff $f(x,y) = f_1(x)f_2(y)$ for all values $(x,y)$.
\subsection*{8.1: Definition 22}
In general, $X_1,...,X_n$ are independent random variables iff\\
$f(x_1,...,x_n)=f_1(x_1)f_2(x_2)...f_n(x_n)$ for all $x_1,...,x_n$
\subsection*{8.1: Definition 23}
The conditional probability function of $X$ given $Y=y$ is $f(x|y)=\frac{f(x,y)}{f_2(y)}$.\\
Similarly, $f(y|x)=\frac{f(x,y)}{f_1(x)}$ (provided, of course, the denominator is not zero).
\subsection*{8.3: Definition 24}
A stationary distribution of a Markov chain is a column vector ($\pi$, say) of probabilities of the individual states such that $\pi^TP=\pi^T$.
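{\bf Example:} for the two-state chain with transition matrix $P=\left(\begin{array}{cc} 0.9 & 0.1\\ 0.2 & 0.8\end{array}\right)$, the equations $\pi^TP=\pi^T$ and $\pi_1+\pi_2=1$ give $0.9\pi_1+0.2\pi_2=\pi_1$, so $\pi_2=\frac{1}{2}\pi_1$ and hence $\pi_1=\frac{2}{3}$, $\pi_2=\frac{1}{3}$.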
\subsection*{8.4: Definition 25}
$E[g(X,Y)]= \sum_{all(x,y)} g(x,y)f(x,y)$\\
and\\
$E[g(X_1,...,X_n)]= \sum_{all(x_1,...,x_n)} g(x_1,...,x_n)f(x_1,...,x_n)$\\
\subsection*{8.4: Definition 26 - Covariance }
The {\bf covariance} of $X$ and $Y$, denoted $Cov(X,Y)$ or $\sigma_{XY}$, is\\
$Cov(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]$\\
For calculation purposes this definition is usually harder to use than the formula which follows:\\
$Cov(X,Y)=E(XY)-E(X)E(Y)$\\
(Note that this follows from using {\bf ``8.4: Property of Multivariate Expectation''}.)
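{\bf Example:} if $E(X)=1$, $E(Y)=2$, and $E(XY)=3$, then $Cov(X,Y)=E(XY)-E(X)E(Y)=3-(1)(2)=1$.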
\subsection*{8.4: Definition 29 - Correlation Coefficient}
The {\bf correlation coefficient} of $X$ and $Y$ is:\\
$\rho = \frac{Cov(X,Y)}{\sigma_X\sigma_Y}$
\subsection*{8.6: Definition 30}
The {\bf joint moment generating function} of $(X,Y)$ is \\
$M(s,t)=E\{e^{sX+tY} \}$\\
Recall that if $X$ and $Y$ happen to be independent and $g_1(X)$ and $g_2(Y)$ are any two functions, then\\
$E[g_1(X)g_2(Y)]=E[g_1(X)]E[g_2(Y)]$.\\
and so with $g_1(X)=e^{sX} $ and $g_2(Y) = e^{tY}$ we obtain, for independent random variables $X,Y$\\
$M(s,t) = M_X(s)M_Y(t)$
\subsection*{9.1: Definition 32 - Probability Density Function}
The {\bf probability density function} (p.d.f.) $f(x)$ for a continuous random variable $X$ is the derivative\\
$f(x)= \frac {dF(x)}{dx}$\\
where $F(x)$ is the c.d.f for $X$.
\subsection*{9.1: Definition 33 - Extension of Expectation, Mean, and Variance to Continuous Distributions}
When $X$ is continuous, we still define\\
$E(g(X))=\int_{all\; x} \!g(x)f(x)\; \mathrm{d}x$.\\
With this definition, all of the earlier properties of expected value and variance still hold; for example with $\mu=E(X)$,\\
$\sigma^2 = Var(X)=E[(X-\mu)^2]=E(X^2)-\mu^2$.
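{\bf Example:} for $X$ uniform on $[a,b]$,\\
$E(X)=\int_a^b \! x \cdot \frac{1}{b-a}\; \mathrm{d}x = \frac{b^2-a^2}{2(b-a)}=\frac{a+b}{2}$,\\
which agrees with the mean given in Section 9.2.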
\subsection*{9.3: Definition 34 - The Gamma Function}
$\Gamma (a) = \int_0^\infty \!x^{a-1}e^{-x} \mathrm{d}x$ is called the gamma function of $a$, where $a>0$.\\
{\bf Properties of the gamma function:}\\
\begin{enumerate}
\item $\Gamma(a)= (a-1)\Gamma (a-1)$ for $a>1$\\
\item $\Gamma(a)=(a-1)!$ if $a$ is a positive integer\\
\item $\Gamma(\frac{1}{2})=\sqrt{\pi}$
\end{enumerate}
\end{document}