---
title: 'Proof of the Central Limit Theorem Using Characteristic Functions'
date: 2024-09-28T10:14:42-04:00
summary: "A fairly rigorous proof of the Central Limit Theorem (CLT) using characteristic functions."
math: katex
categories:
- Statistics
tags:
- Central Limit Theorem
- Characteristic Function
- Probability Distribution
- Normal Distribution
- Gaussian Distribution
weight: 100
draft: false
---

There is an abundance of proofs of the Central Limit Theorem (CLT) that use either moment-generating functions or characteristic functions. However, they are often difficult to follow because they gloss over the justification for key steps.

This post offers a detailed, step-by-step proof of the Central Limit Theorem using characteristic functions.

## Proof of the Central Limit Theorem Using Characteristic Functions

### **Statement of the Central Limit Theorem**

Let \( X_1, X_2, \ldots, X_n \) be independent and identically distributed (i.i.d.) random variables with mean \( \mu \) and variance \( \sigma^2 > 0 \). Define:

\[
S_n = \sum_{i=1}^n X_i
\]
\[
Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}
\]

Here, \( S_n \) is the sum of the \( n \) i.i.d. random variables, and \( Z_n \) is the normalized sum (i.e., it has mean \( 0 \) and variance \( 1 \)).

The Central Limit Theorem states that as \( n \to \infty \), \( Z_n \) converges in distribution to a standard normal random variable \( Z \sim N(0,1) \).
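
A familiar special case is the de Moivre–Laplace theorem: for i.i.d. Bernoulli random variables \( X_i \) with parameter \( p \in (0,1) \), we have \( \mu = p \) and \( \sigma^2 = p(1-p) \), so the CLT reads:

\[
\frac{S_n - np}{\sqrt{np(1-p)}} \xrightarrow{d} N(0,1)
\]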

### **Proof Using Characteristic Functions**

#### **Step 1: Define the Characteristic Function**

The characteristic function \( \phi_X(t) \) of a random variable \( X \) is defined as:
\[
\phi_X(t) = E[e^{itX}]
\]
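
As a concrete example, and the one we will meet again in Step 8, the characteristic function of a standard normal random variable \( Z \sim N(0,1) \) is:
\[
\phi_Z(t) = \int_{-\infty}^{\infty} e^{itz} \, \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz = e^{-\frac{t^2}{2}}
\]
This follows by completing the square in the exponent.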

#### **Step 2: Properties of Characteristic Functions**

We use the following key properties of characteristic functions:
1. For independent random variables \( X \) and \( Y \), \( \phi_{X+Y}(t) = \phi_X(t) \phi_Y(t) \).
2. For a constant \( a \), \( \phi_{aX}(t) = \phi_X(at) \).
3. For a constant \( b \), \( \phi_{X+b}(t) = e^{itb} \phi_X(t) \).
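
Each of these follows directly from the definition. Property 1, for instance, uses the fact that functions of independent random variables are themselves independent, so the expectation of the product factors:
\[
\phi_{X+Y}(t) = E\left[ e^{it(X+Y)} \right] = E\left[ e^{itX} e^{itY} \right] = E\left[ e^{itX} \right] E\left[ e^{itY} \right] = \phi_X(t) \, \phi_Y(t)
\]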

#### **Step 3: Characteristic Function of \( Z_n \)**

Using Properties 2 and 3, we express the characteristic function of \( Z_n \) in terms of that of \( S_n \):

\[
\phi_{Z_n}(t) = E\left[ e^{itZ_n} \right] = E\left[ e^{it \frac{S_n - n\mu}{\sigma \sqrt{n}}} \right] = e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \phi_{S_n} \left( \frac{t}{\sigma \sqrt{n}} \right)
\]

#### **Step 4: Characteristic Function of \( S_n \)**

Since \( S_n \) is the sum of \( n \) i.i.d. random variables, repeated application of Property 1 gives:
\[
\phi_{S_n}(t) = \left( \phi_X(t) \right)^n
\]
where \( \phi_X(t) \) is the common characteristic function of the individual random variables \( X_i \).
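
Combining Steps 3 and 4 gives the expression we will analyze for the rest of the proof:
\[
\phi_{Z_n}(t) = e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \left[ \phi_X\left( \frac{t}{\sigma \sqrt{n}} \right) \right]^n
\]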

#### **Step 5: Centering and Taylor Expansion of \( \phi_{X-\mu}(t) \)**

We first fold the prefactor into the \( n \)-th power by writing \( e^{-it \frac{n\mu}{\sigma \sqrt{n}}} = \left( e^{-it \frac{\mu}{\sigma \sqrt{n}}} \right)^n \). By Property 3,
\[
e^{-it \frac{\mu}{\sigma \sqrt{n}}} \, \phi_X\left( \frac{t}{\sigma \sqrt{n}} \right) = \phi_{X-\mu}\left( \frac{t}{\sigma \sqrt{n}} \right)
\]
so that
\[
\phi_{Z_n}(t) = \left[ \phi_{X-\mu}\left( \frac{t}{\sigma \sqrt{n}} \right) \right]^n
\]
This centering step matters: expanding \( \phi_X \) directly would leave a term of order \( \sqrt{n} \) inside the \( n \)-th power, and the limit formula used in Step 7 applies only when its argument converges.

We now expand \( \phi_{X-\mu}(t) \) around \( t = 0 \) using a Taylor expansion. Since \( E[X - \mu] = 0 \) and \( E[(X - \mu)^2] = \sigma^2 \):
\[
\phi_{X-\mu}(t) = 1 - \frac{\sigma^2 t^2}{2} + o(t^2)
\]
where \( o(t^2) \) represents terms that vanish faster than \( t^2 \) as \( t \to 0 \). We will see that we don't need to expand this term further because it vanishes as \( n \to \infty \).
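
The coefficients in this expansion come from differentiating under the expectation: writing \( Y = X - \mu \), we have \( \phi_Y'(0) = iE[Y] = 0 \) and \( \phi_Y''(0) = -E[Y^2] = -\sigma^2 \), so Taylor's theorem with the Peano remainder gives:
\[
\phi_Y(t) = \phi_Y(0) + \phi_Y'(0)\, t + \frac{\phi_Y''(0)}{2} t^2 + o(t^2) = 1 - \frac{\sigma^2 t^2}{2} + o(t^2)
\]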

#### **Step 6: Substitute into \( \phi_{Z_n}(t) \)**

Substituting \( \frac{t}{\sigma \sqrt{n}} \) into the Taylor expansion of \( \phi_{X-\mu} \), we get:
\[
\begin{aligned}
\phi_{Z_n}(t) &= \left( 1 - \frac{t^2}{2n} + o\left( \frac{1}{n} \right) \right)^n
\\
&= \left( 1 + \frac{1}{n} \left( -\frac{t^2}{2} + n \, o\left( \frac{1}{n} \right) \right) \right)^n
\end{aligned}
\]

#### **Step 7: Simplify and Take the Limit**

Recall the limit form of the exponential function, in the slightly strengthened version we need: if \( z_n \to z \), then

\[
\lim_{n \to \infty} \left( 1 + \frac{z_n}{n} \right)^n = e^z
\]

In our earlier expression:

\[
\phi_{Z_n}(t) = \left( 1 + \frac{1}{n} \textcolor{red}{\left( -\frac{t^2}{2} + n \, o\left( \frac{1}{n} \right) \right)} \right)^n
\]

The highlighted term plays the role of \( z_n \). Here, \( n \, o\left( \frac{1}{n} \right) = o(1) \), and since, by definition, \( o(1) \to 0 \) as \( n \to \infty \), we have:

\[
z_n = -\frac{t^2}{2} + o(1) \longrightarrow -\frac{t^2}{2}
\]

Applying the limit form of \( e^z \) then gives:

\[
\lim_{n \to \infty} \phi_{Z_n}(t) = e^{-\frac{t^2}{2}}
\]
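
The strengthened limit itself is easy to justify: once \( n \) is large enough that \( \left| \frac{z_n}{n} \right| < \frac{1}{2} \), the expansion \( \log(1 + w) = w + O(w^2) \) gives
\[
n \log\left( 1 + \frac{z_n}{n} \right) = z_n + O\left( \frac{z_n^2}{n} \right) \longrightarrow z
\]
since the convergent sequence \( z_n \) is bounded; exponentiating yields \( \left( 1 + \frac{z_n}{n} \right)^n \to e^z \).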

#### **Step 8: Recognize the Limit**

The function \( e^{-\frac{t^2}{2}} \) is precisely the characteristic function of a standard normal distribution \( N(0,1) \), as computed in the example following Step 1.

#### **Step 9: Apply Lévy's Continuity Theorem**

Lévy's Continuity Theorem states that if a sequence of characteristic functions converges pointwise to a function that is continuous at \( 0 \), then the corresponding random variables converge in distribution to a random variable with that characteristic function.

Since \( \phi_{Z_n}(t) \to e^{-\frac{t^2}{2}} \) as \( n \to \infty \), and \( e^{-\frac{t^2}{2}} \) is continuous at \( 0 \) and is the characteristic function of \( N(0,1) \), we conclude that \( Z_n \) converges in distribution to a standard normal random variable.

### **Conclusion**

We have shown that the characteristic function of \( Z_n \) converges to the characteristic function of a standard normal random variable. By applying Lévy's Continuity Theorem, we conclude that \( Z_n \) converges in distribution to \( N(0,1) \), thus proving the Central Limit Theorem.
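
As a numerical sanity check (not part of the proof), here is a minimal Python sketch, assuming NumPy is available, that compares the empirical characteristic function of \( Z_n \) for an exponential population against the limit \( e^{-t^2/2} \); the population and sample sizes are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: Exponential(1), so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0
n, trials = 1_000, 5_000

# Draw `trials` independent copies of Z_n = (S_n - n*mu) / (sigma * sqrt(n)).
samples = rng.exponential(scale=1.0, size=(trials, n))
z = (samples.sum(axis=1) - n * mu) / (sigma * np.sqrt(n))

# Empirical characteristic function: the sample mean of exp(i * t * Z_n),
# compared with the standard normal characteristic function exp(-t^2 / 2).
for t in (0.5, 1.0, 2.0):
    empirical = complex(np.exp(1j * t * z).mean())
    print(f"t={t}: empirical={empirical:.3f}, limit={np.exp(-t**2 / 2):.3f}")
```

The real parts should land within sampling error of the limit, and the imaginary parts near zero.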

---

### **Note on Characteristic Functions and Probability Distributions**

The bijection between characteristic functions and probability distributions can be made rigorous via the Fourier transform and its inverse. Specifically, the characteristic function \( \phi_X(t) \) of a random variable \( X \) is the Fourier transform of its probability density function (when one exists) or, more generally, the Fourier–Stieltjes transform of its probability distribution. This connection is central to the study of convergence in distribution and to results such as the Central Limit Theorem.

#### **Fourier Transform and Inversion**

Given a probability density function \( f_X(x) \) for a random variable \( X \), its characteristic function is defined as:
\[
\phi_X(t) = \int_{-\infty}^{\infty} e^{itx} f_X(x) \, dx
\]
This is exactly the Fourier transform of the probability density function \( f_X(x) \) (with the \( e^{+itx} \) sign convention customary in probability).

Conversely, if the characteristic function \( \phi_X(t) \) is known and absolutely integrable, the probability density function \( f_X(x) \) can be recovered using the inverse Fourier transform:
\[
f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \phi_X(t) \, dt
\]
Thus, there is a direct and invertible relationship between the characteristic function and the probability density function.
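
As a small illustration of the inversion formula, the following Python sketch, again assuming NumPy and with an arbitrary truncation of the integral, recovers the standard normal density from \( \phi(t) = e^{-t^2/2} \):

```python
import numpy as np

# Characteristic function of the standard normal distribution.
def phi(t):
    return np.exp(-t**2 / 2)

# Truncate the inversion integral to |t| <= 10; phi is negligible beyond that.
t = np.linspace(-10.0, 10.0, 4001)
dt = t[1] - t[0]

for x in (0.0, 1.0, 2.0):
    # f(x) = (1 / (2*pi)) * integral of exp(-i*t*x) * phi(t) dt, as a Riemann sum.
    integrand = np.exp(-1j * t * x) * phi(t)
    recovered = (integrand.sum() * dt / (2 * np.pi)).real
    exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    print(f"x={x}: recovered={recovered:.6f}, exact={exact:.6f}")
```

The recovered values should agree with the exact density \( \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \) to several decimal places.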

#### **Uniqueness and Continuity**

One of the key results in probability theory is that the characteristic function uniquely determines the distribution of a random variable. This is a consequence of the fact that the Fourier transform is injective, meaning different probability distributions have distinct characteristic functions.

Furthermore, the continuity of characteristic functions plays an important role in proving convergence in distribution. Lévy's Continuity Theorem relies on this fact: if a sequence of characteristic functions converges pointwise to a limiting function that is continuous at \( t = 0 \), then this limiting function is the characteristic function of some random variable, and the corresponding sequence of random variables converges in distribution to that random variable.
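
One concrete way to see how much information the characteristic function carries: whenever \( E[|X|^k] < \infty \), repeated differentiation under the expectation gives
\[
\phi_X^{(k)}(0) = i^k E[X^k]
\]
so the characteristic function encodes every finite moment of the distribution.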

#### **Application to the Central Limit Theorem**

In the proof of the Central Limit Theorem, we showed that the characteristic function of the normalized sum \( Z_n \), \( \phi_{Z_n}(t) \), converges to the characteristic function of the standard normal distribution:
\[
\lim_{n \to \infty} \phi_{Z_n}(t) = e^{-\frac{t^2}{2}}
\]
Because the correspondence between characteristic functions and probability distributions is one-to-one, the limiting characteristic function can only belong to \( N(0,1) \), and Lévy's Continuity Theorem upgrades this pointwise convergence of characteristic functions to convergence in distribution, completing the argument.

In this way, the theory of Fourier transforms and characteristic functions not only provides a powerful tool for analyzing convergence in distribution but also guarantees the unique correspondence between characteristic functions and probability distributions.