Add "Proof of the Central Limit Theorem Using Characteristic Functions"
victorliu5296 committed Sep 28, 2024
1 parent 9d1da23 commit 9bee231
---
title: 'Proof of Central Limit Theorem Using Characteristic Functions'
date: 2024-09-28T10:14:42-04:00
summary: "A fairly rigorous proof of the Central Limit Theorem (CLT) using characteristic functions."
math: katex
categories:
- Statistics
tags:
- Central Limit Theorem
- Characteristic Function
- Probability Distribution
- Normal Distribution
- Gaussian Distribution
weight: 100
draft: false
---

There is an abundance of proofs of the Central Limit Theorem (CLT) using either moment-generating functions or characteristic functions. However, they are often quite difficult to follow because they rely on hand-waving and skip a lot of justification for the steps.

This post offers a detailed step-by-step proof of the Central Limit Theorem using characteristic functions.

## Proof of the Central Limit Theorem Using Characteristic Functions

### **Statement of the Central Limit Theorem**

Let \( X_1, X_2, \ldots \) be a sequence of independent and identically distributed (i.i.d.) random variables with finite mean \( \mu \) and finite variance \( \sigma^2 > 0 \). For each \( n \), define:

\[
S_n = \sum_{i=1}^n X_i
\]
\[
Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}
\]

Here, \( S_n \) is the sum of \( n \) i.i.d. random variables, and \( Z_n \) is the normalized sum of \( n \) i.i.d. random variables (i.e., it has mean \( 0 \) and variance \( 1 \)).

The Central Limit Theorem states that, as \( n \to \infty \), \( Z_n \) converges in distribution to a standard normal random variable \( Z \sim N(0,1) \).
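Before the proof, the statement can be illustrated numerically. The following is a minimal simulation sketch, not part of the proof. The choice of Exponential(1) summands (for which \( \mu = 1 \) and \( \sigma = 1 \)), the sample sizes, and the helper name `standardized_sums` are all assumptions made for illustration.

```python
import numpy as np

def standardized_sums(n, trials, rng):
    """Draw `trials` independent copies of Z_n for Exponential(1) summands."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    samples = rng.exponential(scale=1.0, size=(trials, n))
    s_n = samples.sum(axis=1)                    # S_n for each trial
    return (s_n - n * mu) / (sigma * np.sqrt(n))  # Z_n for each trial

rng = np.random.default_rng(0)
z = standardized_sums(n=200, trials=20_000, rng=rng)

print("sample mean of Z_n     :", z.mean())
print("sample variance of Z_n :", z.var())
print("estimate of P(Z_n <= 1):", np.mean(z <= 1.0))
```

If the theorem holds, the three printed values should be close to \( 0 \), \( 1 \), and \( \Phi(1) \approx 0.8413 \) respectively, up to sampling noise.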

### **Proof Using Characteristic Functions**

#### **Step 1: Define the Characteristic Function**

The characteristic function \( \phi_X(t) \) of a random variable \( X \) is defined for every real \( t \) (the expectation always exists because \( |e^{itX}| = 1 \)) as:
\[
\phi_X(t) = E[e^{itX}]
\]
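For a discrete random variable the expectation is a finite sum, so the definition can be evaluated directly. The sketch below is an illustration only: the helper name `char_fn` and the Bernoulli example are assumptions, not part of the proof.

```python
import cmath

def char_fn(values, probs, t):
    """phi_X(t) = E[exp(i t X)] for a discrete X with P(X = values[k]) = probs[k]."""
    return sum(p * cmath.exp(1j * t * x) for x, p in zip(values, probs))

# Bernoulli(p): X = 1 with probability p, X = 0 otherwise.
p, t = 0.3, 0.7
phi = char_fn([0, 1], [1 - p, p], t)
closed_form = (1 - p) + p * cmath.exp(1j * t)

print(phi)
print(closed_form)
```

The two printed values should agree up to rounding, and \( \phi_X(0) = 1 \) for any distribution.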

#### **Step 2: Properties of Characteristic Functions**

We use the following key properties of characteristic functions:
1. For independent random variables \( X \) and \( Y \), \( \phi_{X+Y}(t) = \phi_X(t) \phi_Y(t) \).
2. For a constant \( a \), \( \phi_{aX}(t) = \phi_X(at) \).
3. For a constant \( b \), \( \phi_{X+b}(t) = e^{itb} \phi_X(t) \).
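These three properties can be checked exactly on small discrete distributions. The following sketch assumes distributions are represented as `{value: probability}` dictionaries. The helper names and the example distributions are hypothetical and chosen only for illustration.

```python
import cmath
from itertools import product

def char_fn(dist, t):
    """phi(t) for a discrete distribution given as {value: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in dist.items())

def sum_dist(dx, dy):
    """Distribution of X + Y for independent X ~ dx and Y ~ dy."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

def affine_dist(dx, a, b):
    """Distribution of a * X + b."""
    out = {}
    for x, p in dx.items():
        out[a * x + b] = out.get(a * x + b, 0.0) + p
    return out

X = {0: 0.5, 1: 0.3, 4: 0.2}
Y = {-1: 0.6, 2: 0.4}
t, a, b = 0.9, 2.5, -1.5

# Property 1: phi_{X+Y}(t) = phi_X(t) * phi_Y(t)
err_sum = abs(char_fn(sum_dist(X, Y), t) - char_fn(X, t) * char_fn(Y, t))
# Property 2: phi_{aX}(t) = phi_X(a t)
err_scale = abs(char_fn(affine_dist(X, a, 0.0), t) - char_fn(X, a * t))
# Property 3: phi_{X+b}(t) = exp(i t b) * phi_X(t)
err_shift = abs(char_fn(affine_dist(X, 1.0, b), t) - cmath.exp(1j * t * b) * char_fn(X, t))

print(err_sum, err_scale, err_shift)
```

All three discrepancies should be at the level of floating-point rounding.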

#### **Step 3: Characteristic Function of \( Z_n \)**

Using the properties of characteristic functions, we express the characteristic function of \( Z_n \):

\[
\phi_{Z_n}(t) = E\left[ e^{itZ_n} \right] = E\left[ e^{it \frac{S_n - n\mu}{\sigma \sqrt{n}}} \right] = e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \phi_{S_n} \left( \frac{t}{\sigma \sqrt{n}} \right)
\]
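To spell out which properties are used in the last equality, here is a short worked step. The shorthand \( a_n \) and \( b_n \) is introduced only for this illustration. Write \( Z_n = a_n S_n + b_n \) with

\[
a_n = \frac{1}{\sigma \sqrt{n}}, \qquad b_n = -\frac{n\mu}{\sigma \sqrt{n}}
\]

Property 3 (shift by \( b_n \)) followed by property 2 (scaling by \( a_n \)) then gives

\[
\phi_{Z_n}(t) = e^{itb_n} \phi_{a_n S_n}(t) = e^{itb_n} \phi_{S_n}(a_n t) = e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \phi_{S_n} \left( \frac{t}{\sigma \sqrt{n}} \right)
\]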

#### **Step 4: Characteristic Function of \( S_n \)**

Since \( S_n \) is the sum of \( n \) i.i.d. random variables, the characteristic function of \( S_n \) can be written as:
\[
\phi_{S_n}(t) = \left( \phi_X(t) \right)^n
\]
where \( \phi_X(t) \) is the characteristic function of the individual random variables \( X_i \).
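A concrete instance of this product rule, as a hedged illustration: a Binomial\((n, p)\) variable is a sum of \( n \) i.i.d. Bernoulli\((p)\) variables, so its characteristic function computed from the binomial probabilities should equal the Bernoulli characteristic function raised to the power \( n \). The function names and parameter values below are assumptions.

```python
import cmath
from math import comb

def phi_bernoulli(p, t):
    """Characteristic function of a single Bernoulli(p) variable."""
    return (1 - p) + p * cmath.exp(1j * t)

def phi_binomial(n, p, t):
    """Characteristic function of Binomial(n, p), summed directly from its pmf."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * cmath.exp(1j * t * k)
        for k in range(n + 1)
    )

n, p, t = 12, 0.35, 1.1
gap = abs(phi_binomial(n, p, t) - phi_bernoulli(p, t) ** n)
print(gap)
```

The printed gap should be at the level of floating-point rounding.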

#### **Step 5: Taylor Expansion of \( \phi_X(t) \)**

We now expand \( \phi_X(t) \) around \( t = 0 \) using a Taylor expansion:
\[
\phi_X(t) = 1 + it\mu - \frac{t^2}{2}(\sigma^2 + \mu^2) + o(t^2)
\]
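where \( o(t^2) \) denotes a remainder \( r(t) \) with \( r(t)/t^2 \to 0 \) as \( t \to 0 \). The expansion is valid because \( X \) has a finite second moment, so \( \phi_X \) is twice differentiable at \( 0 \) with \( \phi_X(0) = 1 \), \( \phi_X'(0) = iE[X] = i\mu \), and \( \phi_X''(0) = -E[X^2] = -(\sigma^2 + \mu^2) \). We will see that we don't need the explicit form of the remainder because its contribution vanishes as \( n \to \infty \).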
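The size of the remainder can be checked numerically on a distribution whose characteristic function is known in closed form. The sketch below assumes Exponential(1) summands, for which \( \phi_X(t) = 1/(1 - it) \), \( \mu = 1 \) and \( \sigma^2 = 1 \). This choice and the function names are illustrative assumptions.

```python
def phi_exp(t):
    """Characteristic function of Exponential(1)."""
    return 1.0 / (1.0 - 1j * t)

def quadratic_approx(t, mu=1.0, sigma2=1.0):
    """Second-order Taylor polynomial 1 + i t mu - (t^2 / 2)(sigma^2 + mu^2)."""
    return 1.0 + 1j * t * mu - 0.5 * t**2 * (sigma2 + mu**2)

ratios = []
for t in [0.1, 0.01, 0.001]:
    remainder = abs(phi_exp(t) - quadratic_approx(t))
    ratios.append(remainder / t**2)  # should tend to 0 as t -> 0
    print(t, ratios[-1])
```

The ratio of the remainder to \( t^2 \) should shrink roughly in proportion to \( t \), which is what \( o(t^2) \) requires.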

#### **Step 6: Substitute into \( \phi_{Z_n}(t) \)**

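Substituting the Taylor expansion of \( \phi_X \), evaluated at \( \frac{t}{\sigma \sqrt{n}} \), into the expression for \( \phi_{Z_n}(t) \) and using \( \phi_{S_n} = \left( \phi_X \right)^n \), we get (for fixed \( t \), the remainder \( o\left( \frac{t^2}{\sigma^2 n} \right) \) is \( o\left( \frac{1}{n} \right) \)):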
\[
\begin{aligned}
\phi_{Z_n}(t) &= e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \left( 1 + i\frac{t}{\sigma \sqrt{n}} \mu - \frac{t^2}{2\sigma^2 n} (\sigma^2 + \mu^2) + o\left( \frac{1}{n} \right) \right)^n
\\
&= e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \left( 1 + \frac{1}{n} \left(i\frac{t}{\sigma \sqrt{n}} n\mu - \frac{t^2}{2\sigma^2} (\sigma^2 + \mu^2) + n o\left( \frac{1}{n} \right) \right) \right)^n
\end{aligned}
\]
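As a quick sanity check on this expression, consider the special case \( \mu = 0 \) (an assumption made only for this illustration). The prefactor equals \( 1 \) and the term linear in \( t \) drops out, leaving:

\[
\phi_{Z_n}(t) = \left( 1 + \frac{1}{n} \left( -\frac{t^2}{2} + n\, o\left( \frac{1}{n} \right) \right) \right)^n
\]

Here the inner term tends to the constant \( -\frac{t^2}{2} \). When \( \mu \neq 0 \), the inner term also contains \( i t \sqrt{n}\, \mu / \sigma \), which grows with \( n \), so the general case needs more care.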

#### **Step 7: Simplify and Take Limit**

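Recall the limit form of the exponential function:

\[
e^z = \lim_{n \to \infty} \left( 1 + \frac{z}{n} \right)^n
\]

This holds for a fixed \( z \), and more generally when \( z \) is replaced by a sequence \( z_n \to z \). It cannot be applied directly to our earlier expression

\[
\phi_{Z_n}(t) = e^{-it \frac{n\mu}{\sigma \sqrt{n}}} \left( 1 + \frac{1}{n} \textcolor{red}{\left(i\frac{t}{\sigma \sqrt{n}} n\mu - \frac{t^2}{2\sigma^2} (\sigma^2 + \mu^2) + n o\left( \frac{1}{n} \right) \right)} \right)^n
\]

because the highlighted term contains \( i\frac{t}{\sigma \sqrt{n}} n\mu = it\sqrt{n}\,\mu/\sigma \), which grows like \( \sqrt{n} \) unless \( \mu = 0 \). Instead, we take the logarithm of the \( n \)-th power. Write the bracket as \( 1 + w_n \), where

\[
w_n = \frac{a}{\sqrt{n}} + \frac{b}{n} + o\left( \frac{1}{n} \right), \qquad a = \frac{it\mu}{\sigma}, \qquad b = -\frac{t^2}{2\sigma^2} (\sigma^2 + \mu^2)
\]

Since \( w_n \to 0 \), for large \( n \) we can use the expansion \( \log(1 + w) = w - \frac{w^2}{2} + O(|w|^3) \) of the principal logarithm, together with \( w_n^2 = \frac{a^2}{n} + o\left( \frac{1}{n} \right) \) and \( |w_n|^3 = O(n^{-3/2}) = o\left( \frac{1}{n} \right) \):

\[
n \log(1 + w_n) = a\sqrt{n} + b - \frac{a^2}{2} + n o\left( \frac{1}{n} \right)
\]

The first term cancels the prefactor exactly, because \( e^{-it \frac{n\mu}{\sigma \sqrt{n}}} = e^{-a\sqrt{n}} \). For the constant term, \( a^2 = -\frac{t^2\mu^2}{\sigma^2} \), so

\[
b - \frac{a^2}{2} = -\frac{t^2}{2} - \frac{t^2\mu^2}{2\sigma^2} + \frac{t^2\mu^2}{2\sigma^2} = -\frac{t^2}{2}
\]

Therefore

\[
\phi_{Z_n}(t) = e^{-a\sqrt{n}} \cdot e^{a\sqrt{n} - \frac{t^2}{2} + n o\left( \frac{1}{n} \right)}
\]

\[
= e^{-\frac{t^2}{2}} \cdot e^{o(1)}
\]

Here, \( n o\left( \frac{1}{n} \right) = o(1) \). Since, by definition, \( o(1) \to 0 \) as \( n \to \infty \), then \( e^{o(1)} \to 1 \) as \( n \to \infty \), and we have:

\[
\lim_{n \to \infty} \phi_{Z_n}(t) = e^{-\frac{t^2}{2}}
\]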
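This limit can be checked numerically for a specific distribution. The sketch below assumes Exponential(1) summands (\( \mu = 1 \), \( \sigma = 1 \), \( \phi_X(s) = 1/(1 - is) \)), so that \( \phi_{Z_n}(t) = e^{-it\sqrt{n}} \left( 1 - it/\sqrt{n} \right)^{-n} \). The function name and the chosen values of \( t \) and \( n \) are illustrative assumptions.

```python
import cmath
import math

def phi_zn_exponential(t, n):
    """phi_{Z_n}(t) for Exponential(1) summands (mu = 1, sigma = 1)."""
    s = t / math.sqrt(n)                     # argument t / (sigma sqrt(n))
    log_phi_sn = -n * cmath.log(1 - 1j * s)  # n * log(phi_X(s)), with phi_X(s) = 1 / (1 - i s)
    return cmath.exp(-1j * t * math.sqrt(n) + log_phi_sn)

t = 1.5
target = math.exp(-t**2 / 2)
errors = [abs(phi_zn_exponential(t, n) - target) for n in (10, 1_000, 100_000)]
print(errors)
```

The distance to \( e^{-t^2/2} \) should shrink as \( n \) grows. Working with the logarithm avoids raising a complex number to a very large power directly.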

#### **Step 8: Recognize the Limit**

The function \( e^{-\frac{t^2}{2}} \) is the characteristic function of a standard normal distribution \( N(0,1) \).
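For completeness, here is one standard way to verify this fact; it is a sketch and not the only route. Let \( f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \) be the standard normal density, so that \( x f(x) = -f'(x) \). Differentiating under the integral sign and integrating by parts (the boundary terms vanish because \( f \) decays at \( \pm\infty \)):

\[
\begin{aligned}
\phi_Z'(t) &= \int_{-\infty}^{\infty} ix\, e^{itx} f(x) \, dx = -i \int_{-\infty}^{\infty} e^{itx} f'(x) \, dx
\\
&= i \int_{-\infty}^{\infty} (it)\, e^{itx} f(x) \, dx = -t\, \phi_Z(t)
\end{aligned}
\]

The differential equation \( \phi_Z'(t) = -t\, \phi_Z(t) \) with \( \phi_Z(0) = 1 \) has the unique solution \( \phi_Z(t) = e^{-\frac{t^2}{2}} \).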

#### **Step 9: Apply Lévy's Continuity Theorem**

Lévy's Continuity Theorem states that if a sequence of characteristic functions converges pointwise to a function that is continuous at 0, then the corresponding random variables converge in distribution to a random variable with that characteristic function.

Since \( \phi_{Z_n}(t) \to e^{-\frac{t^2}{2}} \) as \( n \to \infty \), and \( e^{-\frac{t^2}{2}} \) is the characteristic function of \( N(0,1) \), we conclude that \( Z_n \) converges in distribution to a standard normal random variable.

### **Conclusion**

We have shown that the characteristic function of \( Z_n \) converges to the characteristic function of a standard normal random variable. By applying Lévy's Continuity Theorem, we conclude that \( Z_n \) converges in distribution to \( N(0,1) \), thus proving the Central Limit Theorem.

---

### **Note on Characteristic Functions and Probability Distributions**

The one-to-one correspondence between characteristic functions and probability distributions can be established rigorously via the Fourier transform and its inversion. Specifically, the characteristic function \( \phi_X(t) \) of a random variable \( X \) is, up to sign and normalization conventions, the Fourier transform of the probability density function of \( X \) when a density exists, and more generally the Fourier-Stieltjes transform of the distribution of \( X \). This connection is central to the study of convergence in distribution and to results such as the Central Limit Theorem.

#### **Fourier Transform and Inversion**

Given a probability density function \( f_X(x) \) for a random variable \( X \), its characteristic function is defined as:
\[
\phi_X(t) = \int_{-\infty}^{\infty} e^{itx} f_X(x) \, dx
\]
This is the Fourier transform of the probability density function \( f_X(x) \), up to the sign convention in the exponent.

Conversely, if the characteristic function \( \phi_X(t) \) is absolutely integrable, then \( X \) has a continuous probability density function \( f_X(x) \), which can be recovered using the inverse Fourier transform:
\[
f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \phi_X(t) \, dt
\]
Thus, whenever \( \phi_X \) is absolutely integrable, the characteristic function and the probability density function determine each other.
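The inversion formula can be tried numerically on the standard normal case, where \( \phi_X(t) = e^{-\frac{t^2}{2}} \) is integrable. The following is a minimal sketch that approximates the integral with a Riemann sum on a truncated grid. The function name, the truncation `t_max`, and the grid size are assumptions chosen for illustration.

```python
import numpy as np

def density_from_char_fn(phi, x, t_max=12.0, num=24_001):
    """Approximate (1 / (2 pi)) * integral of exp(-i t x) phi(t) dt on [-t_max, t_max]."""
    t = np.linspace(-t_max, t_max, num)
    dt = t[1] - t[0]
    integrand = np.exp(-1j * t * x) * phi(t)
    return (integrand.sum() * dt).real / (2 * np.pi)

def phi_normal(t):
    """Characteristic function of N(0, 1)."""
    return np.exp(-t**2 / 2)

x = 0.8
approx = density_from_char_fn(phi_normal, x)
exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
print(approx, exact)
```

The recovered value should match the standard normal density at \( x \) closely, since the integrand decays very quickly.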

#### **Uniqueness and Continuity**

One of the key results in probability theory is that the characteristic function uniquely determines the distribution of a random variable. This is a consequence of the fact that the Fourier transform is injective, meaning different probability distributions have distinct characteristic functions.

Furthermore, the continuity of characteristic functions plays an important role in proving convergence in distribution. Lévy’s Continuity Theorem relies on this fact: if a sequence of characteristic functions converges pointwise to a limiting function that is continuous at \( t = 0 \), then this limiting function is the characteristic function of some random variable, and the corresponding sequence of random variables converges in distribution to that random variable.

#### **Application to Central Limit Theorem**

In the proof of the Central Limit Theorem, we showed that the characteristic function of the normalized sum \( Z_n \), \( \phi_{Z_n}(t) \), converges to the characteristic function of the standard normal distribution:
\[
\lim_{n \to \infty} \phi_{Z_n}(t) = e^{-\frac{t^2}{2}}
\]
Two separate facts complete the argument. Uniqueness identifies \( e^{-\frac{t^2}{2}} \) as the characteristic function of \( N(0,1) \) and of no other distribution. Lévy's Continuity Theorem then turns the pointwise convergence of the characteristic functions into convergence in distribution of \( Z_n \) to \( N(0,1) \). Injectivity of the Fourier transform alone would not be enough: it identifies the limit but says nothing about convergence.

In this way, the theory of Fourier transforms and characteristic functions not only provides a powerful tool for analyzing convergence in distribution but also guarantees the unique correspondence between characteristic functions and probability distributions.
