diff --git a/lectures/applications/heterogeneity.md b/lectures/applications/heterogeneity.md
index 84bb4cfa..28e91f80 100644
--- a/lectures/applications/heterogeneity.md
+++ b/lectures/applications/heterogeneity.md
@@ -101,12 +101,12 @@ When treatment is randomly assigned, we can estimate average treatment effects
 because

 $$
-\begin{align*}
+\begin{aligned}
 E[y_i(1) - y_i(0) ] = & E[y_i(1)] - E[y_i(0)] \\
 & \text{random assignment } \\
 = & E[y_i(1) | d_i = 1] - E[y_i(0) | d_i = 0] \\
 = & E[y_i | d_i = 1] - E[y_i | d_i = 0 ]
-\end{align*}
+\end{aligned}
 $$

 ### Average Treatment Effects
@@ -164,12 +164,12 @@ logic that lets us estimate unconditional average treatment effects also
 suggests that we can estimate conditional average treatment effects.

 $$
-\begin{align*}
+\begin{aligned}
 E[y_i(1) - y_i(0) |X_i=x] = & E[y_i(1)|X_i = x] - E[y_i(0)|X_i=x] \\
 & \text{random assignment } \\
 = & E[y_i(1) | d_i = 1, X_i=x] - E[y_i(0) | d_i = 0, X_i=x] \\
 = & E[y_i | d_i = 1, X_i = x] - E[y_i | d_i = 0, X_i=x ]
-\end{align*}
+\end{aligned}
 $$

 Conditional average treatment effects tell us whether there are
@@ -209,7 +209,6 @@ $S(x)$ approximates $s_0(x)$ is to look at the best linear projection of
 $s_0(x)$ on $S(x)$.

 $$
-\DeclareMathOperator*{\argmin}{arg\,min}
 \beta_0, \beta_1 = \argmin_{b_0,b_1} E[(s_0(x) - b_0 - b_1 (S(x)-E[S(x)]))^2]
 $$
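The difference-in-means identity in the first hunk above is easy to verify numerically. Below is a minimal sketch, assuming simulated data with a known treatment effect of $1 + 0.5x$; the names (`y0`, `y1`, `d`) are illustrative and do not come from the lecture.

```python
# Check that, under random assignment, the difference in observed group means
# recovers E[y(1) - y(0)]; simulated data with a true average effect of 1.0.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
d = rng.integers(0, 2, size=n)      # randomly assigned treatment indicator
y0 = x + rng.normal(size=n)         # potential outcome without treatment
y1 = y0 + 1.0 + 0.5 * x             # heterogeneous effect: 1 + 0.5 x
y = np.where(d == 1, y1, y0)        # only one potential outcome is observed

print(np.mean(y1 - y0))                     # true ATE, approximately 1.0
print(y[d == 1].mean() - y[d == 0].mean())  # difference in means, also ~1.0
```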
diff --git a/lectures/applications/regression.md b/lectures/applications/regression.md
index 0545d969..e3d5f81e 100644
--- a/lectures/applications/regression.md
+++ b/lectures/applications/regression.md
@@ -123,7 +123,7 @@ only the livable square footage of the home.
 The linear regression model for this situation is

 $$
-\log(\text{price}) = \beta_0 + \beta_1 \text{sqft_living} + \epsilon
+\log(\text{price}) = \beta_0 + \beta_1 \text{sqft\_living} + \epsilon
 $$

 $\beta_0$ and $\beta_1$ are called parameters (also coefficients or
@@ -132,14 +132,14 @@ that best fit the data.

 $\epsilon$ is the error term. It would be unusual for the observed
 $\log(\text{price})$ to be an exact linear function of
-$\text{sqft_living}$. The error term captures the deviation of
-$\log(\text{price})$ from a linear function of $\text{sqft_living}$.
+$\text{sqft\_living}$. The error term captures the deviation of
+$\log(\text{price})$ from a linear function of $\text{sqft\_living}$.

 The linear regression algorithm will choose the parameters that minimize the
 *mean squared error* (MSE) function, which for our example is written.

 $$
-\frac{1}{N} \sum_{i=1}^N \left(\log(\text{price}_i) - (\beta_0 + \beta_1 \text{sqft_living}_i) \right)^2
+\frac{1}{N} \sum_{i=1}^N \left(\log(\text{price}_i) - (\beta_0 + \beta_1 \text{sqft\_living}_i) \right)^2
 $$

 The output of this algorithm is the straight line (hence linear) that passes as
@@ -218,7 +218,7 @@ Suppose that in addition to `sqft_living`, we also wanted to use the `bathrooms`
 In this case, the linear regression model is

 $$
-\log(\text{price}) = \beta_0 + \beta_1 \text{sqft_living} +
+\log(\text{price}) = \beta_0 + \beta_1 \text{sqft\_living} +
 \beta_2 \text{bathrooms} + \epsilon
 $$

@@ -227,7 +227,7 @@ We could keep adding one variable at a time, along with a new $\beta_{j}$ coeffi
 Let's write this equation in vector/matrix form as

 $$
-\underbrace{\begin{bmatrix} \log(\text{price}_1) \\ \log(\text{price}_2) \\ \vdots \\ \log(\text{price}_N)\end{bmatrix}}_Y = \underbrace{\begin{bmatrix} 1 & \text{sqft_living}_1 & \text{bathrooms}_1 \\ 1 & \text{sqft_living}_2 & \text{bathrooms}_2 \\ \vdots & \vdots & \vdots \\ 1 & \text{sqft_living}_N & \text{bathrooms}_N \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}}_{\beta} + \epsilon
+\underbrace{\begin{bmatrix} \log(\text{price}_1) \\ \log(\text{price}_2) \\ \vdots \\ \log(\text{price}_N)\end{bmatrix}}_Y = \underbrace{\begin{bmatrix} 1 & \text{sqft\_living}_1 & \text{bathrooms}_1 \\ 1 & \text{sqft\_living}_2 & \text{bathrooms}_2 \\ \vdots & \vdots & \vdots \\ 1 & \text{sqft\_living}_N & \text{bathrooms}_N \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}}_{\beta} + \epsilon
 $$

 Notice that we can add as many columns to $X$ as we'd like and the linear
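Since the last hunk touches the matrix form $Y = X\beta + \epsilon$, here is a minimal sketch of how that form maps onto code; the data and coefficient values are synthetic and invented for illustration, and only the column names mirror the lecture's `sqft_living` and `bathrooms`.

```python
# Fit log(price) = X beta + eps by least squares, where X stacks a constant,
# sqft_living, and bathrooms, as in the matrix equation above (synthetic data).
import numpy as np

rng = np.random.default_rng(1)
N = 500
sqft_living = rng.uniform(500, 4000, size=N)
bathrooms = rng.integers(1, 5, size=N).astype(float)
log_price = 11.0 + 4e-4 * sqft_living + 0.05 * bathrooms + rng.normal(0, 0.1, N)

X = np.column_stack([np.ones(N), sqft_living, bathrooms])  # N x 3 design matrix
beta, *_ = np.linalg.lstsq(X, log_price, rcond=None)       # minimizes the MSE
print(beta)  # approximately [11.0, 0.0004, 0.05]
```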
diff --git a/lectures/problem_sets/problem_set_3.md b/lectures/problem_sets/problem_set_3.md
index baa5fb79..dfcc6cb2 100644
--- a/lectures/problem_sets/problem_set_3.md
+++ b/lectures/problem_sets/problem_set_3.md
@@ -197,10 +197,10 @@ face value $M$, yield to maturity $i$, and periods to maturity $N$ is

 $$
-\begin{align*}
+\begin{aligned}
 P &= \left(\sum_{n=1}^N \frac{C}{(i+1)^n}\right) + \frac{M}{(1+i)^N} \\
 &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N}
-\end{align*}
+\end{aligned}
 $$

 In the code cell below, we have defined variables for `i`, `M` and `C`.
diff --git a/lectures/python_fundamentals/functions.md b/lectures/python_fundamentals/functions.md
index c0c460c0..cdf28262 100644
--- a/lectures/python_fundamentals/functions.md
+++ b/lectures/python_fundamentals/functions.md
@@ -633,10 +633,10 @@ that can be interchanged. That is, the following are identical.

 $$
-\begin{eqnarray}
+\begin{aligned}
 f(K, L) &= z\, K^{\alpha} L^{1-\alpha}\\
 f(K_2, L_2) &= z\, K_2^{\alpha} L_2^{1-\alpha}
-\end{eqnarray}
+\end{aligned}
 $$

 The same concept applies to Python functions, where the arguments are just
diff --git a/lectures/scientific/applied_linalg.md b/lectures/scientific/applied_linalg.md
index 44e1e2fa..45e7cecc 100644
--- a/lectures/scientific/applied_linalg.md
+++ b/lectures/scientific/applied_linalg.md
@@ -343,11 +343,11 @@ $\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}$ then we can multiply both sides b
 to get

 $$
-\begin{align*}
+\begin{aligned}
 \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 3 \\ 4 \end{bmatrix} \\
 I \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} \\
 \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix}
-\end{align*}
+\end{aligned}
 $$

 Computing the inverse requires that a matrix be square and satisfy some other conditions
diff --git a/lectures/scientific/numpy_arrays.md b/lectures/scientific/numpy_arrays.md
index a01addc6..072e5878 100644
--- a/lectures/scientific/numpy_arrays.md
+++ b/lectures/scientific/numpy_arrays.md
@@ -521,10 +521,10 @@ face value $M$, yield to maturity $i$, and periods to maturity $N$ is

 $$
-\begin{align*}
+\begin{aligned}
 P &= \left(\sum_{n=1}^N \frac{C}{(i+1)^n}\right) + \frac{M}{(1+i)^N} \\
 &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N}
-\end{align*}
+\end{aligned}
 $$

 In the code cell below, we have defined variables for `i`, `M` and `C`.
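The bond-pricing identity above, which appears in both problem_set_3.md and numpy_arrays.md, can be confirmed in a few lines. A minimal sketch follows; the parameter values are illustrative, not the ones defined in the lectures' code cells.

```python
# Price the bond two ways: the summation form and the closed form,
# matching the two lines of the display above (illustrative parameters).
import numpy as np

i, M, C, N = 0.05, 100.0, 4.0, 10
P_sum = np.sum(C / (1 + i) ** np.arange(1, N + 1)) + M / (1 + i) ** N
P_closed = C * (1 - (1 + i) ** -N) / i + M * (1 + i) ** -N
print(P_sum, P_closed)  # both approximately 92.28, so the two forms agree
```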
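Similarly, the inverse-based derivation in the applied_linalg.md hunk has a direct numerical counterpart. A small sketch is below; the preference for `np.linalg.solve` over an explicit inverse is standard practice, not something the hunk itself changes.

```python
# Solve the 2x2 system from the applied_linalg.md derivation above.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([3.0, 4.0])

x_inv = np.linalg.inv(A) @ b     # literal translation of the derivation
x_solve = np.linalg.solve(A, b)  # equivalent, and numerically preferable
print(x_inv, x_solve)            # both give x1 = 1.0, x2 = 1.0
```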