# Marginal Effects
When a model contains no polynomial or interaction terms, the marginal effects are easy to interpret: they are simply the coefficients.
For example,
$$
Y = \beta_1 X_1 + \beta_2 X_2
$$
where $\beta$ are the marginal effects.
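This can be checked directly in R; a minimal sketch (the model `mpg ~ wt + hp` is chosen purely for illustration):

```{r}
# in a linear model without interactions, the coefficient *is* the marginal effect
mod <- lm(mpg ~ wt + hp, data = mtcars)
coef(mod)["wt"]  # marginal effect of wt on mpg, identical at every observation
```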
Numerical differentiation is usually easier than analytical differentiation.
- To compute the marginal effect of $X$ numerically, we need to choose values for all the variables in the model.
Analytical derivation
$$
f'(x) \equiv \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
$$
E.g., $f(x) = x^2$
$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{(x+h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
&= \lim_{h \to 0} (2x + h) \\
&= 2x
\end{aligned}
$$
For the numerical approach, we "just" need to find a small $h$ to plug into our function. However, $h$ must also be large enough for the computation to be numerically accurate [@gould2010maximum, chapter 1].
Numerical approach
One-sided derivative
$$
f'_1(x) \approx \frac{f(x+h) - f(x)}{h}
$$
Alternatively, two-sided derivative
$$
f'_2(x) \approx \frac{f(x+h) - f(x- h)}{2h}
$$
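The two approximations are easy to check numerically. A minimal sketch, reusing the example $f(x) = x^2$ at $x = 3$, where the true derivative is $2x = 6$:

```{r}
# compare one-sided and two-sided numerical derivatives of f(x) = x^2
f <- function(x) x^2
x <- 3
h <- 1e-5

one_sided <- (f(x + h) - f(x)) / h           # equals 2x + h, error of order h
two_sided <- (f(x + h) - f(x - h)) / (2 * h) # error of order h^2

c(one_sided = one_sided, two_sided = two_sided, analytical = 2 * x)
```

The two-sided version has a smaller approximation error (order $h^2$ rather than $h$), which is why it is often preferred.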
Marginal effects for
- discrete variables (also known as incremental effects) are the change in $E[Y|X]$ for a one unit change in $X$
- continuous variables are the change in $E[Y|X]$ for very small changes in $X$ (not unit changes), because it's a derivative, which is a limit when $h \to 0$
| | Analytical derivation | Numerical derivation |
|-----------------|-----------------|---------------------------------------|
| Marginal Effects | Rules of expectations | Approximate analytical solution |
| Standard Errors | Rules of variances | Delta method using the asymptotic errors (vcov matrix) |
## Delta Method
- approximate the mean and variance of a function of random variables using a first-order Taylor approximation
- A semi-parametric method
- Alternative approaches:
- Analytically derive a probability function for the margin
- Simulation/Bootstrapping
- Resources:
- Advanced: [modmarg](https://cran.r-project.org/web/packages/modmarg/vignettes/delta-method.html)
- Intermediate: [UCLA stat](https://stats.oarc.ucla.edu/r/faq/how-can-i-estimate-the-standard-error-of-transformed-regression-parameters-in-r-using-the-delta-method/)
- Simple: [Another one](https://www.alexstephenson.me/post/2022-04-02-standard-errors-and-the-delta-method/)
Let $G(\beta)$ be a function of the parameters $\beta$, then
$$
var(G(\beta)) \approx \nabla G(\beta) cov (\beta) \nabla G(\beta)'
$$
where
- $\nabla G(\beta)$ = the gradient of the partial derivatives of $G(\beta)$ (also known as the Jacobian)
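A minimal sketch of this formula in base R, using a hypothetical quantity of interest $G(\beta) = \beta_{hp} / \beta_{wt}$ (chosen purely for illustration):

```{r}
# delta-method standard error for G(beta) = beta_hp / beta_wt, by hand
mod <- lm(mpg ~ wt + hp, data = mtcars)
b <- coef(mod)  # (Intercept), wt, hp
V <- vcov(mod)

g_hat <- b["hp"] / b["wt"]

# gradient of G w.r.t. (intercept, wt, hp); the intercept does not enter G
grad <- c(0, -b["hp"] / b["wt"]^2, 1 / b["wt"])

# var(G) ~ grad' V grad
se_g <- sqrt(drop(t(grad) %*% V %*% grad))
c(estimate = unname(g_hat), se = se_g)
```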
## Average Marginal Effect Algorithm
For the one-sided derivative $\frac{\partial p(\mathbf{X},\beta)}{\partial X}$ on the probability scale:
1.  Estimate your model
2.  For each observation $i$:
    1.  Calculate $\hat{Y}_{i0}$, the prediction on the probability scale using the observed values.
    2.  Increase $X$ (the variable of interest) by a "small" amount $h$ ($X_{new} = X + h$)
        -   When $X$ is continuous, $h = (|\bar{X}| + 0.001) \times 0.001$, where $\bar{X}$ is the mean value of $X$
        -   When $X$ is discrete, $h = 1$
    3.  Calculate $\hat{Y}_{i1}$, the prediction on the probability scale using the new $X$ and the other variables' observed values.
    4.  Calculate the difference between the two predictions as a fraction of $h$: $\frac{\hat{Y}_{i1} - \hat{Y}_{i0}}{h}$
3.  The average numerical derivative is $E[\frac{\hat{Y}_{i1} - \hat{Y}_{i0}}{h}] \approx \frac{\partial p (Y|\mathbf{X}, \beta)}{\partial X}$
Two-sided derivative:
1.  Estimate your model
2.  For each observation $i$:
    1.  Calculate $\hat{Y}_{i0}$, the prediction on the probability scale using the observed values.
    2.  Increase $X$ (the variable of interest) by a "small" amount $h$ ($X_{1} = X + h$), and decrease it by the same amount ($X_{2} = X - h$)
        -   When $X$ is continuous, $h = (|\bar{X}| + 0.001) \times 0.001$, where $\bar{X}$ is the mean value of $X$
        -   When $X$ is discrete, $h = 1$
    3.  Calculate $\hat{Y}_{i1}$, the prediction on the probability scale using $X_1$ and the other variables' observed values.
    4.  Calculate $\hat{Y}_{i2}$, the prediction on the probability scale using $X_2$ and the other variables' observed values.
    5.  Calculate the difference between the two predictions as a fraction of $2h$: $\frac{\hat{Y}_{i1} - \hat{Y}_{i2}}{2h}$
3.  The average numerical derivative is $E[\frac{\hat{Y}_{i1} - \hat{Y}_{i2}}{2h}] \approx \frac{\partial p (Y|\mathbf{X}, \beta)}{\partial X}$
```{r}
library(margins)
library(tidyverse)

data("mtcars")
mod <- lm(mpg ~ cyl * disp * hp, data = mtcars)
margins::margins(mod) %>% summary()

# one-sided numerical average marginal effect for a single variable
get_mae <- function(mod, var = "disp") {
    data <- mod$model
    pred <- predict(mod)
    # step size: small for continuous variables, 1 for discrete ones
    if (is.numeric(data[[var]])) {
        h <- (abs(mean(data[[var]])) + 0.01) * 0.01
    } else {
        h <- 1
    }
    data[[var]] <- data[[var]] - h
    pred_new <- predict(mod, newdata = data)
    mean(pred - pred_new) / h
}
get_mae(mod, var = "disp")
```
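The two-sided algorithm above can be sketched the same way; `get_mae2` is a hypothetical helper mirroring the one-sided version:

```{r}
# two-sided numerical average marginal effect for a single variable
mod <- lm(mpg ~ cyl * disp * hp, data = mtcars)

get_mae2 <- function(mod, var = "disp") {
    data_up <- mod$model
    data_down <- mod$model
    # step size: small for continuous variables, 1 for discrete ones
    if (is.numeric(mod$model[[var]])) {
        h <- (abs(mean(mod$model[[var]])) + 0.01) * 0.01
    } else {
        h <- 1
    }
    data_up[[var]] <- data_up[[var]] + h
    data_down[[var]] <- data_down[[var]] - h
    # difference of the two shifted predictions, as a fraction of 2h
    mean(predict(mod, newdata = data_up) -
             predict(mod, newdata = data_down)) / (2 * h)
}
get_mae2(mod, var = "disp")
```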
## Packages
### MarginalEffects
The `marginaleffects` package is the successor to `margins` and `emmeans::emtrends` (faster, more efficient, more extensible), so it is the recommended choice.
-   One limitation is that it has no built-in function to correct for multiple comparisons. You can apply `p.adjust` to the resulting p-values to overcome this disadvantage.
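For example, `p.adjust` can be applied to p-values taken from a `marginaleffects` summary table (the vector below is hypothetical):

```{r}
# Holm correction applied to a (hypothetical) vector of marginal-effect p-values
p_raw <- c(hp = 0.010, wt = 0.004, am = 0.038)
p.adjust(p_raw, method = "holm")
```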
Definitions from the package:
- **Marginal effects** are slopes or derivatives (i.e., effect of changes in a variable on the outcome)
- `margins` package defines marginal effects as "partial derivatives of the regression equation with respect to each variable in the model for each unit in the data."
- **Marginal means** are averages or integrals (i.e., marginalizing across rows of a prediction grid)
To customize a `plot_cme` plot (which returns a `ggplot` object), see this [post](https://stackoverflow.com/questions/72463092/estimate-marginal-effect-in-triple-interaction) by the author of the `marginaleffects` package.
The package also supports causal inference with the parametric g-formula:
-   the plug-in g estimator is equivalent to the average contrast computed by `marginaleffects`.
To get predicted values
```{r}
library(marginaleffects)
library(tidyverse)
data(mtcars)
mod <- lm(mpg ~ hp * wt * am, data = mtcars)
predictions(mod) %>% head()
# for specific reference values
predictions(mod, newdata = datagrid(am = 0, wt = c(2, 4)))
plot_cap(mod, condition = c("hp","wt"))
```
```{r}
# Average Marginal Effects
mfx <- marginaleffects(mod, variables = c("hp", "wt"))
summary(mfx)
# Group-Average Marginal Effects
marginaleffects::marginaleffects(mod, by = "hp", variables = "am")
# Marginal effects at representative values
marginaleffects::marginaleffects(mod,
newdata = datagrid(am = 0,
wt = c(2, 4)))
# Marginal Effects at the Mean
marginaleffects::marginaleffects(mod, newdata = "mean")
```
```{r}
# counterfactual
comparisons(mod, variables = list(am = 0:1)) %>% summary()
```
### margins
- Marginal effects are the partial derivatives of the regression equation with respect to each variable in the model, for each unit in the data
- Average Partial Effects: the contribution of each variable on the outcome scale, conditional on the other variables involved in the link-function transformation of the linear predictor
- Average Marginal Effects: the marginal contribution of each variable on the scale of the linear predictor
- Average marginal effects are the mean of these unit-specific partial derivatives over some sample
The `margins` package gives the marginal effects of models (a replication of the `margins` command in Stata).
The `prediction` package gives unit-specific and sample-average predictions of models (similar to predictive margins in Stata).
```{r}
library(margins)
# examples by the package's authors
mod <- lm(mpg ~ cyl * hp + wt, data = mtcars)
summary(mod)
```
In cases where you have interaction or polynomial terms, the coefficient estimates cannot be interpreted as the marginal effects of $X$ on $Y$. Hence, if you want the average marginal effect of each variable:
```{r}
summary(margins(mod))
# equivalently
margins_summary(mod)
plot(margins(mod))
```
Marginal effects at the mean (**MEM**):
- Marginal effects at the mean values of the sample
- For discrete variables, it's called average discrete change (**ADC**)
Average Marginal Effects (**AME**):
- the average of the marginal effects computed at each observed value in the sample
Marginal Effects at representative values (**MER**)
```{r}
margins(mod, at = list(hp = 150))
margins(mod, at = list(hp = 150)) %>% summary()
```
### mfx
Works well with [Generalized Linear Models] fitted via `glm`
For technical details, see the package [vignette](https://cran.rstudio.com/web/packages/mfx/vignettes/mfxarticle.pdf)
| Model | Dependent Variable | Syntax |
|-------------------|--------------------|--------------|
| Probit | Binary | `probitmfx` |
| Logit | Binary | `logitmfx` |
| Poisson | Count | `poissonmfx` |
| Negative Binomial | Count | `negbinmfx` |
| Beta | Rate | `betamfx` |
```{r}
library(mfx)
data("mtcars")
poissonmfx(formula = vs ~ mpg * cyl * disp, data = mtcars)
```
This package can only give the marginal effect of each variable in the `glm` model, not the average marginal effect we might be looking for.