Doc and vignette updates

ngreifer · Nov 12, 2024 · 57426b3 · 57426b3
1 parent 9282bb9
commit 57426b3
Show file tree

Hide file tree

Showing 4 changed files with 16 additions and 28 deletions.
diff --git a/man/matchit.Rd b/man/matchit.Rd
diff --git a/vignettes/assessing-balance.Rmd b/vignettes/assessing-balance.Rmd
@@ -281,7 +281,7 @@ love.plot(m.out, stats = c("m", "ks"), poly = 2, abs = TRUE,
           position = "bottom")
 ```
 
-The `love.plot()` documentation explains what each of these arguments do and the several other ones available. See `vignette("cobalt_A4_love.plot", package = "cobalt")` for other advanced customization of `love.plot()`.
+The `love.plot()` documentation explains what each of these arguments do and the several other ones available. See `vignette("love.plot", package = "cobalt")` for other advanced customization of `love.plot()`.
 
 ### `bal.plot()`
 

diff --git a/vignettes/matching-methods.Rmd b/vignettes/matching-methods.Rmd
@@ -32,7 +32,7 @@ Matching is nonparametric in the sense that the estimated weights and pruning of
 
 It is important to note that this implementation of matching differs from the methods described by Abadie and Imbens [-@abadie2006; -@abadie2016] and implemented in the `Matching` R package and `teffects` routine in Stata. That form of matching is *matching imputation*, where the missing potential outcomes for each unit are imputed using the observed outcomes of paired units. This is a critical distinction because matching imputation is a specific estimation method with its own effect and standard error estimators, in contrast to subset selection, which is a preprocessing method that does not require specific estimators and is broadly compatible with other parametric and nonparametric analyses. The benefits of matching imputation are that its theoretical properties (i.e., the rate of convergence and asymptotic variance of the estimator) are well understood, it can be used in a straightforward way to estimate not just the average treatment effect in the treated (ATT) but also the average treatment effect in the population (ATE), and additional effective matching methods can be used in the imputation (e.g., kernel matching). The benefits of matching as nonparametric preprocessing are that it is far more flexible with respect to the types of effects that can be estimated because it does not involve any specific estimator, its empirical and finite-sample performance has been examined in depth and is generally well understood, and it aligns well with the design of experiments, which are more familiar to non-technical audiences.
 
-In addition to subset selection, matching often (though not always) involves a form of *stratification*, the assignment of units to pairs or strata containing multiple units. The distinction between subset selection and stratification is described by @zubizarreta2014, who separate them into two separate steps. In `MatchIt`, with almost all matching methods, subset selection is performed by stratification; for example, treated units are paired with control units, and unpaired units are then dropped from the matched sample. With some methods, subclasses are used to assign matching or stratification weights to individual units, which increase or decrease each unit's leverage in a subsequent analysis. There has been some debate about the importance of stratification after subset selection; while some authors have argued that, with some forms of matching, pair membership is incidental [@stuart2008; @schafer2008], others have argued that correctly incorporating pair membership into effect estimation can improve the quality of inferences [@austin2014a; @wan2019]. For methods that allow it, `MatchIt` includes stratum membership as an additional output of each matching specification. How these strata can be used is detailed in `vignette("Estimating Effects")`.
+In addition to subset selection, matching often (though not always) involves a form of *stratification*, the assignment of units to pairs or strata containing multiple units. The distinction between subset selection and stratification is described by @zubizarreta2014, who separate them into two separate steps. In `MatchIt`, with almost all matching methods, subset selection is performed by stratification; for example, treated units are paired with control units, and unpaired units are then dropped from the matched sample. With some methods, subclasses are used to assign matching or stratification weights to individual units, which increase or decrease each unit's leverage in a subsequent analysis. There has been some debate about the importance of stratification after subset selection; while some authors have argued that, with some forms of matching, pair membership is incidental [@stuart2008; @schafer2008], others have argued that correctly incorporating pair membership into effect estimation can improve the quality of inferences [@austin2014a; @wan2019]. For methods that allow it, `MatchIt` includes stratum membership as an additional output of each matching specification. How these strata can be used is detailed in `vignette("estimating-effects")`.
 
 At the heart of `MatchIt` are three classes of methods: distance matching, stratum matching, and pure subset selection. *Distance matching* involves considering a focal group (usually the treated group) and selecting members of the non-focal group (i.e., the control group) to pair with each member of the focal group based on the *distance* between units, which can be computed in one of several ways. Members of either group that are not paired are dropped from the sample. Nearest neighbor matching (`method = "nearest"`), optimal pair matching (`method = "optimal"`), optimal full matching (`method = "full"`), generalized full matching (`method = "quick"`), and genetic matching (`method = "genetic"`) are the methods of distance matching implemented in `MatchIt`. Typically, only the average treatment in the treated (ATT) or average treatment in the control (ATC), if the control group is the focal group, can be estimated after distance matching in `MatchIt` (full matching is an exception, described later).
 

diff --git a/vignettes/sampling-weights.Rmd b/vignettes/sampling-weights.Rmd
@@ -146,7 +146,7 @@ Note that had we not added sampling weights to `mF`, the matching specification
 
 ## Estimating the Effect
 
-Estimating the treatment effect after matching is straightforward when using sampling weights. Effects are estimated in the same way as when sampling weights are excluded, except that the matching weights must be multiplied by the sampling weights to yield accurate, generalizable estimates. `match.data()` and `get_matches()` do this automatically, so the weights produced by these functions already are a product of the matching weights and the sampling weights. Note this will only be true if sampling weights are incorporated into the `matchit` object.
+Estimating the treatment effect after matching is straightforward when using sampling weights. Effects are estimated in the same way as when sampling weights are excluded, except that the matching weights must be multiplied by the sampling weights for use in the outcome model to yield accurate, generalizable estimates. `match.data()` and `get_matches()` do this automatically, so the weights produced by these functions already are a product of the matching weights and the sampling weights. Note this will only be true if sampling weights are incorporated into the `matchit` object. With `avg_comparisons()`, only the sampling weights should be included when estimating the treatment effect.
 
 Below we estimate the effect of `A` on `Y_C` in the matched and sampling weighted sample, adjusting for the covariates to improve precision and decrease bias.
 
@@ -165,7 +165,7 @@ avg_comparisons(fit,
                 wts = "SW")
 ```
 
-Note that `match.data()` and `get_weights()` have the option `include.s.weights`, which, when set to `FALSE`, makes it so the returned weights do not incorporate the sampling weights and are simply the matching weights. Because one might to forget to multiply the two sets of weights together, it is easier to just use the default of `include.s.weights = TRUE` and ignore the sampling weights in the rest of the analysis (because they are already included in the returned weights). `avg_comparisons()` also works more smoothly when the weights supplied to `weights` is a single variable rather than the product of two.
+Note that `match.data()` and `get_weights()` have the option `include.s.weights`, which, when set to `FALSE`, makes it so the returned weights do not incorporate the sampling weights and are simply the matching weights. Because one might to forget to multiply the two sets of weights together, it is easier to just use the default of `include.s.weights = TRUE` and ignore the sampling weights in the rest of the analysis (because they are already included in the returned weights).
 
 ## Code to Generate Data used in Examples