In this repository we provide the code for the BCF-IV and BCF-ITT functions of the paper "Heterogeneous causal effects with imperfect compliance: a Bayesian machine learning approach" by F.J. Bargagli-Stoffi, K. De Witte and G. Gnecco published in The Annals of Applied Statistics.
The article has also been covered and summarized in two blog posts on R-bloggers and YoungStatS. Check them out for a concise summary of the main novelties introduced in the paper.
Install the latest development version:
library(devtools)
install_github("fbargaglistoffi/BCF-IV", ref="master")
Import:
library("BayesIV")
Attention: BayesIV depends on the bcf package, which has unfortunately been removed from CRAN due to an unaddressed issue. To run the BayesIV package, manually install the bcf package from GitHub, following its installation guidelines.
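As a minimal sketch of the dependency check described above, the snippet below verifies that both packages are available before loading BayesIV. The helper `has_package` is a hypothetical name introduced here for illustration; it only wraps base R's `requireNamespace()` and is not part of BayesIV or bcf.

```r
# Hypothetical helper (not part of BayesIV): TRUE if the package can be loaded.
# requireNamespace() returns FALSE instead of raising an error when it is missing.
has_package <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Collect the required packages that are not installed.
required <- c("bcf", "BayesIV")
missing_pkgs <- Filter(function(pkg) !has_package(pkg), required)

if (length(missing_pkgs) > 0) {
  message("Install from GitHub before continuing: ",
          paste(missing_pkgs, collapse = ", "))
} else {
  library("BayesIV")
}
```

Reporting the missing packages with `message()` rather than `stop()` keeps the check usable at the top of a script without aborting it.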
The bcf_iv function discovers and estimates, in an interpretable manner, effect heterogeneity in settings where the assignment mechanism is irregular (e.g., instrumental variable and fuzzy regression discontinuity scenarios). The function is built specifically to discover and estimate heterogeneity in the Complier Average Causal Effect (CACE). It takes as inputs:
- y: the outcome variable;
- w: the treatment receipt variable (binary);
- z: the treatment assignment variable (binary);
- x: the covariate matrix;
- binary: TRUE if the outcome variable is binary, FALSE otherwise (default: FALSE);
- n_burn: the number of iterations discarded by the BCF-IV algorithm for the burn-in (default: 500);
- n_sim: the number of iterations used by the BCF-IV algorithm to get the posterior distribution of the estimands (default: 500);
- inference_ratio: the ratio of observations to be assigned to the inference subsample (default: 0.5);
- max_depth: the maximal depth of the tree generated by the function (default: 2);
- cp: the complexity parameter for the generated CART (default: 0.01);
- minsplit: the minimum number of observations needed to perform a binary split in the tree (default: 10);
- adj_method: the p-value adjustment method; options are "holm", "bonferroni", "hochberg", "hommel", "BH", "BY", "fdr", "none" (default: "holm");
- seed: random seed for reproducible results (default: 42).
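The adj_method option names match those accepted by base R's `p.adjust()`. Whether bcf_iv calls `p.adjust()` internally is an assumption here, but the sketch below shows what the adjustment does to a vector of subgroup p-values. The raw p-values are made up for illustration.

```r
# Raw p-values for three hypothetical subgroups (made-up numbers).
p_raw <- c(0.01, 0.02, 0.03)

# Holm step-down adjustment, the documented default for adj_method.
p_holm <- p.adjust(p_raw, method = "holm")

# Bonferroni for comparison: each p-value is multiplied by the
# number of tests and capped at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(p_holm)
print(p_bonf)
```

Holm is never more conservative than Bonferroni while controlling the same family-wise error rate, which makes it a reasonable default when several subgroups are tested at once.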
The bcf_iv function returns the discovered sub-population, the conditional complier average causal effect (CCACE), the p-value for this effect, the p-value for a weak-instrument test, the adjusted p-value, the proportion of compliers, the conditional intention-to-treat effect (CITT) and the proportion of compliers in the node.
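To see how the returned quantities relate, recall that under the usual instrumental variable assumptions the complier effect equals the intention-to-treat effect divided by the proportion of compliers. The sketch below illustrates this ratio (the classical Wald estimator) on simulated data in base R. It is not the Bayesian procedure used by bcf_iv, and the data-generating process (one-sided noncompliance, 75% compliers, true effect of 2) is an assumption chosen for illustration.

```r
set.seed(42)
n <- 10000

# Simulated data (illustrative assumptions, not the package's generator).
z <- rbinom(n, 1, 0.5)           # random treatment assignment
complier <- rbinom(n, 1, 0.75)   # latent compliance type, 75% compliers
w <- z * complier                # treated only if assigned AND a complier
y <- 2 * w + rnorm(n)            # true effect of 2 for those treated

# Intention-to-treat effect on the outcome.
itt_y <- mean(y[z == 1]) - mean(y[z == 0])

# Effect of assignment on treatment receipt = proportion of compliers.
prop_compliers <- mean(w[z == 1]) - mean(w[z == 0])

# Wald ratio: complier average causal effect.
cace <- itt_y / prop_compliers

print(c(ITT = itt_y, compliers = prop_compliers, CACE = cace))
```

A small proportion of compliers makes the denominator unstable, which is why a weak-instrument test is reported alongside the effect estimates.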
The bcf_itt function discovers the heterogeneity in the intention-to-treat (ITT) effect and then estimates both the conditional ITT and the conditional CACE for the discovered subgroups. The function takes as inputs:
- y: the outcome variable;
- w: the treatment receipt variable (binary);
- z: the treatment assignment variable (binary);
- x: the covariate matrix;
- binary: TRUE if the outcome variable is binary, FALSE otherwise (default: FALSE);
- max_depth: the maximal depth of the generated CART (default: 2);
- n_burn: the number of iterations discarded by the BCF-IV algorithm for the burn-in (default: 500);
- n_sim: the number of iterations used by the BCF-IV algorithm to get the posterior distribution of the estimands (default: 500);
- inference_ratio: the ratio of observations to be assigned to the inference subsample (default: 0.5);
- seed: random seed for reproducible results (default: 42).
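The inference_ratio and seed arguments control how the data are divided: one part is used to discover subgroups and the other to estimate their effects. A minimal sketch of such a split in base R follows. The variable names and the simple random split are assumptions for illustration, and the package's internal splitting may differ.

```r
set.seed(42)             # fixed seed so the split is reproducible
n <- 1000                # number of observations (illustrative)
inference_ratio <- 0.5   # share of rows reserved for inference

# Randomly select the rows that go to the inference subsample.
inference_idx <- sample(seq_len(n), size = floor(inference_ratio * n))

# All remaining rows form the discovery subsample.
discovery_idx <- setdiff(seq_len(n), inference_idx)

print(c(discovery = length(discovery_idx), inference = length(inference_idx)))
```

Keeping the two subsamples disjoint means the subgroups are not tested on the same observations that were used to find them.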
The bcf_itt function returns the discovered sub-population, the conditional complier average causal effect (CCACE), the conditional intention-to-treat effect (CITT), the p-value for this effect, the p-value for a weak-instrument test, the adjusted p-value, the proportion of compliers and the proportion of compliers in the node.
# Generate the dataset
dataset <- generate_dataset(n = 1000,
                            p = 10,
                            rho = 0,
                            null = 0,
                            effect_size = 2,
                            compliance = 0.75)
y <- dataset[["y"]]
w <- dataset[["w"]]
z <- dataset[["z"]]
X <- dataset[["X"]]
# BCF-IV
bcf_iv(y, w, z, X,
       n_burn = 2000,
       n_sim = 2000,
       inference_ratio = 0.5,
       binary = FALSE,
       max_depth = 2,
       adj_method = "holm")
# BCF-ITT
bcf_itt(y, w, z, X,
        n_burn = 2000,
        n_sim = 2000,
        inference_ratio = 0.5,
        binary = FALSE,
        max_depth = 2)
For more exhaustive synthetic examples, check the examples/ folder.
- Bargagli-Stoffi, F.J., De Witte, K. and Gnecco, G., 2022. Heterogeneous causal effects with imperfect compliance: a Bayesian machine learning approach. The Annals of Applied Statistics, 16(3), pp.1986-2009. [paper] [preprint]
@article{bargagli2022heterogeneous,
title={{Heterogeneous causal effects with imperfect compliance: a Bayesian machine learning approach}},
author={Bargagli-Stoffi, Falco J and De Witte, Kristof and Gnecco, Giorgio},
journal={The Annals of Applied Statistics},
volume={16},
number={3},
pages={1986--2009},
year={2022},
publisher={Institute of Mathematical Statistics}
}