Releases: py-why/dowhy
Bug fixes update
- Added an optimized version for identify_effect
- Fixed a bug in the computation of direct and indirect effects
- More test coverage: notebooks are now also run under automated tests
- Updated conditional-effects-notebook to support the latest EconML version
- EconML metalearners now have the expected behavior: accept both common_causes and effect_modifiers
- Fixed some bugs in refuter tests
Enhanced documentation and support for causal mediation
Installation
- DoWhy can now be installed with Conda!
Code
- Support for identification by mediation formula
- Support for the front-door criterion
- Linear estimation methods for mediation
- Generalized backdoor criterion implementation using paths and d-separation
- Added GLM estimators, including logistic regression
- New API for interpreting causal models, estimates and refuters. The first interpreter, by @ErikHambardzumyan, visualizes how the distribution of a confounder changes.
- Friendlier error messages for the propensity score stratification estimator when there is not enough data in a bin.
- Enhancements to the dummy outcome refuter with machine-learned components: it can now simulate non-zero effects too. Ready for alpha testing.
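For reference, the front-door criterion and the mediation formula mentioned above are usually written as follows. These are the standard textbook forms, not taken from the DoWhy source; the symbols T (treatment), M (mediator) and Y (outcome) are chosen here for illustration, and the mediation expressions assume a binary treatment with no unobserved confounding of the mediator relationships.

```latex
% Front-door adjustment: the mediator M intercepts every directed path from T to Y
P(y \mid do(t)) = \sum_{m} P(m \mid t) \sum_{t'} P(y \mid m, t')\, P(t')

% Mediation formula: natural direct effect (NDE) and natural indirect effect (NIE)
\mathrm{NDE} = \sum_{m} \bigl( E[Y \mid T=1, M=m] - E[Y \mid T=0, M=m] \bigr)\, P(m \mid T=0)
\mathrm{NIE} = \sum_{m} E[Y \mid T=0, M=m]\, \bigl( P(m \mid T=1) - P(m \mid T=0) \bigr)
```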
Docs
- New case studies using DoWhy on hotel booking cancellations and membership rewards programs.
- New notebook on using DoWhy+EconML for estimating the effect of multiple treatments
- A tutorial on causal inference using DoWhy and EconML
- Better organization of docs and notebooks on the documentation website (https://microsoft.github.io/dowhy/)
Community
- Created a contributors page with guidelines for contributing
- Added allcontributors bot so that new contributors can be added just after their pull requests are merged
A big thanks to @Tanmay-Kulkarni101, @ErikHambardzumyan, @Sid-darthvader for their contributions.
Powerful refutations and better support for heterogeneous treatment effects
- DummyOutcomeRefuter now includes machine learning functions to increase the power of the refutation.
  - In addition to generating a random dummy outcome, you can now generate a dummyOutcome that is an arbitrary function of confounders but always independent of treatment, and then test whether the estimated treatment effect is zero. This is inspired by ideas from the T-learner.
  - We also provide default machine learning-based methods to estimate such a dummyOutcome based on confounders. Of course, you can specify any custom ML method.
- Added a new BootstrapRefuter that simulates the issue of measurement error with confounders. Rather than a simple bootstrap, you can generate bootstrap samples with noise on the values of the confounders and check how sensitive the estimate is.
  - The refuter supports custom selection of the confounders to add noise to.
- All refuters now provide confidence intervals and a significance value.
- Better support for heterogeneous effect libraries like EconML and CausalML
  - All CausalML methods can be called directly from DoWhy, in addition to all methods from EconML.
  - [Change to naming scheme for estimators] To achieve a consistent naming scheme for estimators, we suggest prepending the string "dowhy" to the names of internal DoWhy estimators. For example, "backdoor.dowhy.propensity_score_matching". This is not a breaking change, so you can keep using the old naming scheme too.
  - EconML-specific: Since EconML assumes that effect modifiers are a subset of confounders, a warning is issued if a user specifies effect modifiers outside of confounders and tries to use EconML methods.
- Confidence intervals and standard errors: Added bootstrap-based confidence intervals and standard errors for all methods. For the linear regression estimator, the corresponding parametric forms are also implemented.
- Convenience functions for getting confidence intervals, standard errors and conditional treatment effects (CATE), that can be called after fitting the estimator if needed.
- Better coverage for tests. Tests now also use a fixed random seed, which makes them more dependable.
Thanks to @Tanmay-Kulkarni101 and @Arshiaarya for their contributions!
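The ideas behind the dummy outcome refutation and the noisy bootstrap described in this release can be sketched in plain Python. This is an illustration under stated assumptions, not DoWhy's implementation: the simulated data, the single confounder, the noise level and all function names (`slope`, `residuals`, `adjusted_effect`) are hypothetical, and the estimator is a simple linear adjustment rather than any DoWhy estimator.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


def residuals(x, y):
    """What is left of y after removing its linear fit on x."""
    b, mx, my = slope(x, y), statistics.fmean(x), statistics.fmean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_effect(t, w, y):
    """Coefficient of t in a linear regression of y on t and w,
    computed by partialling w out of both t and y."""
    return slope(residuals(w, t), residuals(w, y))


# Simulated data (assumption): one confounder w, true effect of t on y is 2.0.
rng = random.Random(0)
n = 1000
w = [rng.gauss(0, 1) for _ in range(n)]
t = [0.8 * wi + rng.gauss(0, 1) for wi in w]
y = [2.0 * ti + 1.5 * wi + rng.gauss(0, 1) for ti, wi in zip(t, w)]
estimate = adjusted_effect(t, w, y)

# Dummy outcome: a function of the confounder only, never of the treatment.
# A sound estimator should report an effect close to zero for it.
dummy_y = [abs(wi) + rng.gauss(0, 1) for wi in w]
dummy_effect = adjusted_effect(t, w, dummy_y)

# Bootstrap with measurement noise added to the confounder values.
boot = []
for _ in range(100):
    idx = [rng.randrange(n) for _ in range(n)]
    noisy_w = [w[i] + rng.gauss(0, 0.1) for i in idx]
    boot.append(adjusted_effect([t[i] for i in idx], noisy_w, [y[i] for i in idx]))

boot.sort()
ci_low, ci_high = boot[2], boot[97]      # rough 95% percentile interval
std_error = statistics.stdev(boot)       # bootstrap standard error
print(f"estimate={estimate:.3f} dummy={dummy_effect:.3f} "
      f"CI=({ci_low:.3f}, {ci_high:.3f}) SE={std_error:.3f}")
```

If the bootstrap estimates drift far from the original estimate as the noise grows, the estimate is sensitive to measurement error in that confounder.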
CATE estimation and integration with EconML
This release includes many major updates:
- (BREAKING CHANGE) The CausalModel import is now simpler: "from dowhy import CausalModel"
- Multivariate treatments are now supported.
- Conditional Average Treatment Effects (CATE) can be estimated for any subset of the data. This includes integration with EconML: any method from EconML can be called using DoWhy through the estimate_effect method (see example notebook).
- Besides CATE, specific target estimands like ATT and ATC are also supported for many of the estimation methods.
- For reproducibility, you can specify a random seed for all refutation methods.
- Multiple bug fixes and updates to the documentation.
Includes contributions from @j-chou, @ktmud, @jrfiedler, @shounak112358, @Lnk2past. Thank you all!
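To make the notion of a conditional average treatment effect on a subset of the data concrete, here is a minimal plain-Python sketch. It does not use DoWhy or EconML: the simulated randomized treatment, the binary effect modifier `x` and the helper `diff_in_means` are assumptions made purely for illustration.

```python
import random
import statistics

rng = random.Random(1)

# Simulated data (assumption): the treatment effect is 1.0 when x == 0
# and 3.0 when x == 1, so the average effect over everyone is about 2.0.
rows = []
for _ in range(4000):
    x = rng.randint(0, 1)                      # effect modifier
    t = rng.randint(0, 1)                      # randomized binary treatment
    y = (1.0 + 2.0 * x) * t + rng.gauss(0, 1)  # outcome
    rows.append((x, t, y))


def diff_in_means(subset):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for _, t, y in subset if t == 1]
    control = [y for _, t, y in subset if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


ate = diff_in_means(rows)                      # effect over the whole sample
cate = {x: diff_in_means([r for r in rows if r[0] == x]) for x in (0, 1)}
print(f"ATE={ate:.2f} CATE(x=0)={cate[0]:.2f} CATE(x=1)={cate[1]:.2f}")
```

Because the treatment is randomized here, a difference in means is enough; with observational data the subset estimate would also need to adjust for confounders.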
First release
This release implements the four steps of causal inference: model, identify, estimate and refute. It also includes a pandas.DataFrame extension for causal inference and the do-sampler.
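The four steps can be illustrated with a small, self-contained sketch. It deliberately avoids the DoWhy API and is only a conceptual outline: the graph encoding, the simulated data and the functions `identify_backdoor`, `estimate_effect` and `placebo_refute` are hypothetical names, identification is reduced to "adjust for the common parents of treatment and outcome", and estimation is plain stratification.

```python
import random
import statistics

rng = random.Random(2)

# Step 1, model: a causal graph written as node -> list of parents.
graph = {"w": [], "t": ["w"], "y": ["t", "w"]}

# Simulated data consistent with the graph (true effect of t on y is 2.0).
data = []
for _ in range(6000):
    w = rng.randint(0, 1)
    t = 1 if rng.random() < (0.7 if w else 0.3) else 0
    data.append({"w": w, "t": t, "y": 2.0 * t + 3.0 * w + rng.gauss(0, 1)})


def identify_backdoor(graph, treatment, outcome):
    """Step 2, identify: simplified rule, adjust for common parents."""
    return sorted(set(graph[treatment]) & set(graph[outcome]))


def estimate_effect(data, treatment, outcome, adjust):
    """Step 3, estimate: difference in means within each stratum of the
    adjustment variables, averaged with stratum-size weights."""
    strata = {}
    for row in data:
        strata.setdefault(tuple(row[v] for v in adjust), []).append(row)
    total = 0.0
    for rows in strata.values():
        treated = [r[outcome] for r in rows if r[treatment] == 1]
        control = [r[outcome] for r in rows if r[treatment] == 0]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        total += diff * len(rows) / len(data)
    return total


def placebo_refute(data, treatment, outcome, adjust):
    """Step 4, refute: shuffle the treatment; the effect should vanish."""
    shuffled = [r[treatment] for r in data]
    rng.shuffle(shuffled)
    placebo = [{**r, treatment: p} for r, p in zip(data, shuffled)]
    return estimate_effect(placebo, treatment, outcome, adjust)


adjustment_set = identify_backdoor(graph, "t", "y")
effect = estimate_effect(data, "t", "y", adjustment_set)
naive = estimate_effect(data, "t", "y", [])    # no adjustment, confounded
placebo_effect = placebo_refute(data, "t", "y", adjustment_set)
print(f"adjust for {adjustment_set}: effect={effect:.2f}, "
      f"naive={naive:.2f}, placebo={placebo_effect:.2f}")
```

The gap between the naive and the adjusted estimate shows why the identification step matters, and the near-zero placebo effect is the kind of check the refute step is meant to provide.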