DoubleML - Contributing Guidelines

DoubleML is a community effort. Everyone is welcome to contribute. All contributors should adhere to these contributing guidelines and our code of conduct. The contributing guidelines are particularly helpful for getting started with your first contribution.

Submit a Bug Report 🐛

To submit a bug report, you can use our issue template for bug reports.

  • A good bug report contains a minimal reproducible code snippet, for example
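import numpy as np
import doubleml as dml
from doubleml.datasets import make_plr_CCDDHNR2018
from sklearn.ensemble import RandomForestRegressor
from sklearn.base import clone
# Fix the seed so that the data and the result are reproducible
np.random.seed(3141)
learner = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
# Use a separate, unfitted copy of the learner for each nuisance function
ml_g = clone(learner)
ml_m = clone(learner)
obj_dml_data = make_plr_CCDDHNR2018(alpha=0.5, n_obs=500, dim_x=20)
dml_plr_obj = dml.DoubleMLPLR(obj_dml_data, ml_g, ml_m)
# Print the summary so that it also shows up when the snippet is run as a script
print(dml_plr_obj.fit().summary)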
  • State the result you would have expected and the result you actually got. In case of an exception the full traceback is appreciated.

  • State the versions of your software by running the following lines and pasting the output.

import platform; print(platform.platform())
import sys; print("Python", sys.version)
import doubleml; print("DoubleML", doubleml.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
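
If one of the packages cannot be imported, the one-liners above stop at the first failing import. As a minimal sketch (the helper name `collect_versions` is hypothetical and not part of DoubleML), the same information can be gathered in a way that also reports missing packages, assuming each package exposes a `__version__` attribute:

```python
import importlib
import platform
import sys


def collect_versions(packages=("doubleml", "sklearn")):
    """Return a multi-line version report that can be pasted into an issue.

    Hypothetical helper for illustration; it is not part of DoubleML.
    """
    lines = [platform.platform(), f"Python {sys.version}"]
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Report the missing package instead of aborting the whole report.
            lines.append(f"{name} not installed")
            continue
        # Assumption: the package exposes __version__; fall back to "unknown".
        lines.append(f"{name} {getattr(module, '__version__', 'unknown')}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(collect_versions())
```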

Submit a Feature Request 💡

We welcome feature requests and suggestions towards improving and/or extending the DoubleML package. For feature requests you can use the corresponding issue template.

Submit a Question or Start a Discussion

We use GitHub Discussions to give the community a platform for asking questions about the DoubleML package and for discussions on topics related to the package.

Contribute Code 💻

Everyone is welcome to contribute to the DoubleML code base. The following guidelines and hints help you to get started.

Development Workflow

The recommended way to contribute to DoubleML is described in detail below. The most important steps are to fork the repo, add your changes, and finally submit a pull request.

  1. Fork the DoubleML repo by clicking on the Fork button (this requires a GitHub account).

  2. Clone your fork to your local machine via

$ git clone git@github.com:YourGitHubAccount/doubleml-for-py.git
$ cd doubleml-for-py

  3. Create a feature branch via

$ git checkout -b my_feature_branch

  4. (Optionally) add the upstream remote via

$ git remote add upstream https://github.com/DoubleML/doubleml-for-py.git

This allows you to easily keep your repository in sync via

$ git fetch upstream
$ git merge upstream/main

  5. Install the development dependencies via

$ pip install -r requirements.txt
$ pip install -r requirements-dev.txt

  6. Install DoubleML in editable mode (more details can be found here) via

$ pip install --editable .

  7. Develop your code changes. The changes can be added and pushed via

$ git add your_new_file your_modified_file
$ git commit -m "A commit message which briefly summarizes the changes made"
$ git push origin my_feature_branch

  8. Generate a pull request from your fork. Please follow our guidelines for pull requests. When opening the PR you will be guided with a checklist.
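
After the editable install, `import doubleml` should resolve to your local clone rather than to a copy in `site-packages`. The following is a small sketch of one way to check this; the helper name `module_location` is hypothetical and only uses the standard library:

```python
import importlib.util
from pathlib import Path


def module_location(name):
    """Return the directory a top-level module would be imported from, or None.

    Hypothetical helper for illustration; it is not part of DoubleML.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        # The module is not installed (or has no file-based origin).
        return None
    return Path(spec.origin).resolve().parent


if __name__ == "__main__":
    # With an editable install this is expected to point into your clone.
    print("doubleml resolves to:", module_location("doubleml"))
```

If the printed path is not inside your `doubleml-for-py` directory, the editable install is probably not the one being picked up by the active Python environment.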

Checklist for Pull Requests (PR)

  • The title of the pull request summarizes the changes made.

  • The PR contains a detailed description of all changes and additions (you may want to comment on the diff in GitHub).

  • References to related issues or PRs are added.

  • The code passes all (unit) tests (see below for details). To check, please run

$ pytest .
  • If you add an enhancement or a new feature, unit tests (with a certain level of coverage) are mandatory for getting the PR merged.

  • Check whether your changes adhere to the PEP 8 style guide. For the check you can use the following command (the --diff option is only available in flake8 versions before 6.0)

$ git diff upstream/main -u -- "*.py" | flake8 --diff --max-line-length=127

If your PR is still a work in progress, please consider marking it as a draft PR (see also here).

Unit Tests and Test Coverage

We use pytest for unit testing, which we consider a fundamental part of the development workflow. The tests are located in the tests subfolder. Test coverage is determined with the pytest-cov package, and coverage reports for the package, PRs, branches etc. are available from codecov. New features must come with an appropriate level of unit test coverage. To run all unit tests (for further options see the pytest documentation), call

$ pytest --cov .

If pytest is called with the --cov flag, a unit test coverage report is generated.
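
For contributors who have not written pytest tests before, here is a minimal sketch of what a test module could look like. The function `draw_sample_splits` is a hypothetical stand-in for a new feature and is not part of DoubleML; pytest collects every function whose name starts with `test_` and reports a failure whenever an `assert` does not hold.

```python
import random


def draw_sample_splits(n_obs, n_folds, seed=0):
    """Hypothetical feature under test: partition range(n_obs) into folds."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Fold k takes every n_folds-th shuffled index, starting at position k.
    return [sorted(indices[k::n_folds]) for k in range(n_folds)]


def test_folds_partition_all_observations():
    folds = draw_sample_splits(n_obs=10, n_folds=3)
    flat = sorted(i for fold in folds for i in fold)
    # Every observation appears in exactly one fold.
    assert flat == list(range(10))


def test_fold_sizes_differ_by_at_most_one():
    folds = draw_sample_splits(n_obs=10, n_folds=3)
    sizes = [len(fold) for fold in folds]
    assert max(sizes) - min(sizes) <= 1
```

In a real contribution the feature would live in the package and only the `test_` functions would go into a file in the tests subfolder.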

Contribute a New Model Class

The DoubleML package is designed to be flexible and easily extendable with new model classes. Contributions in this direction are very welcome, and we are happy to help authors integrate their models into the DoubleML OOP structure. If you need assistance, just open an issue or contact one of the maintainers, @MalteKurz or @PhilippBach.

The abstract base class DoubleML implements all core functionality based on a linear Neyman orthogonal score function. To contribute a new model class, you only need to specify the nuisance functions that have to be estimated for the new model class (e.g. regressions or classifications) and to implement the score components of the Neyman orthogonal score function. All other functionality is automatically available via inheritance from the abstract base class. A template for new model classes is available here.
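
The division of labour described above can be illustrated with a deliberately simplified sketch. The class and method names below (`LinearScoreModel`, `_score_elements`, `ToyPartiallingOut`) are hypothetical and do not reflect the actual DoubleML API or the template; cross-fitting and machine learning learners are left out, and the only "nuisance functions" are sample means. The sketch only shows the pattern: the base class solves a score that is linear in the parameter, ψ = ψ_a · θ + ψ_b, and a new model class supplies ψ_a and ψ_b.

```python
from abc import ABC, abstractmethod
from statistics import fmean


class LinearScoreModel(ABC):
    """Hypothetical base class: solves a score that is linear in theta."""

    def __init__(self, y, d):
        self.y = list(y)
        self.d = list(d)
        self.coef = None

    @abstractmethod
    def _score_elements(self):
        """Return (psi_a, psi_b), each with one value per observation."""

    def fit(self):
        psi_a, psi_b = self._score_elements()
        # The empirical mean of psi_a * theta + psi_b is zero at
        # theta = -mean(psi_b) / mean(psi_a).
        self.coef = -fmean(psi_b) / fmean(psi_a)
        return self


class ToyPartiallingOut(LinearScoreModel):
    """Toy model class: residualize y and d on their sample means only."""

    def _score_elements(self):
        y_mean, d_mean = fmean(self.y), fmean(self.d)
        y_res = [yi - y_mean for yi in self.y]
        d_res = [di - d_mean for di in self.d]
        psi_a = [-v * v for v in d_res]
        psi_b = [v * u for v, u in zip(d_res, y_res)]
        return psi_a, psi_b


# Noise-free toy data with y = 1 + 2 * d, so the fitted coefficient is 2.
d = [0.0, 1.0, 2.0, 3.0]
y = [1.0 + 2.0 * di for di in d]
model = ToyPartiallingOut(y, d).fit()
print(model.coef)
```

Only `_score_elements` differs between model classes in this sketch; `fit` is inherited unchanged, which mirrors the idea that a new model class mainly has to provide its nuisance estimates and score components.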

Contribute Documentation 📚

The documentation of DoubleML is generated with sphinx and hosted at https://docs.doubleml.org. The Python API documentation is generated from docstrings in the source code. The source code for the website, user guide, example gallery, etc. is available in a separate repository https://github.com/DoubleML/doubleml-docs.

Contribute to the API Documentation

The API documentation is generated from docstrings in the source code. It can be generated locally (dev requirements sphinx and pydata-sphinx-theme need to be installed) via

$ cd doc/
$ make html

Contribute to the User Guide and Documentation

The documentation of DoubleML is hosted at https://docs.doubleml.org. The source code for the website, user guide, example gallery, etc. is available in a separate repository doubleml-docs. Changes, issues and PRs for the documentation (except the API documentation) should be discussed in the doubleml-docs repo. We welcome contributions to the user guide, especially case studies for the example gallery. A step-by-step guide for contributions to the example gallery is available here.