Creating Causal Identification module #1166
base: main
What is |
Old example, I did the correction! |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1166      +/-   ##
==========================================
- Coverage   95.34%   95.27%   -0.07%
==========================================
  Files          47       48       +1
  Lines        4963     5018      +55
==========================================
+ Hits         4732     4781      +49
- Misses        231      237       +6
Something blocking this merge? @juanitorduz @wd60622 |
We will need some mmm tests as well
The current error is not a code issue; it occurs while writing the output: [ 75%] api/generated/pymc_marketing.mmm.mmm.BaseMMM.get_errors
Command killed due to timeout or excessive memory consumption
I will take another look next week 🙏 . In the meantime I agree we need to add more tests :) |
Hey guys, I added a few tests for the causal model initialization in the MMM class. Everything should be ready to merge; the main behavior is on the |
I will take a detailed review this week now that we finally pushed the awaiting customer choice model 🙏 |
Hey @carlosagostini ! Sorry for the late review #shameonme
I left some comments. One important observation is the missed opportunity for variance reduction with non-minimal sets (please see the comment below). There are also a few small comments and suggested changes regarding minor code modularization.
treatment : list[str]
    A list of treatment variable names.
outcome : str
    The outcome variable name.
Can you please add https://github.com/py-why/dowhy as a reference in the class description?
)

def get_unique_adjustment_nodes(self) -> list[str]:
    """Compute the minimal adjustment set required for backdoor adjustment across all treatments.
Can you expand on the meaning of the minimal set (think about newcomers)? Can you also add references? I suggest Causal Inference in Statistics: A Primer, by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell (2016).
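For newcomers, a small sketch of what "minimal adjustment set" means under the backdoor criterion may help. This is illustrative pure Python, not the code in this PR and not DoWhy's API: the toy DAG, the node names, and every helper function below are assumptions made up for the example. A set Z satisfies the backdoor criterion for a treatment and an outcome if it contains no descendant of the treatment and blocks every path between them that starts with an arrow into the treatment; it is minimal if no proper subset is also valid.

```python
from itertools import combinations

# Hypothetical toy DAG as (parent, child) edges; node names are made up.
EDGES = [
    ("seasonality", "tv"),
    ("seasonality", "sales"),
    ("tv", "sales"),
    ("price", "sales"),  # predicts the outcome but does not confound tv
]


def reachable(edges, start, forward):
    """Nodes reachable from `start`: descendants if forward, ancestors otherwise."""
    found, frontier = set(), list(start)
    while frontier:
        node = frontier.pop()
        for parent, child in edges:
            src, dst = (parent, child) if forward else (child, parent)
            if src == node and dst not in found:
                found.add(dst)
                frontier.append(dst)
    return found


def d_separated(edges, x, y, z):
    """Check whether z d-separates x and y using the moralized ancestral graph."""
    keep = {x, y, *z} | reachable(edges, {x, y, *z}, forward=False)
    sub = [(p, c) for p, c in edges if p in keep and c in keep]
    links = {frozenset(e) for e in sub}
    for node in keep:  # "marry" every pair of parents that share a child
        parents = [p for p, c in sub if c == node]
        links |= {frozenset(pair) for pair in combinations(parents, 2)}
    seen, frontier = {x}, [x]
    while frontier:  # undirected search that never enters a node in z
        node = frontier.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in z and other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return y not in seen


def satisfies_backdoor(edges, treatment, outcome, z):
    """z holds no descendant of the treatment and blocks all backdoor paths."""
    if set(z) & reachable(edges, [treatment], forward=True):
        return False
    # Removing the treatment's outgoing edges leaves only backdoor paths.
    no_outgoing = [(p, c) for p, c in edges if p != treatment]
    return d_separated(no_outgoing, treatment, outcome, z)


def minimal_adjustment_sets(edges, treatment, outcome):
    """Brute force over subsets, smallest first (only sensible for tiny graphs)."""
    candidates = sorted({n for e in edges for n in e} - {treatment, outcome})
    minimal = []
    for size in range(len(candidates) + 1):
        for z in combinations(candidates, size):
            if any(set(m) <= set(z) for m in minimal):
                continue  # a smaller valid set is contained in z
            if satisfies_backdoor(edges, treatment, outcome, z):
                minimal.append(z)
    return minimal


minimal = minimal_adjustment_sets(EDGES, "tv", "sales")
also_valid = satisfies_backdoor(EDGES, "tv", "sales", ("price", "seasonality"))
print(minimal, also_valid)
```

In this toy graph the only minimal set is seasonality alone, while the larger set that also contains price is still a valid adjustment set. That distinction is what the docstring could spell out.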
Provides methods to analyze causal relationships and determine the minimal adjustment set
for backdoor adjustment between treatment and outcome variables.
Sometimes, external regressors are not in the minimal set but help decrease variance; see https://matheusfacure.github.io/python-causality-handbook/07-Beyond-Confounders.html#good-controls
Concretely:
Anytime we have a control that is a good predictor of the outcome, even if it is not a confounder, adding it to our model is a good idea.
So I am hesitant to remove, for example seasonality, if it is not in the minimal set. WDYT?
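To make the variance argument concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the data-generating process, the coefficients, and the helper names are made up, and it uses plain OLS rather than the MMM. The control is independent of the treatment, so it is not a confounder, yet it explains most of the outcome's variation.

```python
import random
import statistics


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualize(target, control):
    """Residuals of `target` after regressing it on `control` (with intercept)."""
    b = slope(control, target)
    mt, mc = statistics.fmean(target), statistics.fmean(control)
    return [t - mt - b * (c - mc) for t, c in zip(target, control)]


def simulate(n=100, reps=300, seed=0):
    """Estimate the treatment effect many times, with and without the control."""
    rng = random.Random(seed)
    without_control, with_control = [], []
    for _ in range(reps):
        treatment = [rng.gauss(0, 1) for _ in range(n)]
        # Independent of the treatment, so not a confounder.
        control = [rng.gauss(0, 1) for _ in range(n)]
        outcome = [
            1.0 * t + 3.0 * c + rng.gauss(0, 1)
            for t, c in zip(treatment, control)
        ]
        without_control.append(slope(treatment, outcome))
        # Frisch-Waugh-Lovell: partial the control out of both variables,
        # then regress residual on residual to get the adjusted coefficient.
        with_control.append(
            slope(residualize(treatment, control), residualize(outcome, control))
        )
    return without_control, with_control


without_control, with_control = simulate()
print("std of estimate without control:", statistics.stdev(without_control))
print("std of estimate with control:   ", statistics.stdev(with_control))
```

Both estimators target the same true effect of 1.0, but the one that includes the control should show a visibly smaller spread across repetitions.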
if unique_controls:
    warnings.warn(
        f"Columns {unique_controls} are not in the adjustment set. Controls are being modified.",
        stacklevel=2,
    )

control_columns = list(common_controls - set(channel_columns))

self.minimal_adjustment_set = control_columns + list(channel_columns)
I am hesitant about this step because of my comment on variance reduction above. Maybe we can have an additional parameter, something like a minimal or maximal set. WDYT?
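One possible shape for such a parameter, as a sketch only: the function name select_controls, the mode argument, and its two values are hypothetical and do not exist in the PR. It only decides whether user-supplied controls outside the adjustment set are dropped or kept.

```python
import warnings


def select_controls(control_columns, adjustment_set, mode="minimal"):
    """Hypothetical helper: choose which user-supplied controls to keep.

    mode="minimal": keep only controls that belong to the adjustment set.
    mode="maximal": keep every user-supplied control and warn about extras.
    """
    if mode not in ("minimal", "maximal"):
        raise ValueError(f"mode must be 'minimal' or 'maximal', got {mode!r}")
    extras = [c for c in control_columns if c not in adjustment_set]
    if not extras:
        return list(control_columns)
    if mode == "minimal":
        warnings.warn(
            f"Columns {extras} are not in the adjustment set and are dropped.",
            stacklevel=2,
        )
        return [c for c in control_columns if c in adjustment_set]
    warnings.warn(
        f"Columns {extras} are not in the adjustment set but are kept.",
        stacklevel=2,
    )
    return list(control_columns)
```

Note that this sketch does not check whether an extra control is a bad control, for example a descendant of a treatment. A real "maximal" option would still need to exclude those.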
# Initialize causal graph if provided
if self.dag is not None and self.outcome_node is not None:
    if self.treatment_nodes is None:
        self.treatment_nodes = self.channel_columns
        warnings.warn(
            "No treatment nodes provided, using channel columns as treatment nodes.",
            stacklevel=2,
        )
    self.causal_graphical_model = CausalGraphModel.build_graphical_model(
        graph=self.dag,
        treatment=self.treatment_nodes,
        outcome=self.outcome_node,
    )

    self.control_columns = self.causal_graphical_model.compute_adjustment_sets(
        control_columns=self.control_columns,
        channel_columns=self.channel_columns,
    )

    if "yearly_seasonality" not in self.causal_graphical_model.adjustment_set:
        warnings.warn(
            "Yearly seasonality excluded as it's not required for adjustment.",
            stacklevel=2,
        )
This should be split into at least one or two functions and just called at initialization (+ unit test for each function)
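A rough sketch of what that split could look like, using standalone functions so each piece can be unit tested without building a model. The function names and signatures are hypothetical; the warning messages are copied from the hunk above, and returning None for the seasonality follows the behavior stated in the PR description.

```python
import warnings


def resolve_treatment_nodes(treatment_nodes, channel_columns):
    """Fall back to the channel columns when no treatment nodes are given."""
    if treatment_nodes is None:
        warnings.warn(
            "No treatment nodes provided, using channel columns as treatment nodes.",
            stacklevel=2,
        )
        return list(channel_columns)
    return list(treatment_nodes)


def resolve_yearly_seasonality(yearly_seasonality, adjustment_set):
    """Disable yearly seasonality when it is not part of the adjustment set."""
    # The `is not None` guard is an addition of this sketch: it avoids
    # warning when seasonality was never requested in the first place.
    if yearly_seasonality is not None and "yearly_seasonality" not in adjustment_set:
        warnings.warn(
            "Yearly seasonality excluded as it's not required for adjustment.",
            stacklevel=2,
        )
        return None
    return yearly_seasonality


# Minimal checks in the spirit of "one unit test per function".
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    nodes = resolve_treatment_nodes(None, ["tv", "radio"])
    seasonality = resolve_yearly_seasonality(2, {"holiday"})
print(nodes, seasonality, len(caught))
```

The initializer would then only call these helpers in sequence, which keeps the warning logic out of the constructor body.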
Add a subtitle like: business problem
Shall we remove the first data points, which are affected by the fact that there is little history to adstock at the start of the series?
+1
juanitorduz commented on 2025-01-02T19:39:43Z: Observe that the "over control" can reduce variance on the estimation, see https://matheusfacure.github.io/python-causality-handbook/07-Beyond-Confounders.html#good-controls
Anytime we have a control that is a good predictor of the outcome, even if it is not a confounder, adding it to our model is a good idea. It helps lowering the variance of our treatment effect estimates.
juanitorduz commented on 2025-01-02T19:39:44Z: Line #11. sns.lineplot(x="date_week", y="competitor_offers", data=df, color="C3", ax=ax); Add title
juanitorduz commented on 2025-01-02T19:39:45Z: Latex error?
I think the previous comments on the notebooks have not been addressed yet ;) |
Description
Short description: Integration of CausalGraphModel in BaseMMM Class
This update integrates a CausalGraphModel into the BaseMMM class, allowing for automated causal identification based on backdoor criteria, assuming a given Directed Acyclic Graph (DAG).
Summary of Changes
Added Causal Graph Option:
- BaseMMM class now accepts an optional dag parameter, which can be provided either as a string (DOT format) or a networkx.DiGraph.
- If dag is provided, a CausalGraphModel is instantiated to analyze causal relationships and determine necessary adjustment sets.
Automatic Minimal Adjustment Set Handling:
- BaseMMM initialization now includes logic to calculate the minimal adjustment set required to estimate the causal effect of the treatment variables (assumed to be media channels) on the outcome.
- control_columns are automatically updated to include variables from the minimal adjustment set only.
- If yearly_seasonality is not in the minimal adjustment set, the yearly_seasonality parameter is set to None, effectively disabling it in the model.
Warnings for Missing Adjustment Sets:
Code Example
Here's how to initialize BaseMMM with a DAG for causal inference:
Related Issue
Checklist
Modules affected
Type of change
📚 Documentation preview 📚: https://pymc-marketing--1166.org.readthedocs.build/en/1166/