Creating Causal Identification module #1166

cetagostini · 2024-11-04T23:34:53Z

Description

Short description: Integration of CausalGraphModel in BaseMMM Class

This update integrates a CausalGraphModel into the BaseMMM class, allowing for automated causal identification based on backdoor criteria, assuming a given Directed Acyclic Graph (DAG).

Summary of Changes

Added Causal Graph Option:
- The BaseMMM class now accepts an optional dag parameter, which can be provided either as a string (DOT format) or a networkx.DiGraph.
- If dag is provided, a CausalGraphModel is instantiated to analyze causal relationships and determine necessary adjustment sets.
Automatic Minimal Adjustment Set Handling:
- The BaseMMM initialization now includes logic to calculate the minimal adjustment set required to estimate the causal effect of the treatment variables (assumed to be media channels) on the outcome.
- control_columns are automatically updated to include variables from the minimal adjustment set only.
- If the variable yearly_seasonality is not in the minimal adjustment set, the yearly_seasonality parameter is set to None, effectively disabling it in the model.
Warnings for Missing Adjustment Sets:
- If a minimal adjustment set cannot be identified, a warning is issued, and no modifications are made during the initialization.
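
The control-column handling described above can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: the function name `update_control_columns` and its arguments are hypothetical, and the adjustment set is passed in directly rather than derived from a DAG.

```python
import warnings


def update_control_columns(control_columns, adjustment_set):
    """Keep only the controls that belong to the minimal adjustment set.

    Hypothetical helper for illustration; `adjustment_set` is None when
    no valid adjustment set could be identified.
    """
    if adjustment_set is None:
        # Mirror the described behavior: warn and change nothing.
        warnings.warn(
            "No minimal adjustment set found; controls left unchanged.",
            stacklevel=2,
        )
        return list(control_columns)

    # Preserve the user's ordering while dropping unneeded controls.
    return [col for col in control_columns if col in adjustment_set]


kept = update_control_columns(["event_1", "event_2"], {"event_1"})
print(kept)
```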

Code Example

Here's how to initialize BaseMMM with a DAG for causal inference:

from pymc_marketing.mmm import MMM, GeometricAdstock, LogisticSaturation

dag_str = """
digraph {
    x1 -> y;
    x2 -> y;
    yearly_seasonality -> y;
    event_1 -> y;
    event_2 -> y;
}
"""

mmm = MMM(
    model_config=my_model_config,
    sampler_config=my_sampler_config,
    date_column="date_week",
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    channel_columns=["x1", "x2"],
    control_columns=["event_1", "event_2"],
    yearly_seasonality=2,  # Disabled if 'yearly_seasonality' is not in minimal adjustment set
    dag=dag_str,
    outcome_column="y",
)
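
For the DAG above, neither `x1` nor `x2` has an incoming edge, so there are no backdoor paths into the treatments and the empty set already satisfies the backdoor criterion. The sketch below checks this with a small regex-based edge parser. `parse_edges` and `parents_of` are hypothetical helpers written for this illustration (they handle only simple `a -> b;` statements) and are not part of pymc-marketing.

```python
import re

dag_str = """
digraph {
    x1 -> y;
    x2 -> y;
    yearly_seasonality -> y;
    event_1 -> y;
    event_2 -> y;
}
"""


def parse_edges(dot):
    """Extract (source, target) pairs from simple `a -> b;` statements."""
    return re.findall(r"(\w+)\s*->\s*(\w+)", dot)


def parents_of(node, edges):
    """Return the set of nodes with a directed edge into `node`."""
    return {src for src, dst in edges if dst == node}


edges = parse_edges(dag_str)
treatments = ["x1", "x2"]

# A backdoor path must start with an arrow pointing into the treatment,
# so a treatment without parents has no backdoor paths at all.
treatment_parents = {t: parents_of(t, edges) for t in treatments}
print(treatment_parents)
```

If the behavior described in the summary applies as written, this graph would presumably lead to both event controls and the yearly seasonality term being dropped, since none of them is needed for identification.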

Related Issue

Closes #
Related to #

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Modules affected

MMM
CLV

Type of change

📚 Documentation preview 📚: https://pymc-marketing--1166.org.readthedocs.build/en/1166/

wd60622 · 2024-11-04T23:42:10Z

What is z in the 2nd body example? Would that be in the model?

pymc_marketing/mmm/causal.py

cetagostini · 2024-11-04T23:58:00Z

> What is z in the 2nd body example? Would that be in the model?

That was from an old example; I have corrected it!

review-notebook-app · 2024-11-13T18:45:03Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

codecov · 2024-11-13T19:09:06Z

Codecov Report

Attention: Patch coverage is 89.09091% with 6 lines in your changes missing coverage. Please review.

Project coverage is 95.27%. Comparing base (39028fe) to head (fb9993e).

Files with missing lines        Patch %    Lines
pymc_marketing/mmm/causal.py    84.61%     6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1166      +/-   ##
==========================================
- Coverage   95.34%   95.27%   -0.07%     
==========================================
  Files          47       48       +1     
  Lines        4963     5018      +55     
==========================================
+ Hits         4732     4781      +49     
- Misses        231      237       +6
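
The headline percentages can be reproduced from the hit and line counts in the table above. The sketch below is only a consistency check: treating the +55 line delta as the patch size and truncating project coverage to two decimals are assumptions inferred from the reported figures, not documented Codecov behavior.

```python
import math

hits_base, lines_base = 4732, 4963
hits_head, lines_head = 4781, 5018

new_lines = lines_head - lines_base  # 55 lines added (assumed to be the patch)
new_hits = hits_head - hits_base     # 49 of them are covered
patch_coverage = round(100 * new_hits / new_lines, 5)


def truncate2(pct):
    """Truncate a percentage to two decimals (assumed display rule)."""
    return math.floor(pct * 100) / 100


base_cov = truncate2(100 * hits_base / lines_base)
head_cov = truncate2(100 * hits_head / lines_head)
print(patch_coverage, base_cov, head_cov, round(head_cov - base_cov, 2))
```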


cetagostini · 2024-12-16T18:28:15Z

Something blocking this merge? @juanitorduz @wd60622

wd60622

We will need some mmm tests as well

pymc_marketing/model_builder.py


cetagostini · 2024-12-16T23:15:08Z

The current error is not a code issue; it comes from the docs/readthedocs.org:pymc-marketing check:

writing output... [ 75%] api/generated/pymc_marketing.mmm.mmm.BaseMMM.get_errors
Command killed due to timeout or excessive memory consumption

juanitorduz · 2024-12-19T07:38:23Z

> Something blocking this merge? @juanitorduz @wd60622

I will take another look next week 🙏 . In the meantime I agree we need to add more tests :)

cetagostini · 2024-12-24T17:32:23Z

@juanitorduz @wd60622

Hey guys, I added a few tests for the causal model initialization in the MMM class. Everything should be ready to merge: the main behavior is covered in test_causal.py and the initialization in test_mmm.py.


juanitorduz · 2025-01-02T09:29:38Z

I will do a detailed review this week, now that we have finally pushed the long-awaited customer choice model 🙏

juanitorduz

Hey @carlosagostini ! Sorry for the late review #shameonme

I left some comments. One important observation is the missed opportunity for variance reduction with non-minimal sets (please see the comment below). There are also some smaller comments and suggested changes regarding minor code modularization.

juanitorduz · 2025-01-02T19:19:44Z

pymc_marketing/mmm/causal.py

+    treatment : list[str]
+        A list of treatment variable names.
+    outcome : str
+        The outcome variable name.


Can you please add https://github.com/py-why/dowhy as a reference in the class description?

juanitorduz · 2025-01-02T19:22:36Z

pymc_marketing/mmm/causal.py

+        )
+
+    def get_unique_adjustment_nodes(self) -> list[str]:
+        """Compute the minimal adjustment set required for backdoor adjustment across all treatments.


Can you expand on the meaning of the minimal set (think about newcomers)? Can you also add references? I suggest:

Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal Inference in Statistics: A Primer.

juanitorduz · 2025-01-02T19:26:54Z

pymc_marketing/mmm/causal.py

+    Provides methods to analyze causal relationships and determine the minimal adjustment set
+    for backdoor adjustment between treatment and outcome variables.


Sometimes external regressors are not in the minimal set but still help decrease variance; see https://matheusfacure.github.io/python-causality-handbook/07-Beyond-Confounders.html#good-controls

Concretely:

> Anytime we have a control that is a good predictor of the outcome, even if it is not a confounder, adding it to our model is a good idea.

So I am hesitant to remove seasonality, for example, if it is not in the minimal set. WDYT?
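
The variance-reduction point can be illustrated with a small simulation. This is a sketch under assumed settings (a true treatment effect of 1.0, a non-confounding predictor `z` with coefficient 3.0, Gaussian noise), unrelated to the MMM code itself. It compares the spread of the estimated treatment effect with and without `z` in the regression, using the Frisch-Waugh-Lovell residualization to get the coefficient on `x` in `y ~ x + z`.

```python
import random
import statistics


def slope(x, y):
    """Ordinary least squares slope of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualize(v, z):
    """Residuals of v after regressing it on z (with intercept)."""
    b = slope(z, v)
    mz, mv = statistics.fmean(z), statistics.fmean(v)
    return [vi - mv - b * (zi - mz) for vi, zi in zip(v, z)]


rng = random.Random(0)
without_z, with_z = [], []
for _ in range(300):
    x = [rng.gauss(0, 1) for _ in range(200)]  # treatment
    z = [rng.gauss(0, 1) for _ in range(200)]  # predicts y, independent of x
    y = [1.0 * xi + 3.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

    # Effect estimate ignoring z.
    without_z.append(slope(x, y))
    # Effect estimate controlling for z (Frisch-Waugh-Lovell).
    with_z.append(slope(residualize(x, z), residualize(y, z)))

print("sd without z:", statistics.stdev(without_z))
print("sd with z:   ", statistics.stdev(with_z))
```

Both estimators are centered on the true effect because `z` is not a confounder here; controlling for it only shrinks the sampling spread.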

juanitorduz · 2025-01-02T19:30:13Z

pymc_marketing/mmm/causal.py

+        if unique_controls:
+            warnings.warn(
+                f"Columns {unique_controls} are not in the adjustment set. Controls are being modified.",
+                stacklevel=2,
+            )
+
+        control_columns = list(common_controls - set(channel_columns))
+
+        self.minimal_adjustment_set = control_columns + list(channel_columns)


I am hesitant about this step because of my comment on variance reduction above. Maybe we can add a parameter to choose between, say, a minimal and a maximal set. WDYT?
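
One way such a parameter could look is sketched below. Both the function `select_controls` and the `mode` values are hypothetical and only meant to make the suggestion concrete; nothing here reflects an agreed API.

```python
def select_controls(control_columns, adjustment_set, mode="minimal"):
    """Choose which user-supplied controls to keep.

    mode="minimal": keep only controls in the adjustment set.
    mode="all": keep every user-supplied control, so good predictors of
    the outcome can still reduce variance.
    Both the function and the `mode` values are hypothetical.
    """
    if mode == "minimal":
        return [c for c in control_columns if c in adjustment_set]
    if mode == "all":
        return list(control_columns)
    raise ValueError(f"Unknown mode: {mode!r}")


print(select_controls(["event_1", "event_2"], {"event_1"}, mode="minimal"))
print(select_controls(["event_1", "event_2"], {"event_1"}, mode="all"))
```

Keeping every control is only safe when none of them is a mediator or a collider, which this sketch does not check.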

juanitorduz · 2025-01-02T19:31:56Z

pymc_marketing/mmm/mmm.py

+        # Initialize causal graph if provided
+        if self.dag is not None and self.outcome_node is not None:
+            if self.treatment_nodes is None:
+                self.treatment_nodes = self.channel_columns
+                warnings.warn(
+                    "No treatment nodes provided, using channel columns as treatment nodes.",
+                    stacklevel=2,
+                )
+            self.causal_graphical_model = CausalGraphModel.build_graphical_model(
+                graph=self.dag,
+                treatment=self.treatment_nodes,
+                outcome=self.outcome_node,
+            )
+
+            self.control_columns = self.causal_graphical_model.compute_adjustment_sets(
+                control_columns=self.control_columns,
+                channel_columns=self.channel_columns,
+            )
+
+            if "yearly_seasonality" not in self.causal_graphical_model.adjustment_set:
+                warnings.warn(
+                    "Yearly seasonality excluded as it's not required for adjustment.",
+                    stacklevel=2,
+                )


This should be split into at least one or two functions that are simply called at initialization (plus a unit test for each function).
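
A rough sketch of what such a split might look like, reusing the warning messages from the snippet above. The helper names `resolve_treatment_nodes` and `resolve_yearly_seasonality` are hypothetical, and the helpers are written as plain functions so each can be unit tested without building a model.

```python
import warnings


def resolve_treatment_nodes(treatment_nodes, channel_columns):
    """Fall back to the channel columns when no treatments are given."""
    if treatment_nodes is None:
        warnings.warn(
            "No treatment nodes provided, using channel columns as treatment nodes.",
            stacklevel=2,
        )
        return list(channel_columns)
    return list(treatment_nodes)


def resolve_yearly_seasonality(yearly_seasonality, adjustment_set):
    """Disable yearly seasonality when it is not needed for adjustment."""
    if "yearly_seasonality" not in adjustment_set:
        warnings.warn(
            "Yearly seasonality excluded as it's not required for adjustment.",
            stacklevel=2,
        )
        return None
    return yearly_seasonality


# Each helper can be exercised on its own.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    nodes = resolve_treatment_nodes(None, ["x1", "x2"])
    seasonality = resolve_yearly_seasonality(2, ["event_1"])

print(nodes, seasonality, len(caught))
```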

juanitorduz · 2025-01-02T19:36:25Z

Add subtitle like: business problem

juanitorduz · 2025-01-02T19:38:45Z

Shall we remove the first data points, which are affected by the fact that there is little history to adstock at the start of the series?

juanitorduz · 2025-01-02T19:39:04Z

+1

review-notebook-app · 2025-01-02T19:39:44Z

juanitorduz commented on 2025-01-02T19:39:43Z
----------------------------------------------------------------

Observe that "over-controlling" can reduce the variance of the estimate; see https://matheusfacure.github.io/python-causality-handbook/07-Beyond-Confounders.html#good-controls

> Anytime we have a control that is a good predictor of the outcome, even if it is not a confounder, adding it to our model is a good idea. It helps lowering the variance of our treatment effect estimates.

review-notebook-app · 2025-01-02T19:39:45Z

juanitorduz commented on 2025-01-02T19:39:44Z
----------------------------------------------------------------

Line #11.    sns.lineplot(x="date_week", y="competitor_offers", data=df, color="C3", ax=ax);

Add title

review-notebook-app · 2025-01-02T19:39:45Z

juanitorduz commented on 2025-01-02T19:39:45Z
----------------------------------------------------------------

Latex error? $x_1$ or $x_{1t}$

juanitorduz · 2025-01-02T19:40:47Z

I think the previous comments on the notebooks have not been addressed yet ;)

Creating Causal Identification module

18ccc61

github-actions bot added the MMM label Nov 4, 2024

cetagostini requested review from wd60622 and juanitorduz and removed request for wd60622 November 4, 2024 23:35

wd60622 added the causal inference and enhancement labels Nov 4, 2024

wd60622 reviewed Nov 4, 2024

pymc_marketing/mmm/causal.py (outdated)

wd60622 reviewed Nov 4, 2024

pymc_marketing/mmm/causal.py (outdated)

wd60622 reviewed Nov 4, 2024

pymc_marketing/mmm/causal.py (outdated)

Pre-commit

6a08373

wd60622 and others added 5 commits November 5, 2024 22:24

Merge branch 'main' into causal_identification

9f4af46

adding missing libraries

a53b7d7

Merge branch 'main' into causal_identification

c45e5c1

Merge branch 'main' into causal_identification

8c26976

Pushing for push

7b09ef6

github-actions bot added the docs label Nov 13, 2024

Another random push

8d51555

cetagostini marked this pull request as draft November 15, 2024 16:44

Final v1 push

171bd10

github-actions bot added the tests label Nov 16, 2024

cetagostini requested a review from wd60622 November 16, 2024 22:22

cetagostini marked this pull request as ready for review November 16, 2024 22:22

cetagostini added 2 commits November 17, 2024 00:23

Merge branch 'main' into causal_identification

4f281a7

Adding pre-commit

a77d871

Merge branch 'main' into causal_identification

bc6d2ba

wd60622 requested changes Dec 16, 2024

pymc_marketing/model_builder.py (outdated)

cetagostini added 3 commits December 16, 2024 20:49

Notebook adjustments

f308243

Merge branch 'causal_identification' of https://github.com/pymc-labs/…

6858494

…pymc-marketing into causal_identification

Remove model builder needs

f6dc7cf

wd60622 mentioned this pull request Dec 16, 2024

docs/readthedocs.org:pymc-marketing memory issue #1286

Open

Merge branch 'main' into causal_identification

891d402

cetagostini added 2 commits December 24, 2024 19:30

Creating test for causal module

95eac12

Merge branch 'main' into causal_identification

8ca5777

cetagostini requested a review from wd60622 December 24, 2024 17:32

cetagostini and others added 4 commits December 24, 2024 19:45

Updating notebook.

9f997e2

Merge branch 'causal_identification' of https://github.com/pymc-labs/…

2ccd75c

…pymc-marketing into causal_identification

Merge branch 'main' into causal_identification

4b8c122

Merge branch 'main' into causal_identification

f1d12f7

wd60622 added this to the 0.11.0 milestone Dec 29, 2024

cetagostini and others added 2 commits December 30, 2024 22:09

Merge branch 'main' into causal_identification

a70d340

Merge branch 'main' into causal_identification

fb9993e

juanitorduz requested changes Jan 2, 2025
