MultiDimensional Media Mix Model (New PR) #1036

Draft
wants to merge 9 commits into
base: main

Conversation

cetagostini
Contributor

@cetagostini cetagostini commented Sep 13, 2024

Description

Creating an API to support multiple dims.

Related Issue

Checklist

Modules affected

  • MMM
  • CLV

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pymc-marketing--1036.org.readthedocs.build/en/1036/

codecov bot commented Sep 14, 2024

Codecov Report

Attention: Patch coverage is 2.66667% with 219 lines in your changes missing coverage. Please review.

Project coverage is 90.62%. Comparing base (3aeb56e) to head (c14af96).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pymc_marketing/mmm/MultiDimensionalMMM.py 0.00% 163 Missing ⚠️
pymc_marketing/mmm/components/time.py 0.00% 39 Missing ⚠️
pymc_marketing/mmm/components/base.py 26.08% 17 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1036      +/-   ##
==========================================
- Coverage   95.06%   90.62%   -4.45%     
==========================================
  Files          42       44       +2     
  Lines        4456     4681     +225     
==========================================
+ Hits         4236     4242       +6     
- Misses        220      439     +219     


@wd60622 wd60622 added MMM enhancement New feature or request labels Sep 14, 2024
@tim-mcwilliams

@cetagostini I've been experimenting with this new feature and came across a potential bug. When trying to use partial pooling across a geo dim, I run into a broadcasting error.

It looks like the error is thrown when creating the channel_contributions variable, specifically in the forward_pass function. Right now the function's dims are set to look only at "channel":

return second.apply(x=first.apply(x=x, dims="channel"), dims="channel")

However, modifying that to include the dims passed to the VanillaMultiDimensionalMMM class, like so:

return second.apply(x=first.apply(x=x, dims=(*self.dims, "channel")), dims=(*self.dims, "channel"))

fixes the broadcasting error and I was able to fit the model from there.
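
The change described above can be sketched in isolation. This is a minimal, self-contained illustration, not the PR's code: `RecordingTransformation` is a hypothetical stand-in for the adstock and saturation components that only records the dims it receives, and `model_dims` plays the role of `self.dims`.

```python
class RecordingTransformation:
    """Hypothetical stand-in for an adstock/saturation component."""

    def __init__(self):
        self.seen_dims = None

    def apply(self, x, dims):
        # Remember which dims the parameters would be created with,
        # then return the input unchanged.
        self.seen_dims = dims
        return x


def forward_pass(x, first, second, model_dims=()):
    # Prepend the model-level dims (e.g. "geo") to "channel" so both
    # transformations receive the same parameter dims.
    dims = (*model_dims, "channel")
    return second.apply(x=first.apply(x=x, dims=dims), dims=dims)


adstock, saturation = RecordingTransformation(), RecordingTransformation()
result = forward_pass([1.0, 2.0], adstock, saturation, model_dims=("geo",))
print(adstock.seen_dims)  # ('geo', 'channel')
```

Note that in this sketch an empty `model_dims` yields the tuple `("channel",)` rather than the bare string `"channel"` used in the current line.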

Here's the full traceback - @wd60622

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[150], line 1
----> 1 mmm_fit = mmm.fit(
      2     X=region_model_data.drop(columns="units"),
      3     y=region_model_data.drop(columns=Xs)
      4 )

File ~/code/pymc-marketing/docs/source/notebooks/mmm/MultiDimensionalMMM.py:548, in VanillaMultiDimensionalMMM.fit(self, X, y, progressbar, predictor_names, random_seed, **kwargs)
    545     predictor_names = []
    547 if not hasattr(self, "model"):
--> 548     self.build_model(X, y)
    550 # sampler_kwargs = create_sample_kwargs(
    551 #     self.sampler_config,
    552 #     progressbar,
    553 #     random_seed,
    554 #     **kwargs,
    555 # )
    556 with self.model:

File ~/code/pymc-marketing/docs/source/notebooks/mmm/MultiDimensionalMMM.py:431, in VanillaMultiDimensionalMMM.build_model(self, X, y, **kwargs)
    426     pass
    428 else:
    429     channel_contributions = pm.Deterministic(
    430         name="channel_contributions",
--> 431         var=self.forward_pass(x=channel_data_),
    432         dims=("date", *self.dims, "channel"),
    433     )
    435 mu_var = intercept + channel_contributions.sum(axis=-1)
    437 if (
    438     self.control_columns is not None
    439     and len(self.control_columns) > 0
    440     and all(column in X.columns for column in self.control_columns)
    441 ):

File ~/code/pymc-marketing/docs/source/notebooks/mmm/MultiDimensionalMMM.py:284, in VanillaMultiDimensionalMMM.forward_pass(self, x)
    259 """Transform channel input into target contributions of each channel.
    260 
    261 This method handles the ordering of the adstock and saturation
   (...)
    276 
    277 """
    278 first, second = (
    279     (self.adstock, self.saturation)
    280     if self.adstock_first
    281     else (self.saturation, self.adstock)
    282 )
--> 284 return second.apply(x=first.apply(x=x, dims="channel"), dims="channel")

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pymc_marketing/mmm/components/base.py:555, in Transformation.apply(self, x, dims)
    522 def apply(self, x: pt.TensorLike, dims: Dims | None = None) -> TensorVariable:
    523     """Call within a model context.
    524 
    525     Used internally of the MMM to apply the transformation to the data.
   (...)
    553 
    554     """
--> 555     kwargs = self._create_distributions(dims=dims)
    556     return self.function(x, **kwargs)

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pymc_marketing/mmm/components/base.py:315, in Transformation._create_distributions(self, dims)
    311     var = dist.create_variable(variable_name)
    312     return dim_handler(var, dist.dims)
    314 return {
--> 315     parameter_name: create_variable(parameter_name, variable_name)
    316     for parameter_name, variable_name in self.variable_mapping.items()
    317 }

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pymc_marketing/mmm/components/base.py:312, in Transformation._create_distributions.<locals>.create_variable(parameter_name, variable_name)
    310 dist = self.function_priors[parameter_name]
    311 var = dist.create_variable(variable_name)
--> 312 return dim_handler(var, dist.dims)

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pymc_marketing/prior.py:192, in create_dim_handler.<locals>.func(x, dims)
    191 def func(x: pt.TensorLike, dims: Dims) -> pt.TensorVariable:
--> 192     return handle_dims(x, dims, desired_dims)

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pymc_marketing/prior.py:182, in handle_dims(x, dims, desired_dims)
    177 args = [
    178     "x" if missing else idx
    179     for (idx, missing) in zip(new_idx, missing_dims, strict=False)
    180 ]
    181 args = _remove_leading_xs(args)
--> 182 return x.dimshuffle(*args)

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pytensor/tensor/variable.py:347, in _tensor_py_operators.dimshuffle(self, *pattern)
    345 if (len(pattern) == 1) and (isinstance(pattern[0], list | tuple)):
    346     pattern = pattern[0]
--> 347 op = pt.elemwise.DimShuffle(list(self.type.broadcastable), pattern)
    348 return op(self)

File /opt/anaconda3/envs/marketing_env/lib/python3.12/site-packages/pytensor/tensor/elemwise.py:171, in DimShuffle.__init__(self, input_broadcastable, new_order)
    168             drop.append(i)
    169         else:
    170             # We cannot drop non-broadcastable dimensions
--> 171             raise ValueError(
    172                 "Cannot drop a non-broadcastable dimension: "
    173                 f"{input_broadcastable}, {new_order}"
    174             )
    176 # This is the list of the original dimensions that we keep
    177 self.shuffle = [x for x in new_order if x != "x"]

ValueError: Cannot drop a non-broadcastable dimension: [False, False], (0,)
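
The final error comes from the check visible in the last frame of the traceback: a dimension may only be dropped if it is broadcastable. Below is a simplified pure-Python sketch of that rule, written for illustration only. `check_drop` is a hypothetical helper, not PyTensor's implementation.

```python
def check_drop(input_broadcastable, new_order):
    """Return the axes that new_order drops, or raise if any dropped
    axis is not broadcastable."""
    kept = {axis for axis in new_order if axis != "x"}
    dropped = []
    for axis, broadcastable in enumerate(input_broadcastable):
        if axis in kept:
            continue
        if not broadcastable:
            raise ValueError(
                "Cannot drop a non-broadcastable dimension: "
                f"{input_broadcastable}, {new_order}"
            )
        dropped.append(axis)
    return dropped


# A broadcastable (length-1) leading axis can be dropped.
print(check_drop([True, False], (1,)))  # [0]

# Two real axes squeezed into a one-axis pattern reproduces the failure.
try:
    check_drop([False, False], (0,))
except ValueError as err:
    print(err)
```

Here `[False, False]` describes a two-dimensional parameter and `(0,)` a one-axis target, which is consistent with a prior carrying more dims than the `dims="channel"` passed to `apply`.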

@wd60622
Contributor

wd60622 commented Oct 22, 2024

Good catch @tim-mcwilliams
So I am hearing you got it working with this fix, right?

@cetagostini You can check the dims of the priors at initialization to catch errors like this earlier, if needed.
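
One way such an early check could look is sketched below. This is an assumption-laden illustration, not the library's API: `validate_prior_dims` is a hypothetical helper, priors are represented only by their dim tuples, and the parameter names `alpha` and `lam` are placeholders.

```python
def validate_prior_dims(prior_dims, allowed_dims):
    """Raise early if any prior uses a dim outside allowed_dims.

    prior_dims maps a parameter name to its tuple of dim names.
    """
    allowed = set(allowed_dims)
    problems = {
        name: sorted(set(dims) - allowed)
        for name, dims in prior_dims.items()
        if set(dims) - allowed
    }
    if problems:
        raise ValueError(
            f"Priors use dims outside {tuple(allowed_dims)}: {problems}"
        )


priors = {"alpha": ("geo", "channel"), "lam": ("channel",)}

# Passes silently: every prior dim is allowed.
validate_prior_dims(priors, allowed_dims=("geo", "channel"))

# Fails at construction time instead of deep inside model building.
try:
    validate_prior_dims(priors, allowed_dims=("channel",))
except ValueError as err:
    print(err)
```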

@tim-mcwilliams

@wd60622 thanks! Correct, with the fix I was able to get the model working.

"Normal", sigma=Prior("HalfNormal", sigma=2), dims=self.dims
),
"gamma_control": Prior("Normal", mu=0, sigma=2, dims="control"),
"gamma_fourier": Prior("Laplace", mu=0, b=1, dims="fourier_mode"),

Adding *self.dims to gamma_fourier should enable the modeler to capture seasonality at each dim they specify. For example, dims=(*self.dims, "fourier_mode").

Contributor

However, this makes the dims for the media match seasonality. The user can always specify this with the model config.

Therefore, I lean away from this.

@wd60622 yea, good point there! Best to leave the default config be.
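
The resolution above, keeping the default and letting users override it, can be sketched with plain dictionaries. This is an illustration only: the dicts stand in for `Prior` objects, `"geo"` is an assumed dim name, and `merge_config` is a hypothetical helper rather than the library's config handling.

```python
DEFAULT_CONFIG = {
    "gamma_control": {"dist": "Normal", "mu": 0, "sigma": 2, "dims": ("control",)},
    "gamma_fourier": {"dist": "Laplace", "mu": 0, "b": 1, "dims": ("fourier_mode",)},
}


def merge_config(defaults, overrides=None):
    # User-supplied entries replace the defaults key by key.
    return {**defaults, **(overrides or {})}


# Opt in to per-geo seasonality without changing the default.
user_config = {
    "gamma_fourier": {
        "dist": "Laplace",
        "mu": 0,
        "b": 1,
        "dims": ("geo", "fourier_mode"),
    },
}

config = merge_config(DEFAULT_CONFIG, user_config)
print(config["gamma_fourier"]["dims"])  # ('geo', 'fourier_mode')
print(merge_config(DEFAULT_CONFIG)["gamma_fourier"]["dims"])  # ('fourier_mode',)
```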

@github-actions github-actions bot added the docs Improvements or additions to documentation label Nov 7, 2024
@luiztauffer

hi guys, this looks really good and useful! Thanks for pointing me here, @cetagostini

I've read the example notebook and I have some questions:

  • my impression is that what it is doing is basically training 4 separate models at the same time, is that correct?
  • what would be the difference between this example using VanillaMultiDimensionalMMM and simply running 4 separate MMM?
  • can VanillaMultiDimensionalMMM help with making hierarchical assumptions? Let's say we want the adstock_alpha prior to be modeled by a Gaussian that's estimated iteratively for the country group.

Labels
docs Improvements or additions to documentation enhancement New feature or request MMM
4 participants