Using p(x,n) in the formula fails. #444

Closed
hadjipantelis opened this issue Jan 21, 2022 · 4 comments

@hadjipantelis

Hello and thank you for your work in bambi, it is great.

I noticed that when a variable named p exists in the workspace, bambi fails to parse a formula that uses the p(x, n) function for the response term: model instantiation tries to call the variable p from the workspace instead of the function. Please see a minimal example below.

import bambi as bmb
import numpy as np
import pandas as pd
from numpy.random import default_rng

rng = default_rng(321)  # seeded generator, so the example is reproducible

N = 1000
n = 30
x = rng.uniform(low=-0.4, high=0.4, size=N)
p = 0.4 + 0.1 * x  # this array named `p` is what shadows the p() function
y = rng.binomial(n=n, p=p)
data = pd.DataFrame({'n': n, 'y': y, 'x': x})
# del p  # Uncomment to make the error go away.

model_1 = bmb.Model("p(y,n) ~ x", data, family="binomial")

The full error is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_74421/1647419273.py in <module>
----> 1 model_153 = bmb.Model("p(y,n) ~ x", data, family="binomial")
      2 model_153.build()

~/.local/lib/python3.8/site-packages/bambi/models.py in __init__(self, formula, data, family, priors, link, categorical, potentials, dropna, auto_scale, automatic_priors, noncentered, priors_cor, taylor)
    160         na_action = "drop" if dropna else "error"
    161         self.formula = formula
--> 162         self._design = design_matrices(formula, data, na_action, env=1)
    163 
    164         if self._design.response is None:

~/.local/lib/python3.8/site-packages/formulae/matrices.py in design_matrices(formula, data, na_action, env)
    588             raise ValueError(f"'data' contains {incomplete_rows_n} incomplete rows.")
    589 
--> 590     design = DesignMatrices(description, data, env)
    591     return design
    592 

~/.local/lib/python3.8/site-packages/formulae/matrices.py in __init__(self, model, data, env)
     57         if self.model.response:
     58             self.response = ResponseVector(self.model.response)
---> 59             self.response._evaluate(data, env)
     60 
     61         if self.model.common_terms:

~/.local/lib/python3.8/site-packages/formulae/matrices.py in _evaluate(self, data, env)
    111         self.data = data
    112         self.env = env
--> 113         self.term.set_type(self.data, self.env)
    114         self.term.set_data()
    115         self.name = self.term.term.name

~/.local/lib/python3.8/site-packages/formulae/terms/terms.py in set_type(self, data, env)
    823     def set_type(self, data, env):
    824         """Set type of the response term."""
--> 825         self.term.set_type(data, env)
    826 
    827     def set_data(self, encoding=False):

~/.local/lib/python3.8/site-packages/formulae/terms/terms.py in set_type(self, data, env)
    435                 component.set_type(data)
    436             elif isinstance(component, Call):
--> 437                 component.set_type(data, env)
    438             else:
    439                 raise ValueError(

~/.local/lib/python3.8/site-packages/formulae/terms/call.py in set_type(self, data_mask, env)
     96 
     97         self.env = env.with_outer_namespace(TRANSFORMS)
---> 98         x = self.call.eval(data_mask, self.env)
     99 
    100         if is_numeric_dtype(x):

~/.local/lib/python3.8/site-packages/formulae/terms/call_resolver.py in eval(self, data_mask, env)
    266         kwargs = {name: arg.eval(data_mask, env) for name, arg in self.kwargs.items()}
    267 
--> 268         return callee(*args, **kwargs)
    269 
    270 

TypeError: 'numpy.ndarray' object is not callable

I am using the latest bambi/formulae.

from importlib.metadata import version
version('numpy'), version('pandas'), version('bambi'), version('formulae')
# ('1.20.3', '1.3.4', '0.7.1', '0.2.0')

Again, thank you for your work on bambi. This bug has a relatively easy work-around so it is not a show-stopper but I guess it would be better if it didn't exist. 😄

PS: You might want to add a minimal issue template to the repository; it helps with structure and makes it clear what information is needed.

@tomicapretto
Collaborator

Hi @hadjipantelis

Thanks for reporting the problem and also all the suggestions. This is not a problem with Bambi itself, but a problem with the formula parsing library, which we develop too.

formulae has a bunch of built-in functions that aim to simplify how you transform the data. Right now, when you call something in a formula, it first looks in the scope where the model is being constructed. If there's something with that name in there, it uses that thing instead of the built-in. This is what is happening in your example: the array p is found before the built-in p() function.

This behaviour allows you to override builtin functions. For example, formulae has a scale() function, that you can override if you write your own scale() function. If we force you to always use the builtin versions in formulae, then you lose this feature.
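
The lookup order described above can be illustrated with a small, self-contained sketch. It does not use formulae's actual code: `BUILTINS`, `resolve`, and the toy `p` function are hypothetical stand-ins, and a plain list plays the role of the NumPy array from the report.

```python
# Hypothetical stand-in for a table of built-in formula functions.
BUILTINS = {"p": lambda y, n: ("proportion", y, n)}


def resolve(name, user_scope, builtins=BUILTINS):
    """Return the object that a call to `name(...)` resolves to.

    The user's scope is searched first, so any object with that name,
    callable or not, shadows the built-in of the same name.
    """
    if name in user_scope:
        return user_scope[name]
    return builtins[name]


# A non-callable `p` in the user's scope, as in the report above.
user_scope = {"p": [0.40, 0.41, 0.39]}

callee = resolve("p", user_scope)
try:
    callee("y", "n")
except TypeError as err:
    message = str(err)
    print(message)
```

With an empty user scope the same call resolves to the built-in, which is why deleting or renaming the variable makes the error go away.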

I think a nice fix would be to raise a warning when there's such a name conflict, but still use the builtin function. That would guide you to write a function with a name that does not conflict with the name of the builtin function in formulae.
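
A rough sketch of that proposal, assuming a simple dictionary lookup: the built-in wins on a name conflict and a warning is emitted. The names `BUILTINS` and `resolve_with_warning` are hypothetical, and this is not necessarily how the eventual fix in formulae is implemented.

```python
import warnings

# Hypothetical stand-in for a table of built-in formula functions.
BUILTINS = {"p": lambda y, n: ("proportion", y, n)}


def resolve_with_warning(name, user_scope, builtins=BUILTINS):
    """Prefer the built-in on a name conflict, but tell the user about it."""
    if name in builtins:
        if name in user_scope:
            warnings.warn(
                f"'{name}' exists in the calling scope and is also a built-in "
                "function; using the built-in. Rename your object to avoid "
                "the conflict.",
                UserWarning,
            )
        return builtins[name]
    # Names that are not built-ins still come from the user's scope.
    return user_scope[name]


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    callee = resolve_with_warning("p", {"p": 0.4})

result = callee("y", "n")
```

The trade-off is the one mentioned above: with this order a user-defined function can no longer override a built-in of the same name, so it has to be given a different name.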

I'll try to fix this issue for the next release.

@hadjipantelis
Author

@tomicapretto Seems like a reasonable thing to do. I can see it was a design choice (up to a certain extent) but yeah, a warning message will likely be helpful. (I suspected as much about formulae and that's why I reported its version too.)
Thank you for the clarification. Feel free to close this issue at your convenience.

@tomicapretto
Collaborator

Let's keep this open until we have a fix. It may be helpful if someone else has the same problem.

@tomicapretto
Collaborator

Fixed in bambinos/formulae#109 and available in formulae >= 0.5.3
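
To check whether an installed formulae already includes the fix, something like the following might help. `version_tuple` and `has_fix` are hypothetical helpers written for this sketch, and the parsing only looks at the leading digits of the first three version components.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def version_tuple(text):
    """Turn a version string such as '0.5.3' into a tuple like (0, 5, 3)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def has_fix(installed, minimum="0.5.3"):
    """True if `installed` is at least the release that contains the fix."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    print(has_fix(version("formulae")))
except PackageNotFoundError:
    print("formulae is not installed")
```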
