pymc.math: unknown attribute #6406

Closed
ibobak opened this issue Sep 17, 2024 · 4 comments

Labels
needs repro Issue has not been reproduced yet

ibobak commented Sep 17, 2024

Type: Bug

I am using the pymc library, which is well known and widely respected.
Here is the problem:

image

Environment file (use conda to recreate this environment):
e.yml.txt

Code to reproduce:

import pymc as pm
import numpy as np
import pandas as pd  # needed for the pd.DataFrame annotations below

def estimate_mu_sigma(data):
    log_data = np.log(data)  # Take the natural logarithm of the data
    mu_estimate = np.mean(log_data)  # estimate mu (mean of log-transformed data)
    sigma_estimate = np.std(log_data, ddof=0)  # Estimate sigma (standard deviation of log-transformed data), ddof=0 for population std
    return mu_estimate, sigma_estimate

def get_values(a_pdf, a_column, a_quantile: float = 0.0) -> np.ndarray:
    v = a_pdf[a_column].values
    if a_quantile > 0:
        v_len = len(v)
        q_value = np.quantile(v, a_quantile)
        print(f"v0 max={np.max(v)}, quantile {a_quantile}={q_value}")
        v = a_pdf[a_pdf[a_column]<q_value][a_column].values
        print(f"Previous length={v_len}, current length={len(v)}")
    return v


def train_pymc(a_pdf: pd.DataFrame, a_ad_type: str, a_cohort_day0: int, a_cohort_day1: int):
    pdf_d0: pd.DataFrame = a_pdf[(a_pdf["v_ad_type"]==a_ad_type) & (a_pdf["cohort_day"]==a_cohort_day0)]
    pdf_d1: pd.DataFrame = a_pdf[(a_pdf["v_ad_type"]==a_ad_type) & (a_pdf["cohort_day"]==a_cohort_day1)]
    
    v0 = get_values(pdf_d0, "v_cpm", 0.97)
    est_mu_0, est_sigma_0 = estimate_mu_sigma(v0)
    print(f"Estimated est_mu_0={est_mu_0}, est_sigma_0={est_sigma_0}")
    
    v1 = get_values(pdf_d1, "v_cpm", 0.97)
    est_mu_1, est_sigma_1 = estimate_mu_sigma(v1)
    print(f"Estimated est_mu_1={est_mu_1}, est_sigma_1={est_sigma_1}")
    
    with pm.Model():
        mu_0 = pm.Uniform("mu_0", 0.25*est_mu_0, 2*est_mu_0)                 # TODO: ranges are too broad; try narrowing them with confidence intervals
        sigma_0 = pm.Uniform("sigma_0", 0.25*est_sigma_0, 2*est_sigma_0)
        pm.LogNormal('x0', mu=mu_0, sigma=sigma_0, observed=v0)
        
        mu_1 = pm.Uniform("mu_1", 0.25*est_mu_1, 2*est_mu_1)                 # TODO: ranges are too broad; try narrowing them with confidence intervals
        sigma_1 = pm.Uniform("sigma_1", 0.25*est_sigma_1, 2*est_sigma_1)
        pm.LogNormal('x1', mu=mu_1, sigma=sigma_1, observed=v1)
        
        pm.Deterministic("delta_mean", pm.math.exp(mu_1 + sigma_1**2 / 2) - pm.math.exp(mu_0 + sigma_0**2 / 2))
        
        # To be explained in chapter 3.
        step = pm.NUTS()
        trace = pm.sample(20000, tune=1000, step=step, chains=4)
    
    return trace

trace = train_pymc(pdf, "REWARDED", 0, 1)

Extension version: 2024.9.1
VS Code version: Code 1.93.1 (38c31bc77e0dd6ae88a4e9cc93428cc27a56ba40, 2024-09-11T17:20:05.685Z)
OS version: Linux x64 6.5.0-45-generic
Modes:

System Info
CPUs: Intel(R) Xeon(R) CPU E5-2696 v4 @ 2.20GHz (88 x 1359)
GPU Status:
    2d_canvas: enabled
    canvas_oop_rasterization: enabled_on
    direct_rendering_display_compositor: disabled_off_ok
    gpu_compositing: enabled
    multiple_raster_threads: enabled_on
    opengl: enabled_on
    rasterization: enabled
    raw_draw: disabled_off_ok
    skia_graphite: disabled_off
    video_decode: enabled
    video_encode: disabled_software
    vulkan: disabled_off
    webgl: enabled
    webgl2: enabled
    webgpu: disabled_off
    webnn: disabled_off
Load (avg): 1, 2, 2
Memory (System): 251.76GB (222.50GB free)
Process Argv: --crash-reporter-id 27d42247-63fb-4d9e-9cb0-87d9974843dc
Screen Reader: no
VM: 0%
DESKTOP_SESSION: ubuntu-xorg
XDG_CURRENT_DESKTOP: Unity
XDG_SESSION_DESKTOP: ubuntu-xorg
XDG_SESSION_TYPE: x11
A/B Experiments
vsliv368:30146709
vspor879:30202332
vspor708:30202333
vspor363:30204092
vscod805:30301674
binariesv615:30325510
vsaa593:30376534
py29gd2263:31024239
c4g48928:30535728
azure-dev_surveyone:30548225
2i9eh265:30646982
962ge761:30959799
pythongtdpath:30769146
welcomedialog:30910333
pythonnoceb:30805159
asynctok:30898717
pythonmypyd1:30879173
h48ei257:31000450
pythontbext0:30879054
accentitlementst:30995554
dsvsc016:30899300
dsvsc017:30899301
dsvsc018:30899302
cppperfnew:31000557
dsvsc020:30976470
pythonait:31006305
dsvsc021:30996838
9c06g630:31013171
a69g1124:31058053
dvdeprecation:31068756
dwnewjupyter:31046869
2f103344:31071589
impr_priority:31102340
nativerepl2:31139839
refactort:31108082
pythonrstrctxt:31112756
flightc:31134773
wkspc-onlycs-t:31132770
wkspc-ranged-t:31125599
fje88620:31121564

github-actions bot added the "needs repro" label Sep 17, 2024
Contributor

rchiodo commented Sep 17, 2024

Thanks for the issue. This occurs because you are not importing pymc.math explicitly; the type checker needs that import to resolve the submodule. The code still runs only as a side effect of the pymc package importing pymc.math itself.

See this issue for more information (similar situation)
#4326

And this documentation:
https://microsoft.github.io/pyright/#/import-statements
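
To make the "side effect" concrete, here is a minimal sketch that builds a throwaway package on disk. The name demo_pkg and its contents are hypothetical stand-ins, not pymc's actual layout. The package's __init__.py only pulls one function out of its submodule, yet at runtime the submodule still ends up as an attribute of the package. A static checker that looks only at explicit bindings would have no way to know that.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. "demo_pkg" is a hypothetical stand-in, not pymc.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()

# __init__.py takes one name out of the submodule; it never binds the
# submodule itself with "import demo_pkg.math" or "from demo_pkg import math".
(pkg / "__init__.py").write_text("from demo_pkg.math import exp\n")
(pkg / "math.py").write_text(
    "import math as _m\n\n"
    "def exp(a):\n"
    "    return _m.exp(a)\n"
)

sys.path.insert(0, str(root))
demo_pkg = importlib.import_module("demo_pkg")

# Loading the submodule made the import system attach it to the parent
# package, so attribute access works at runtime without an explicit import.
runtime_has_submodule = hasattr(demo_pkg, "math")
value = demo_pkg.math.exp(0.0)
print(runtime_has_submodule, value)
```

In the reporter's code, the corresponding change would be to add an explicit `import pymc.math` next to `import pymc as pm`.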

rchiodo closed this as completed Sep 17, 2024
Author

ibobak commented Sep 20, 2024

That does not fully resolve it.

import pymc
import pymc.math

I did this, and here is the code:

with pymc.Model() as model:
    mu_0 = pymc.Uniform("mu_0", d0["mu_ci_lower_0"], d0["mu_ci_upper_0"],
                        initval=d0["mu_estimate_0"])
    sigma_0 = pymc.Uniform("sigma_0", d0["sigma_ci_lower_0"], d0["sigma_ci_upper_0"],
                           initval=d0["sigma_estimate_0"])
    pymc.LogNormal('x0', mu=mu_0, sigma=sigma_0, observed=v0)

    mu_1 = pymc.Uniform("mu_1", d1["mu_ci_lower_1"], d1["mu_ci_upper_1"],
                        initval=d1["mu_estimate_1"])
    sigma_1 = pymc.Uniform("sigma_1", d1["sigma_ci_lower_1"], d1["sigma_ci_upper_1"],
                           initval=d1["sigma_estimate_1"])
    pymc.LogNormal('x1', mu=mu_1, sigma=sigma_1, observed=v1)

    pymc.Deterministic("delta_mean", pymc.math.exp(mu_0 + sigma_0**2 / 2) - pymc.math.exp(mu_1 + sigma_1**2 / 2))
    pymc.Deterministic("delta_mean_percent",
                       100 * (pymc.math.exp(mu_0 + sigma_0**2 / 2)
                              - pymc.math.exp(mu_1 + sigma_1**2 / 2))
                       / pymc.math.exp(mu_1 + sigma_1**2 / 2))

And this is what I am getting:
image

Author

ibobak commented Sep 20, 2024

@rchiodo could you please re-open and look at this error?

Contributor

rchiodo commented Sep 23, 2024

That is likely because, as far as the type checker can tell, pymc.math.exp does not return anything.

Yeah the definition looks like this:

@scalar_elemwise
def exp(a):
    """e^`a`""" 

There's actually no code there.

I'm guessing pymc is a Python wrapper around some C code? The function would need to declare a return type for the type checker to handle this correctly.

Or you can turn off type checking in your code. That error won't show up if 'typeCheckingMode' is off, or if you put # type: ignore on the line with the error.
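
As a rough illustration of why an empty-bodied definition can still work at runtime while confusing a type checker, here is a minimal sketch. The decorator scalar_elemwise_sketch is a hypothetical stand-in written for this example; it is not the library's real decorator and makes no claim about how that one is implemented. It ignores the stub's body and supplies an implementation looked up by name, and its return annotation is what tells a checker that the decorated function yields a value.

```python
import math
from typing import Any, Callable


def scalar_elemwise_sketch(stub: Callable[..., Any]) -> Callable[[float], float]:
    """Hypothetical decorator: replace a docstring-only stub with a real function.

    The implementation is looked up in the standard math module by the
    stub's name; the stub body itself is never executed.
    """
    impl = getattr(math, stub.__name__)

    def wrapper(a: float) -> float:
        return impl(a)

    wrapper.__doc__ = stub.__doc__
    return wrapper


@scalar_elemwise_sketch
def exp(a):
    """e^`a`"""


# Read on its own, the stub above returns None. Because the decorator is
# annotated, a checker can see that exp(...) yields a float, so the
# subtraction below is well typed.
result = exp(1.0) - exp(0.0)
print(result)
```

If the decorator carried no annotations, a checker would have to fall back on inference and might not see a usable return type, which matches the kind of error reported here. The untyped case is where the `# type: ignore` comment or a relaxed type-checking mode becomes the practical workaround.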
