-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source-wise interpolation #46
Conversation
Ok it seems like I broke something -- very good to have unit tests! I'll fix it tomorrow. |
The reason for the now failing tests is that I moved the |
So far, I mainly focused on the time and memory the likelihood needs to be initialized. I now noticed that running a fit with the source-wise interpolation is unfortunately even slightly slower than the previous implementation. The way I wrote the |
After some more profiling I now think that the likelihood calls are not the bottleneck (I also couldn't find a way to speed this up in a meaningful way). Instead, I suspect minuit just doesn't love having many of these shape parameters, which make the likelihood unsmooth. |
blueice/likelihood.py
Outdated
for sn, shape_parameters in self.source_shape_parameters.items(): | ||
self.source_morphers[sn] = MORPHERS[self.config['morpher']]( | ||
self.config.get('morpher_config', {}), | ||
shape_parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
almost certainly overkill, but for readability, perhaps we can make a tiny minimorpher = lambda shape_parameters: MORPHERS[self.config['morpher']]( self.config.get('morpher_config', {}), shape_parameters)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea -- done! (though I made a proper def to make flake8 happy 😊
@kdund, @JelleAalbers I know you're both very busy. We would need to resolve the missing unittest but apart from this, I think this could be merged. Should I try to find someone else to have a look at the changes to the code so that we can proceed? |
Hi Robert, for me it would take quite a bit of time to import blueice in my head again, and I can't promise to do it soon. If you know of someone else who is currently using blueice I'm happy to give them access (if they don't have it already). |
Hi, both!
Sorry, Robert, I started but then was very lax in finishing it…
Can I look at it tomorrow?
Knut
… On 7 Nov 2024, at 10:43, Jelle Aalbers ***@***.***> wrote:
Hi Robert, for me it would take quite a bit of time to import blueice in my head again, and I can't promise to do it soon. If you know of someone else who is currently using blueice I'm happy to give them access (if they don't have it already).
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Thanks a lot @kdund , that would be perfect! 😊 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently failing with
============================================ FAILURES =============================================
___________________________________________ test_rates ____________________________________________
def test_rates():
m = Model(conf_for_test(n_sources=1))
np.testing.assert_array_equal(m.expected_events(), np.array([1000]))
m.config['livetime_days'] = 2
> np.testing.assert_array_equal(m.expected_events(), np.array([2000]))
tests/test_model.py:9:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
args = (<built-in function eq>, array([1000.]), array([2000]))
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}
@wraps(func)
def inner(*args, **kwds):
with self._recreate_cm():
> return func(*args, **kwds)
E AssertionError:
E Arrays are not equal
E
E Mismatched elements: 1 / 1 (100%)
E Max absolute difference: 1000.
E Max relative difference: 0.5
E x: array([1000.])
E y: array([2000])
../../../opt/miniconda3/envs/myenv/lib/python3.8/contextlib.py:75: AssertionError
======================================== warnings summary =========================================
tests/test_inference.py::test_fit_scipy
tests/test_inference.py::test_fit_scipy
tests/test_inference.py::test_fit_scipy
tests/test_inference.py::test_fit_scipy
/Users/kdund/opt/miniconda3/envs/myenv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py:576: RuntimeWarning: invalid value encountered in subtract
df = fun(x) - f0
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================== short test summary info =====================================
FAILED tests/test_model.py::test_rates - AssertionError:
self.n_model_events_interpolator = self.morpher.make_interpolator(f=lambda m: m.pmf_grids()[1], | ||
extra_dims=list(self.ps.shape), | ||
anchor_models=self.anchor_models) | ||
if self.source_wise_interpolation: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For readability, might it be better to just have "if self.source_wise_interpolation: raise error" and then have the rest not in an else block but just normal indentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes -- I basically wanted to use this as a placeholder if someone is eager to implement this at one point but of course you're right that if the if clause raises an error we don't need the else -- removed! 👍
def _get_shape_indices(self, source_name): | ||
"""Return the indices of the shape parameters used by the source.""" | ||
shape_keys = self.source_shape_parameters[source_name].keys() | ||
return [i for i, k in enumerate(self.shape_parameters.keys()) if k in shape_keys] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for arrays should we return it sorted? Or return a set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I misunderstand your suggestion but I think the return should by this definition be a sorted list, or not? Also entries should be unique.
""" | ||
if s is None: | ||
return np.array([self.expected_events(s) for s in self.sources]) | ||
return s.events_per_day * self.config['livetime_days'] * s.fraction_in_range * s.config['rate_multiplier'] | ||
return s.expected_events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we loosing lifetime scaling here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answer: yes, @hammannr had told me but I overlooked it-- what changes is that the expected_events is now per-source and not adapted to change on-the-go (you would do that by changing livetime instead)
I approve of that solution.
I now adapted the unittest so that the config of the source is overwritten. This is a bit of an antipattern but I think it's still useful to check that this works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Robert-- in my opinion this may be merged now
Thanks a lot for the review @kdund! 😊 -- will then merge |
If I have two sources a and b in a model, each with two shape parameters s_a0, s_a1 and s_b0, s_b1, the interpolation of pdfs and expectation values currently happens on a 4D grid (s_a0, s_a1, s_b0, s_b1). This can quickly and unnecessarily blow up the required memory.
This PR adds source-wise interpolation. For this, we keep track of which source listens to which of the shape parameters and can thus interpolate on two 2D grids (s_a0, s_a1) and (s_b0, s_b1).
For now, I added this as an option so the default is unchanged.
So far, everything I tested seemed to work but I'll do more thorough checks in the next days.