Source-wise interpolation #46

hammannr · 2024-10-10T18:16:03Z

If I have two sources a and b in a model, each with two shape parameters s_a0, s_a1 and s_b0, s_b1, the interpolation of pdfs and expectation values currently happens on a 4D grid (s_a0, s_a1, s_b0, s_b1). This can quickly and unnecessarily blow up the required memory.

This PR adds source-wise interpolation. For this, we keep track of which source listens to which of the shape parameters and can thus interpolate on two 2D grids (s_a0, s_a1) and (s_b0, s_b1).

For now, I added this as an option so the default is unchanged.
So far, everything I tested seemed to work but I'll do more thorough checks in the next days.

hammannr · 2024-10-10T18:16:54Z

Ok it seems like I broke something -- very good to have unit tests! I'll fix it tomorrow.

hammannr · 2024-10-11T08:52:08Z

The reason for the now failing tests is that I moved the expected_events property to the source, which I think makes sense. There, I compute the number similar to how it was done before but using the source's config entry of livetime_days and rate_multiplier instead of the model's config entries. I think in any case it should not be recommended changing the config values after init. Instead, I'd understand the source.expected_events property as well as the model.expected_events() method as default values -- this is also how it's treated in the likelihood. If you call it with a different value of livetime_days it is simply scaled by this value.
So my suggestion would be to adapt the unittest to match the new definition. Another solution would be to make source.expected_events a method with arguments livetime_days and rate_multiplier -- this way we could leave the current behaviour. What do you think @JelleAalbers @kdund ?

hammannr · 2024-10-16T16:57:17Z

So far, I mainly focused on the time and memory the likelihood needs to be initialized. I now noticed that running a fit with the source-wise interpolation is unfortunately even slightly slower than the previous implementation. The way I wrote the ps_interpolator I should not be too surprised, I think every python developer would punch me for this code. I tried improving it this afternoon but didn't succeed yet. I will continue trying tomorrow but if anyone has a good suggestion that would be highly appreciated! 😊

hammannr · 2024-10-17T13:14:21Z

After some more profiling I now think that the likelihood calls are not the bottleneck (I also couldn't find a way to speed this up in a meaningful way). Instead, I suspect minuit just doesn't love having many of these shape parameters, which make the likelihood unsmooth.

kdund · 2024-10-22T13:59:24Z

blueice/likelihood.py

+                for sn, shape_parameters in self.source_shape_parameters.items():
+                    self.source_morphers[sn] = MORPHERS[self.config['morpher']](
+                        self.config.get('morpher_config', {}),
+                        shape_parameters


almost certainly overkill, but for readability, perhaps we can make a tiny minimorpher = lambda shape_parameters: MORPHERS[self.config['morpher']]( self.config.get('morpher_config', {}), shape_parameters) ?

Good idea -- done! (though I made a proper def to make flake8 happy 😊

hammannr · 2024-11-07T09:25:51Z

@kdund, @JelleAalbers I know you're both very busy. We would need to resolve the missing unittest but apart from this, I think this could be merged. Should I try to find someone else to have a look at the changes to the code so that we can proceed?

JelleAalbers · 2024-11-07T09:43:29Z

Hi Robert, for me it would take quite a bit of time to import blueice in my head again, and I can't promise to do it soon. If you know of someone else who is currently using blueice I'm happy to give them access (if they don't have it already).

kdund · 2024-11-07T14:56:52Z

Hi, both! Sorry, Robert, I started but then was very lax in finishing it… Can I look at it tomorrow? Knut

…

On 7 Nov 2024, at 10:43, Jelle Aalbers ***@***.***> wrote: Hi Robert, for me it would take quite a bit of time to import blueice in my head again, and I can't promise to do it soon. If you know of someone else who is currently using blueice I'm happy to give them access (if they don't have it already). — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: ***@***.***>

hammannr · 2024-11-07T15:59:39Z

Thanks a lot @kdund , that would be perfect! 😊

kdund

Currently failing with

============================================ FAILURES =============================================
___________________________________________ test_rates ____________________________________________

    def test_rates():
        m = Model(conf_for_test(n_sources=1))
        np.testing.assert_array_equal(m.expected_events(), np.array([1000]))

        m.config['livetime_days'] = 2
>       np.testing.assert_array_equal(m.expected_events(), np.array([2000]))

tests/test_model.py:9:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([1000.]), array([2000]))
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 1 / 1 (100%)
E           Max absolute difference: 1000.
E           Max relative difference: 0.5
E            x: array([1000.])
E            y: array([2000])

../../../opt/miniconda3/envs/myenv/lib/python3.8/contextlib.py:75: AssertionError
======================================== warnings summary =========================================
tests/test_inference.py::test_fit_scipy
tests/test_inference.py::test_fit_scipy
tests/test_inference.py::test_fit_scipy
tests/test_inference.py::test_fit_scipy
  /Users/kdund/opt/miniconda3/envs/myenv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py:576: RuntimeWarning: invalid value encountered in subtract
    df = fun(x) - f0

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================== short test summary info =====================================
FAILED tests/test_model.py::test_rates - AssertionError:

kdund · 2024-11-08T12:35:24Z

blueice/likelihood.py

-                self.n_model_events_interpolator = self.morpher.make_interpolator(f=lambda m: m.pmf_grids()[1],
-                                                                                  extra_dims=list(self.ps.shape),
-                                                                                  anchor_models=self.anchor_models)
+            if self.source_wise_interpolation:


For readability, might it be better to just have "if self.source_wise_interpolation: raise error" and then have the rest not in an else block but just normal indentation?

ah yes -- I basically wanted to use this as a placeholder if someone is eager to implement this at one point but of course you're right that if the if clause raises an error we don't need the else -- removed! 👍

kdund · 2024-11-08T13:02:58Z

blueice/likelihood.py

+    def _get_shape_indices(self, source_name):
+        """Return the indices of the shape parameters used by the source."""
+        shape_keys = self.source_shape_parameters[source_name].keys()
+        return [i for i, k in enumerate(self.shape_parameters.keys()) if k in shape_keys]


for arrays should we return it sorted? Or return a set?

Maybe I misunderstand your suggestion but I think the return should by this definition be a sorted list, or not? Also entries should be unique.

kdund · 2024-11-08T13:58:40Z

blueice/model.py

        """
        if s is None:
            return np.array([self.expected_events(s) for s in self.sources])
-        return s.events_per_day * self.config['livetime_days'] * s.fraction_in_range * s.config['rate_multiplier']
+        return s.expected_events


Are we loosing lifetime scaling here?

Answer: yes, @hammannr had told me but I overlooked it-- what changes is that the expected_events is now per-source and not adapted to change on-the-go (you would do that by changing livetime instead)
I approve of that solution.

hammannr · 2024-11-08T14:36:24Z

I now adapted the unittest so that the config of the source is overwritten. This is a bit of an antipattern but I think it's still useful to check that this works.

kdund

Thanks, Robert-- in my opinion this may be merged now

hammannr · 2024-11-08T14:49:48Z

Thanks a lot for the review @kdund! 😊 -- will then merge

hammannr added 4 commits October 10, 2024 11:32

first idea of source-wise interpolation

1324a30

move expected_events to source

d92246a

fix expected_events property

0fdb5b2

add source-wise mus_interpolator

5baa97a

hammannr added the enhancement label Oct 11, 2024

hammannr linked an issue Oct 11, 2024 that may be closed by this pull request

Move shape parameter grids, interpolation to be per source #31

Closed

hammannr mentioned this pull request Oct 11, 2024

Source-wise interpolation XENONnT/alea#228

Merged

hammannr added 6 commits October 11, 2024 12:01

some cleanup..

324b3e0

add source parameters method

305fedc

add unittest

cb20f4a

base decision on dont_hash_settings

98a9a2b

add efficiecnies back to source shape parameters

7a5d797

fix ignore parameters

ff49044

hammannr marked this pull request as ready for review October 11, 2024 12:56

hammannr requested review from JelleAalbers and kdund October 11, 2024 12:57

clean up ps_interpolator

0e99fa2

kdund reviewed Oct 22, 2024

View reviewed changes

add create_source_morpher function for readability

47e3ba1

kdund reviewed Nov 8, 2024

View reviewed changes

fix unittest

9b6cad4

improve readability

17b9abd

kdund approved these changes Nov 8, 2024

View reviewed changes

hammannr merged commit 0206fd9 into master Nov 8, 2024
2 checks passed

hammannr deleted the source_wise_interpolation branch November 8, 2024 14:49

hammannr mentioned this pull request Nov 16, 2024

Fix source index mismatch #47

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Source-wise interpolation #46

Source-wise interpolation #46

hammannr commented Oct 10, 2024

hammannr commented Oct 10, 2024

hammannr commented Oct 11, 2024

hammannr commented Oct 16, 2024

hammannr commented Oct 17, 2024

kdund Oct 22, 2024

hammannr Oct 29, 2024

hammannr commented Nov 7, 2024

JelleAalbers commented Nov 7, 2024

kdund commented Nov 7, 2024 via email

hammannr commented Nov 7, 2024

kdund left a comment

kdund Nov 8, 2024

hammannr Nov 8, 2024

kdund Nov 8, 2024

hammannr Nov 8, 2024

kdund Nov 8, 2024

kdund Nov 8, 2024

hammannr commented Nov 8, 2024

kdund left a comment

hammannr commented Nov 8, 2024

Source-wise interpolation #46

Source-wise interpolation #46

Conversation

hammannr commented Oct 10, 2024

hammannr commented Oct 10, 2024

hammannr commented Oct 11, 2024

hammannr commented Oct 16, 2024

hammannr commented Oct 17, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hammannr commented Nov 7, 2024

JelleAalbers commented Nov 7, 2024

kdund commented Nov 7, 2024 via email

hammannr commented Nov 7, 2024

kdund left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hammannr commented Nov 8, 2024

kdund left a comment

Choose a reason for hiding this comment

hammannr commented Nov 8, 2024