Optimize IntMap.alter using unboxed sums. #523

m-renaud · 2018-02-01T05:32:27Z

Use ptrEq and unboxed sums (available in GHC >= 8.2) to track whether or not the map
required modification. If not skip rebuilding a spine and return the original map.

Benchmark results:

Before:

benchmarking alter
time                 881.1 μs   (862.0 μs .. 899.2 μs)
                     0.996 R²   (0.994 R² .. 0.998 R²)
mean                 854.9 μs   (841.8 μs .. 872.5 μs)
std dev              49.55 μs   (38.01 μs .. 69.97 μs)
variance introduced by outliers: 48% (moderately inflated)

After: (41% speedup)

benchmarking alter
time                 513.1 μs   (506.5 μs .. 519.6 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 517.8 μs   (511.7 μs .. 528.5 μs)
std dev              27.05 μs   (17.74 μs .. 47.52 μs)
variance introduced by outliers: 45% (moderately inflated)

We use unboxed sums (available in GHC >= 8.2) to track whether or not the map required modification. Benchmark results: ------------------ Before: ------- benchmarking alter time 881.1 μs (862.0 μs .. 899.2 μs) 0.996 R² (0.994 R² .. 0.998 R²) mean 854.9 μs (841.8 μs .. 872.5 μs) std dev 49.55 μs (38.01 μs .. 69.97 μs) variance introduced by outliers: 48% (moderately inflated) After: ------ benchmarking alter time 513.1 μs (506.5 μs .. 519.6 μs) 0.998 R² (0.997 R² .. 0.999 R²) mean 517.8 μs (511.7 μs .. 528.5 μs) std dev 27.05 μs (17.74 μs .. 47.52 μs) variance introduced by outliers: 45% (moderately inflated)

m-renaud · 2018-02-01T05:33:40Z

I'll add more fine grained benchmarks if I get the chance tomorrow. Sending out for feedback on the code.

treeowl · 2018-02-01T05:50:39Z

Data/IntMap/Internal.hs

+--
+-- If no modifications are made to the map (# (# #) | #) is returned, otherwise
+-- (# | newMap #) is returned.
+alter# :: (Maybe a -> Maybe a) -> Key -> IntMap a -> (# (# #) | IntMap a #)


What happens if you make the function have type Maybe# a -> Maybe# a, where Maybe# is the unboxed version of Maybe? Couldn't we often avoid the Maybe allocation that way? Benchmark!

That function is user defined (passed to alter) so at some point you would have to do the conversion from Maybe to Maybe#. As far as I know coerce wouldn't work for f because they have different representations (Maybe and Maybe# that is), but there may be another way that I'm unaware of. Did you have a specific approach in mind?

I think it's a decent bet that the function we're passed will be small enough to inline. So calling alter# with

\ p -> fromMaybe (f (toMaybe p))

(essentially) should usually avoid any actual Maybes. Or so I imagine.

m-renaud · 2018-02-01T17:51:00Z

So, as is always the case, using unboxed sums makes some operations faster, and others a lot slower :(

Before:

benchmarking alter absent
time                 283.1 μs   (262.8 μs .. 298.1 μs)
                     0.982 R²   (0.977 R² .. 0.991 R²)
mean                 270.9 μs   (262.3 μs .. 283.0 μs)
std dev              32.15 μs   (24.01 μs .. 45.31 μs)
variance introduced by outliers: 84% (severely inflated)
               
benchmarking alter insert
time                 277.1 μs   (273.3 μs .. 282.3 μs)
                     0.997 R²   (0.995 R² .. 0.999 R²)
mean                 275.7 μs   (271.7 μs .. 281.7 μs)
std dev              16.49 μs   (11.93 μs .. 27.52 μs)
variance introduced by outliers: 56% (severely inflated)
               
benchmarking alter update
time                 280.3 μs   (276.7 μs .. 284.3 μs)
                     0.998 R²   (0.998 R² .. 0.999 R²)
mean                 280.5 μs   (277.7 μs .. 284.5 μs)
std dev              11.21 μs   (8.620 μs .. 15.42 μs)
variance introduced by outliers: 36% (moderately inflated)
               
benchmarking alter delete
time                 282.5 μs   (274.8 μs .. 289.7 μs)
                     0.996 R²   (0.995 R² .. 0.998 R²)
mean                 281.1 μs   (276.0 μs .. 290.9 μs)
std dev              21.75 μs   (12.40 μs .. 34.62 μs)
variance introduced by outliers: 68% (severely inflated)
               
benchmarking alter delete absent
time                 243.8 μs   (240.8 μs .. 246.9 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 241.7 μs   (239.1 μs .. 245.8 μs)
std dev              11.08 μs   (8.402 μs .. 14.84 μs)
variance introduced by outliers: 43% (moderately inflated)

After:

benchmarking alter absent (-16%)
time                 237.9 μs   (234.0 μs .. 241.8 μs)
                     0.997 R²   (0.995 R² .. 0.999 R²)
mean                 237.0 μs   (232.9 μs .. 246.5 μs)
std dev              20.16 μs   (7.155 μs .. 32.95 μs)
variance introduced by outliers: 74% (severely inflated)
               
benchmarking alter insert (+290%)
time                 805.2 μs   (783.5 μs .. 828.1 μs)
                     0.990 R²   (0.979 R² .. 0.996 R²)
mean                 822.7 μs   (799.7 μs .. 872.8 μs)
std dev              110.8 μs   (59.63 μs .. 192.9 μs)
variance introduced by outliers: 84% (severely inflated)
               
benchmarking alter update (+250%)
time                 705.5 μs   (699.2 μs .. 712.7 μs)
                     0.997 R²   (0.995 R² .. 0.999 R²)
mean                 703.5 μs   (695.1 μs .. 716.0 μs)
std dev              34.66 μs   (25.56 μs .. 49.30 μs)
variance introduced by outliers: 41% (moderately inflated)
               
benchmarking alter delete (+255%)
time                 720.1 μs   (684.8 μs .. 766.0 μs)
                     0.987 R²   (0.978 R² .. 0.998 R²)
mean                 684.8 μs   (674.2 μs .. 702.9 μs)
std dev              46.01 μs   (28.89 μs .. 72.45 μs)
variance introduced by outliers: 57% (severely inflated)
               
benchmarking alter delete absent (-8%)
time                 225.4 μs   (223.0 μs .. 227.6 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 225.1 μs   (222.4 μs .. 231.9 μs)
std dev              13.72 μs   (6.353 μs .. 26.03 μs)
variance introduced by outliers: 58% (severely inflated)

It looks like in cases where we need to modify the spine anyways the extra pattern matches (despite being unboxed sums) on every Bin have quite a bit of overhead.

treeowl · 2018-02-01T19:41:47Z

Data/IntMap/Internal.hs

+alter# f !k t@(Bin p m l r)
+  | nomatch k p m = case f Nothing of
+                      Nothing -> (# (# #) | #)
+                      Just x -> (# | link k (Tip k x) p t #)


I don't think you're being strict enough with your return values.

Do you mean something like the following?

let !t' = link k (Tip k x) p t in (# | t' #)

Or something else?

Yes, but see the link below for a clearer way.

treeowl · 2018-02-01T19:46:34Z

Data/IntMap/Internal.hs

+--
+-- If no modifications are made to the map (# (# #) | #) is returned, otherwise
+-- (# | newMap #) is returned.
+alter# :: (Maybe a -> Maybe a) -> Key -> IntMap a -> (# (# #) | IntMap a #)


I think it's a decent bet that the function we're passed will be small enough to inline. So calling alter# with

\ p -> fromMaybe (f (toMaybe p))

(essentially) should usually avoid any actual Maybes. Or so I imagine.

treeowl · 2018-02-05T20:42:02Z

Take a look at treeowl@6f6fa02. That's not very well organized yet, and doesn't deal with the lazy version, but it shows how the strict version can be made more eager, how pattern synonyms can cut down dramatically on the noise, and how we can "unbox" the Maybes consumed and produced by the passed function. Feel free to incorporate those ideas into your branch if you wish.

treeowl · 2018-02-05T20:51:03Z

We need to benchmark the unboxed sums version against one that just uses pointer equality on each recursive call.

m-renaud · 2018-02-07T17:29:29Z

Thanks for the info David! I've been realllllly busy the last week, sorry I haven't had time to move this forward as quickly as I'd like. I'm hoping to get around to this sometime this week.

sjakobi · 2020-08-01T23:51:42Z

What's the status here, @m-renaud?

In light of #538 it would be nice to make progress on this.

m-renaud · 2020-08-10T02:22:24Z

Oh geeze it's been a while since I've looked at this, and realistically I won't have substantial time to look into this and incorporate the suggestion's from the comment above in the near future.

treeowl · 2020-08-10T02:24:11Z

I will try to look at it shortly. No guarantees.

sjakobi · 2021-04-29T14:08:30Z

I've marked this PR as a draft to clarify that it's not ready for final review / merge.

Boarders · 2021-09-10T15:43:16Z

Shall I try to take this up and finish it - anyone know what there is to do left?

treeowl · 2021-09-10T16:01:07Z

I don't remember,, sorry. Lost track of this one. Worth trying!

sjakobi · 2021-09-15T14:59:31Z

Shall I try to take this up and finish it

That would be great! :)

anyone know what there is to do left?

I think the minimum is to fix any strictness issues such as #523 (comment).

If the benchmarks show good improvements at this stage, I think this should be good to merge.

Adopting some of noise reduction measures suggested in treeowl@6f6fa02 probably wouldn't hurt though.

Regarding further optimization ideas, I think those could be left for further work.

Kleidukos · 2022-11-02T20:19:32Z

What is the status on this one? Are these benchmarks numbers still the same on GHC 9.2?

treeowl · 2022-11-02T20:37:24Z

What is the status on this one? Are these benchmarks numbers still the same on GHC 9.2?

Has something changed there?

Kleidukos · 2022-11-02T20:47:28Z

One can only hope, that's why I'm asking :)

treeowl · 2022-11-02T21:09:21Z

I haven't tested recently. Do you have time to try?

Kleidukos · 2022-11-02T21:51:05Z

No unfortunately I'm quite committed on Haddock and other stuff for the time being. How's the maintainer structure for containers going btw? Need any help on this front?

m-renaud added the performance label Feb 1, 2018

treeowl reviewed Feb 1, 2018

View reviewed changes

sjakobi added the IntMap label Jul 15, 2020

sjakobi mentioned this pull request Jul 15, 2020

Too restrictive type signatures #538

Open

sjakobi marked this pull request as draft April 29, 2021 14:07

Boarders mentioned this pull request Sep 10, 2021

Unboxed sums for performance (Discussion) #801

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize IntMap.alter using unboxed sums. #523

Optimize IntMap.alter using unboxed sums. #523

m-renaud commented Feb 1, 2018 •

edited

Loading

m-renaud commented Feb 1, 2018

treeowl Feb 1, 2018

m-renaud Feb 1, 2018

treeowl Feb 1, 2018

m-renaud commented Feb 1, 2018

treeowl Feb 1, 2018

m-renaud Feb 6, 2018

treeowl Feb 6, 2018

treeowl Feb 1, 2018

treeowl commented Feb 5, 2018

treeowl commented Feb 5, 2018

m-renaud commented Feb 7, 2018

sjakobi commented Aug 1, 2020

m-renaud commented Aug 10, 2020

treeowl commented Aug 10, 2020

sjakobi commented Apr 29, 2021

Boarders commented Sep 10, 2021

treeowl commented Sep 10, 2021

sjakobi commented Sep 15, 2021

Kleidukos commented Nov 2, 2022 •

edited

Loading

treeowl commented Nov 2, 2022

Kleidukos commented Nov 2, 2022

treeowl commented Nov 2, 2022

Kleidukos commented Nov 2, 2022

Optimize IntMap.alter using unboxed sums. #523

Are you sure you want to change the base?

Optimize IntMap.alter using unboxed sums. #523

Conversation

m-renaud commented Feb 1, 2018 • edited Loading

Benchmark results:

m-renaud commented Feb 1, 2018

treeowl Feb 1, 2018

Choose a reason for hiding this comment

m-renaud Feb 1, 2018

Choose a reason for hiding this comment

treeowl Feb 1, 2018

Choose a reason for hiding this comment

m-renaud commented Feb 1, 2018

treeowl Feb 1, 2018

Choose a reason for hiding this comment

m-renaud Feb 6, 2018

Choose a reason for hiding this comment

treeowl Feb 6, 2018

Choose a reason for hiding this comment

treeowl Feb 1, 2018

Choose a reason for hiding this comment

treeowl commented Feb 5, 2018

treeowl commented Feb 5, 2018

m-renaud commented Feb 7, 2018

sjakobi commented Aug 1, 2020

m-renaud commented Aug 10, 2020

treeowl commented Aug 10, 2020

sjakobi commented Apr 29, 2021

Boarders commented Sep 10, 2021

treeowl commented Sep 10, 2021

sjakobi commented Sep 15, 2021

Kleidukos commented Nov 2, 2022 • edited Loading

treeowl commented Nov 2, 2022

Kleidukos commented Nov 2, 2022

treeowl commented Nov 2, 2022

Kleidukos commented Nov 2, 2022

m-renaud commented Feb 1, 2018 •

edited

Loading

Kleidukos commented Nov 2, 2022 •

edited

Loading