Monomorphize dropped functions #6734

kripken · 2024-07-11T22:55:13Z

We now consider a drop to be part of the call context: If we see

(drop
  (call $foo)
)

(func $foo (result i32)
  (i32.const 42)
)

Then we'd monomorphize to this:

(call $foo_1)  ;; call the specialized function instead

(func $foo_1   ;; the specialized function returns nothing
  (drop        ;; the drop was moved into here
    (i32.const 42)
  )
)

With the drop now in the called function, we may be able to optimize out unused work.

Refactor a bit of code out of DAE that we can reuse here, into a new return-utils.h.

kripken · 2024-07-11T23:06:40Z

Oops, the fuzzer quickly found that the old code I was so happy to reuse does not support return calls yet. DAE must handle those some other way. Should be an easy fix though.

tlively · 2024-07-11T23:35:34Z

src/ir/return-utils.h

+ curr->finalize();
+ refinalize = true;
+
+ replaceCurrent(Builder(*getModule()).makeDrop(curr));


This needs to be followed by a return to avoid going on to execute subsequent code that should have been dead.

Right, thanks, the semantics include the return 😄 ... Fixed.

Hmm, and now that I think about it, this also needs to do the thing where it breaks out of the function body before doing the call to make sure exceptions propagate correctly.

Good point. And actually, thinking more on this, we should not be changing return calls here. This is not like inlining where we remove a call, so removing the return part of a return call is ok - here we'd just be removing the return part in a copy of the function. So we could break programs by blowing up the stack.

To fix that, I made the pass not monomorphize drops when they'd cause us to need to remove a return call's return.

I don't understand why we shouldn't optimize return calls in monomorphized bodies. If you know the caller drops the result, you can change a return call in the callee to be a branch + call + drop sequence, which can allow further optimizations like monomorphizing the former return-callee now that we see that its result is also dropped.

It can definitely help in some cases, yeah, but it can break others. Imagine that A calls B, which calls A, and so on, in some very deep chain before it stops. If we monomorphize them both, then even if we save 90% of the work inside them, once the calls are no longer return calls we could use enough stack to overflow.

Oh, I see. FWIW, inlining also has this problem because the frame sizes increase with the depth of inlining, even through tail calls. I don't think that's enough reason to avoid optimizing, though.

I worry less about frame sizes increasing, as that's generally going to be a "moderate" factor, like maybe the frame doubles in size. Maybe it even grows 10x somehow? But if we turn tail calls into normal calls for a language that does loops using tail calls, then we can turn what were 1,000,000 loop iterations into 1,000,000 nested calls, can't we? 😱
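To make the concern concrete, here is a minimal toy model of frame depth, not Binaryen code. The function name `peakFrames` and the one-frame-per-call accounting are assumptions made purely for illustration.

```cpp
#include <cstddef>

// Toy model (hypothetical, not part of Binaryen): a loop of `iterations`
// steps is expressed as a chain of self-calls. A tail call reuses the
// current frame, while a normal call pushes a new frame that stays live
// until the whole chain unwinds.
std::size_t peakFrames(std::size_t iterations, bool tailCalls) {
  std::size_t depth = 1; // the frame of the initial call
  std::size_t peak = 1;
  for (std::size_t i = 0; i < iterations; ++i) {
    if (!tailCalls) {
      ++depth; // a normal call nests one frame deeper
    }
    // With a tail call the frame is replaced, so depth is unchanged.
    if (depth > peak) {
      peak = depth;
    }
  }
  return peak;
}
```

In this model a doubled frame size multiplies stack use by a constant, whereas losing the tail calls makes it grow linearly with the iteration count.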

Hmm, I guess if we wanted to avoid that problem, we would want to make the monomorphization of a function containing a return call contingent on the monomorphization of the return-callee as well. So the entire return-call graph reachable from the original monomorphized function would be monomorphized as a unit.
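As a rough sketch of what "monomorphized as a unit" would require, the set of functions involved is the closure of the original function under return-call edges. The graph representation and the name `returnCallClosure` below are hypothetical and are not taken from the pass.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: maps each function name to the functions it
// return-calls. Normal calls are deliberately left out of this graph.
using ReturnCallGraph = std::map<std::string, std::vector<std::string>>;

// Collects every function reachable from `root` through return calls. Under
// the idea above, all of these would need a dropped-result variant together,
// so that every return call inside the unit can stay a return call.
std::set<std::string> returnCallClosure(const ReturnCallGraph& graph,
                                        const std::string& root) {
  std::set<std::string> seen{root};
  std::vector<std::string> work{root};
  while (!work.empty()) {
    std::string curr = work.back();
    work.pop_back();
    auto it = graph.find(curr);
    if (it == graph.end()) {
      continue; // this function makes no return calls
    }
    for (const auto& callee : it->second) {
      if (seen.insert(callee).second) {
        work.push_back(callee);
      }
    }
  }
  return seen;
}
```

For the mutually recursive A and B described earlier in this thread, the closure of A contains both functions, so neither would be specialized without the other.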

Yeah, I think that's right.

Another option might be to only monomorphize in cases where we shrink the called function so much that it will get inlined immediately, which eliminates a call.

tlively · 2024-07-11T23:42:24Z

test/lit/passes/monomorphize-drop.wast

+ ;; This is monomorphized in ALWAYS (as the drop means the call context is
+ ;; not trivial), but not in CAREFUL mode (as there is no significant benefit
+ ;; from optimizations on the monomorphized function - the import is opaque
+ ;; work we cannot do anything with, even if we see it is handed consts).


Is precomputing the xor and add not enough of a benefit?

Our cost analysis may not be precise enough, but we have this:

(call $import
  (i32.xor
    (local.get $0)
    (local.get $1)
  )
  (i32.add
    (local.get $0)
    (local.get $1)
  )
)

vs

(drop
  (call $import
    (i32.const 7)
    (i32.const 7)
  )
)

The call is the same. local.get has 0 cost (we assume it might already be in a register), so that vanishes. drop also has cost 0. And we give consts and super-simple math operations like i32.add a cost of 1, so they end up the same.
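The arithmetic can be checked with a small toy cost model. This is only a sketch that uses the per-operation costs stated above; the `Expr` type, the helper names, and the value of `kCallCost` are assumptions for illustration and do not reflect Binaryen's actual cost analysis code.

```cpp
#include <vector>

enum class Op { LocalGet, Const, Add, Xor, Drop, Call };

// Hypothetical minimal expression tree.
struct Expr {
  Op op;
  std::vector<Expr> children;
};

// Assumed placeholder: the real cost of a call is not stated in this thread.
// It appears on both sides of the comparison, so its value does not matter.
constexpr int kCallCost = 10;

// Sums the cost of a node and all of its children.
int cost(const Expr& e) {
  int total = 0;
  switch (e.op) {
    case Op::LocalGet:
    case Op::Drop:
      total = 0; // treated as free
      break;
    case Op::Const:
    case Op::Add:
    case Op::Xor:
      total = 1; // consts and very simple math cost the same
      break;
    case Op::Call:
      total = kCallCost;
      break;
  }
  for (const auto& child : e.children) {
    total += cost(child);
  }
  return total;
}

// (call $import (i32.xor (local.get) (local.get)) (i32.add (local.get) (local.get)))
Expr originalBody() {
  Expr get{Op::LocalGet, {}};
  return Expr{Op::Call, {Expr{Op::Xor, {get, get}}, Expr{Op::Add, {get, get}}}};
}

// (drop (call $import (i32.const 7) (i32.const 7)))
Expr monomorphizedBody() {
  Expr c{Op::Const, {}};
  return Expr{Op::Drop, {Expr{Op::Call, {c, c}}}};
}
```

Both bodies come out to `kCallCost + 2`, which is why CAREFUL mode sees no benefit in this test case.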

Oh interesting, consts and simple math having the same cost means that we don't prefer replacing math with consts. Could the code size difference tip the scales if it were large enough?

We'd need to also measure code size for that, which it looks like we're not doing. In this case, I think that is the right outcome actually: We are duplicating code, so even if we shrink the function we are still growing the total size. So this only makes sense if we actually remove work.

Though in general I think you're right, it's not obvious that considering consts and simple math as the same is right. I think we did simple measurements a decade ago and that seemed ok, but I'm not sure how accurate those were then, much less now... worth revisiting.

kripken · 2024-07-12T19:17:55Z

The compile error here happens on clang but not gcc... I'm not sure yet which of them is right.

kripken · 2024-07-12T19:21:29Z

Actually neither... the issue was using std::unordered_map in a way that differed between C++ STL impls (yikes). I worked around it.

tlively

LGTM, since we can leave optimization of return calls to a follow-up if we decide to move forward with that.

One thing that might help test readability, though, would be to use separate modules for the separate cases to avoid the monomorphized functions all gathering separately at the bottom of the file.

kripken · 2024-07-12T22:52:24Z

Good point, thanks. I split them up more now. The cost is repeating a few imports and such but reading it now it definitely looks clearer, with less distance from the function to its monomorphized version.

kripken added 16 commits July 11, 2024 11:14

drop

fd6ab9d

yolo

170a6f0

yolo

206339e

yolo

79b695e

yolo

6595096

test+fix

35a2483

test+fix

ba38860

test

ed76396

test

d64ef3a

test

16da290

work

81661bc

Merge remote-tracking branch 'origin/main' into monodrop

75c264b

work

44b20cc

comment

6047297

fix

d460fba

fix

923387f

kripken requested a review from tlively July 11, 2024 22:55

kripken added 2 commits July 11, 2024 16:25

handle return_call*

9c67aa8

format

b04fdc3

tlively reviewed Jul 11, 2024

kripken added 9 commits July 11, 2024 16:49

add missing returns

cc1356f

format

5ea3675

rework return_calls

4d6f9b4

builds

635e91d

format

48e1aaa

whoops

7ed19e8

work

099872a

tests

6a5419f

test.work

561714f

kripken added 5 commits July 12, 2024 11:18

test.work

403fdac

fix.test

ce13223

fix

c6a653e

work

021b396

test

4efae3d

Avoid a libc++ compiler error

b8b5fa7

kripken added 2 commits July 12, 2024 12:27

format

916e73f

todo

6a0cbac

tlively approved these changes Jul 12, 2024

modularize test

339c96c

kripken merged commit d2a48af into WebAssembly:main Jul 12, 2024
13 checks passed

kripken deleted the monodrop branch July 12, 2024 23:15
