JIT: Enable compaction of all `BBJ_ALWAYS` blocks #103785

amanasifkhalid · 2024-06-20T21:32:13Z

Part of #93020. Removes every reference to `bbNext` in block compaction; instead, each block is considered for compaction with its jump target, regardless of the two blocks' relative positions in the block list. This can churn the flowgraph significantly, but as long as the churn happens before block layout, we can rely on `optOptimizeLayout` to produce a sensible layout. However, `fgUpdateFlowGraph` can also run after blocks have been reordered (for example, during lowering), so I updated its signature to control whether compaction of non-contiguous blocks is enabled, which avoids changing the block layout at that point.
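To make the idea concrete, here is a minimal sketch of compacting a block with its jump target rather than with its lexical successor. It is a toy model, not the JIT's code: `Block`, `JumpKind`, `canCompact`, `compactAll`, and the `allowNonContiguous` parameter are all invented for illustration, and the real `fgUpdateFlowGraph` parameter name is not shown in this PR text.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not the JIT's BasicBlock.
enum class JumpKind { Always, Return };

struct Block
{
    std::string code;                       // stand-in for the block's statements
    JumpKind    kind    = JumpKind::Return;
    int         target  = -1;               // index of the jump target when kind == Always
    int         preds   = 0;                // number of predecessors
    bool        removed = false;
};

// A block can absorb its jump target if it jumps there unconditionally and
// is the target's only predecessor. When allowNonContiguous is false, the
// target must also be the next block in the list (the pre-change behaviour,
// as modelled by this sketch).
bool canCompact(const std::vector<Block>& blocks, int b, bool allowNonContiguous)
{
    const Block& block = blocks[b];
    if (block.removed || block.kind != JumpKind::Always || block.target < 0)
        return false;
    const int t = block.target;
    if (t == b || blocks[t].removed || blocks[t].preds != 1)
        return false;
    return allowNonContiguous || (t == b + 1);
}

// Merge the jump target into block b and mark the target as removed.
void compact(std::vector<Block>& blocks, int b)
{
    Block& block  = blocks[b];
    Block& target = blocks[block.target];
    block.code += target.code;
    block.kind   = target.kind;
    block.target = target.target;
    target.removed = true;
}

// Repeatedly compact each block with its jump target; returns the merge count.
int compactAll(std::vector<Block>& blocks, bool allowNonContiguous)
{
    int merged = 0;
    for (int i = 0; i < static_cast<int>(blocks.size()); i++)
    {
        while (canCompact(blocks, i, allowNonContiguous))
        {
            compact(blocks, i);
            merged++;
        }
    }
    return merged;
}
```

In this model, a block that jumps over an unrelated block to its single-predecessor target is only merged when `allowNonContiguous` is true, which mirrors why a caller running after layout would want the flag off.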

The logic for updating profile data during compaction looked like it could be simplified, though I'm not sure if my change is overly simplistic; this change seems to have churned fgComputeBlockWeights, hence the large PerfScore diffs. Diffs as a whole are dramatic, though they're inflated largely by libraries_tests.

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Thanks!

dotnet-policy-service · 2024-06-20T21:32:50Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

AndyAyersMS

Interesting (and impressive) that this has so much impact.

Can you see if this further improves the case in #4324? RBO was blocked there because of un-compacted blocks.

AndyAyersMS · 2024-06-21T15:28:45Z

src/coreclr/jit/fgopt.cpp

-            noway_assert((block->bbWeight == BB_ZERO_WEIGHT) || (bNext->bbWeight == BB_ZERO_WEIGHT));
-            block->bbSetRunRarely();
-        }
+    if (hasProfileWeight)


inheritWeight already takes care of setting the flag.

I was running into cases where we'd compact a block with BBF_PROF_WEIGHT set, with a block without the flag, so inheritWeight would unset the flag. I'm guessing we'd want to keep this flag set in such cases, so I added this workaround.

Ok. If you remember the cases it might be worth seeing how we end up with a mixture of profiled and unprofiled blocks. Would be nice to systematically reduce how often this happens.

I can take another look, but I first noticed this when we would compact a block with profile data with an internal block.
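The workaround discussed in this thread can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `fgopt.cpp`: `BlockWeight`, its fields, and `mergeWeights` are hypothetical, and which block's weight survives the merge is chosen arbitrarily here. The only point it shows is remembering whether either block was profiled before `inheritWeight` overwrites the flag.

```cpp
#include <cassert>

// Hypothetical stand-in for a block's profile state; not the JIT's types.
struct BlockWeight
{
    double weight           = 0.0;
    bool   hasProfileWeight = false;   // models the BBF_PROF_WEIGHT flag

    // Models the behaviour described above: inheriting a weight also copies
    // the profile flag, so it can clear a flag that was previously set.
    void inheritWeight(const BlockWeight& other)
    {
        weight           = other.weight;
        hasProfileWeight = other.hasProfileWeight;
    }
};

// Merge target's weight into block, keeping the profile flag set if either
// block carried profile data before the merge.
void mergeWeights(BlockWeight& block, const BlockWeight& target)
{
    const bool hasProfileWeight = block.hasProfileWeight || target.hasProfileWeight;
    block.inheritWeight(target);
    if (hasProfileWeight)
    {
        block.hasProfileWeight = true;
    }
}
```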

amanasifkhalid · 2024-06-21T16:04:37Z

> Can you see if this further improves the case in #4324? RBO was blocked there because of un-compacted blocks.

Codegen is identical with and without this change; not sure how much of this is driven by weird block layout decisions (without PGO data, ordering conditional blocks' successors is pretty arbitrary), but IG07 suggests there's room to improve?

G_M31901_IG01:  ;; offset=0x0000
       push     rbx
       sub      rsp, 32
       mov      ebx, edx
                                                ;; size=7 bbWeight=1 PerfScore 1.50
G_M31901_IG02:  ;; offset=0x0007
       mov      rcx, gword ptr [rcx+0x08]
       mov      rcx, gword ptr [rcx+0x08]
       test     rcx, rcx
       je       SHORT G_M31901_IG04
                                                ;; size=13 bbWeight=1 PerfScore 5.25
G_M31901_IG03:  ;; offset=0x0014
       mov      r9d, dword ptr [rcx+0x10]
       test     r9d, r9d
       je       SHORT G_M31901_IG06
       mov      rcx, gword ptr [rcx+0x08]
       mov      edx, ebx
       xor      r8d, r8d
       call     [System.Array:IndexOf[int](int[],int,int,int):int]
       mov      ecx, eax
       not      ecx
       shr      ecx, 31
       jmp      SHORT G_M31901_IG07
                                                ;; size=33 bbWeight=0.50 PerfScore 5.88
G_M31901_IG04:  ;; offset=0x0035
       mov      eax, ebx
                                                ;; size=2 bbWeight=0.50 PerfScore 0.12
G_M31901_IG05:  ;; offset=0x0037
       add      rsp, 32
       pop      rbx
       ret
                                                ;; size=6 bbWeight=0.50 PerfScore 0.88
G_M31901_IG06:  ;; offset=0x003D
       xor      ecx, ecx
                                                ;; size=2 bbWeight=0.50 PerfScore 0.12
G_M31901_IG07:  ;; offset=0x003F
       test     ecx, ecx
       je       SHORT G_M31901_IG04
       xor      eax, eax
                                                ;; size=6 bbWeight=0.50 PerfScore 0.75
G_M31901_IG08:  ;; offset=0x0045
       add      rsp, 32
       pop      rbx
       ret
                                                ;; size=6 bbWeight=0.50 PerfScore 0.88

; Total bytes of code 75, prolog size 5, PerfScore 15.38, instruction count 29, allocated bytes for code 75 (MethodHash=031d8362) for method Visitor:Test(int):int:this (FullOpts)
; ============================================================

AndyAyersMS · 2024-06-21T16:11:22Z

> IG07 suggests there's room to improve?

Yeah, IG06/IG07 is a missed case for RBO. There's no need to retest a value we've just set to zero.
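The shape of the missed case can be modelled in a few lines of C++. This is a reading of the assembly listing above, not the original source for `Visitor:Test`: the function and parameter names are invented, and the null check in IG02 is left out. `before` mirrors the emitted code, where the IG06 path sets the flag to zero and then falls into the IG07 test; `after` shows what threading that path straight to the return would look like.

```cpp
#include <cassert>

// Mirrors IG03/IG06/IG07: on the count == 0 path the flag is set to zero
// and then immediately retested.
int before(int count, int indexOfResult, int x)
{
    unsigned found;
    if (count == 0)
    {
        found = 0;                                                // IG06: xor ecx, ecx
    }
    else
    {
        found = (~static_cast<unsigned>(indexOfResult)) >> 31;   // 1 when result >= 0
    }

    if (found == 0)                                               // IG07: test ecx, ecx
    {
        return x;                                                 // IG04/IG05
    }
    return 0;                                                     // IG08
}

// Same behaviour with the redundant test removed on the count == 0 path.
int after(int count, int indexOfResult, int x)
{
    if (count == 0)
    {
        return x;                                                 // flag known to be zero here
    }

    const unsigned found = (~static_cast<unsigned>(indexOfResult)) >> 31;
    if (found == 0)
    {
        return x;
    }
    return 0;
}
```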

AndyAyersMS · 2024-06-25T16:29:41Z

Regressions:

[Perf] Windows/x64: 33 Regressions on 6/21/2024 6:31:57 PM #103972

Improvements:

[Perf] Linux/x64: 7 Improvements on 6/21/2024 6:31:57 PM perf-autofiling-issues#36893
[Perf] Windows/x64: 2 Improvements on 6/21/2024 6:31:57 PM perf-autofiling-issues#37001 (maybe)

amanasifkhalid added 7 commits June 17, 2024 21:36

Use editing iterator

db90959

Enable compaction of block target

76f4d29

Merge from main

f0c2ef9

Rename

5e02f85

Clean up asserts

dd9bbf9

Fix profile data transfer

fe95cba

Don't churn flowgraph with compaction after layout

483e473

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jun 20, 2024

dotnet-policy-service bot assigned amanasifkhalid Jun 20, 2024

Fix compacting call-finally pairs

9250cc8

amanasifkhalid marked this pull request as ready for review June 21, 2024 14:50

AndyAyersMS approved these changes Jun 21, 2024

amanasifkhalid merged commit 3b14b0c into dotnet:main Jun 21, 2024
107 checks passed

amanasifkhalid deleted the compact-blocks branch June 21, 2024 16:15

rzikm pushed a commit to rzikm/dotnet-runtime that referenced this pull request Jun 24, 2024

JIT: Enable compaction of all BBJ_ALWAYS blocks (dotnet#103785)

05dfa59

jakobbotsch mentioned this pull request Jun 24, 2024

Test failure: GC/LargeMemory/Regressions/largearraytest/largearraytest.sh #103875

Closed

DrewScoggins mentioned this pull request Jun 25, 2024

[Perf] Windows/x64: 33 Regressions on 6/21/2024 6:31:57 PM #103972

Open

This was referenced Jun 25, 2024

[Perf] Windows/x64: 2 Improvements on 6/21/2024 6:31:57 PM dotnet/perf-autofiling-issues#37001

Closed

[Perf] Windows/arm64: 5 Improvements on 6/22/2024 1:11:03 AM dotnet/perf-autofiling-issues#37160

Closed

amanasifkhalid mentioned this pull request Jul 10, 2024

JIT: Straighten out flow during early jump threading #104603

Merged

amanasifkhalid mentioned this pull request Jul 17, 2024

JIT: Don't do aggressive block compaction too early #105041

Closed

github-actions bot locked and limited conversation to collaborators Jul 26, 2024
