JIT: Set bbJumpKind and bbJumpDest during block initialization #93415

amanasifkhalid · 2023-10-12T20:36:53Z

Followup to #93152. This refactor enforces new invariants on BasicBlock's bbJumpKind and bbJumpDest. In particular, whenever bbJumpKind is a kind that must have a jump target, bbJumpDest must be set, else bbJumpDest must be null. This means bbJumpKind and bbJumpDest must be simultaneously initialized/updated when creating/converting a jump block; previously, we initialized blocks with their jump kind specified, and later set their jump targets accordingly.

ghost · 2023-10-12T20:37:05Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Followup to #93152. This refactor enforces new invariants on BasicBlock's bbJumpKind and bbJumpDest. In particular, whenever bbJumpKind is a kind that must have a jump target, bbJumpDest must be set, else bbJumpDest must be null. This means bbJumpKind and bbJumpDest must be simultaneously initialized/updated when creating/converting a jump block; previously, we initialized blocks with their jump kind specified, and later set their jump targets accordingly.

Author:	amanasifkhalid
Assignees:	amanasifkhalid
Labels:	`area-CodeGen-coreclr`
Milestone:	-

amanasifkhalid · 2023-10-12T20:38:31Z

I'm waiting to rebase on top of #93377. Then I'll open for review.

amanasifkhalid · 2023-10-16T22:10:33Z

No asmdiffs. TP diffs on Linux x64 and Windows x86. The regressions for the latter seem a bit dramatic to be attributed to the additional bbJumpDest = nullptr instructions; any ideas?

amanasifkhalid · 2023-10-16T23:28:31Z

superpmi-replay failure is #93527

amanasifkhalid · 2023-10-17T14:55:35Z

cc @dotnet/jit-contrib , @AndyAyersMS PTAL.

AndyAyersMS

Overall this looks good.

For throughput, it would be good to do the next level of investigation. I can't remember if we do these lab TP runs with PGO-enhanced release builds, but if we do it's also possible the PGO data we have becomes stale with your changes. You can try and build non-pgo baseline and diff locally and run your own TP diffs perhaps.

There are some pin based tools you can try and use to pin down where the extra time is going. Let me see if I can dig up a pointer to this for you (or maybe someone else on @dotnet/jit-contrib can do this quicker than I can).

Thought for possible follow-up: I wonder if there is some better way to set a "temporary" jump target that ensures that later on we actually have remembered to change it.

AndyAyersMS · 2023-10-17T16:50:03Z

src/coreclr/jit/importer.cpp

-                callBlock->SetJumpKind(BBJ_CALLFINALLY DEBUG_ARG(this)); // convert the BBJ_LEAVE to BBJ_CALLFINALLY
+
+                assert(callBlock->HasJump());
+                fgRemoveRefPred(callBlock->GetJumpDest(), callBlock);


Changes to impImportLeave always make me nervous.

Likewise -- this was the trickiest of my changes to get right. Most of my debug time was spent on one failing method in the CoreCLR tests supermi collection that exercised an unusual path (the bug in that case was caused by initializing new blocks with jumps as BBJ_NONE while they don't have a jump target, thus triggering a different path in fgNewBBInRegion; changing this approach to use the actual jump kind and a temporary jump target fixed that).

EgorBo · 2023-10-17T17:09:12Z

I can't remember if we do these lab TP runs with PGO-enhanced release builds,

We disable native PGO for these runs so it should be ok, It's interesting that TP is actually improved for MSVC ones.

AndyAyersMS · 2023-10-17T17:51:54Z

For throughput, it would be good to do the next level of investigation.

I would look at the windows x86 results in particular, those should be the easiest to investigate. Note that the "coreclr tests" collection includes a lot of unusual code, so it may just be one particular code shape that's causing this.

amanasifkhalid · 2023-10-17T17:54:04Z

Thought for possible follow-up: I wonder if there is some better way to set a "temporary" jump target that ensures that later on we actually have remembered to change it.

How about some public static BasicBlock member for this that we can pass to SetJumpKindAndTarget, like static BasicBlock* tempJumpDest = 0xDEADBEEF? Then we can assert the jump target isn't equal to this before reading it, like in GetJumpDest(), HasJumpTo, etc. Though I'm not sure if a hard-coded pointer value is the best approach here...

AndyAyersMS · 2023-10-17T18:07:01Z

Thought for possible follow-up: I wonder if there is some better way to set a "temporary" jump target that ensures that later on we actually have remembered to change it.

How about some public static BasicBlock member for this that we can pass to SetJumpKindAndTarget, like static BasicBlock* tempJumpDest = 0xDEADBEEF? Then we can assert the jump target isn't equal to this before reading it, like in GetJumpDest(), HasJumpTo, etc. Though I'm not sure if a hard-coded pointer value is the best approach here...

One idea to explore is to just allocate some special block (call it BB0) that can be used as a sentinel and a temp target, and then (say in post-phase diagnostics) we could verify no jump target targets that block. It turns out we already do some things like this, see for example SsaBuilder::SetupBBRoot, and there are other potential uses; for instance, it could serve as an anchor for both fgFirstBB and fgLastBB, and it could be the def block for uninitialized SSA / parms.

However we'd need to be careful that this block doesn't complicate life elsewhere.

AndyAyersMS

bbTempJumpDest seems like a nice option here. Thanks.

Any insight into the TP impact yet?

amanasifkhalid · 2023-10-18T15:13:10Z

Any insight into the TP impact yet?

Not much yet. I've identified the method contexts in the CoreCLR tests collection with the largest TP diffs when targeting x86, but I don't know of any way to find the actual methods/tests corresponding to those contexts. I see under runtime/artifacts/spmi/pintools/1.0/windows/intel64/bin, there is pinjitprofiling.dll. I don't know what it exactly does -- I can't find any documentation on it, and I cannot run it on my win-x64 box (for some reason, it won't open vcruntime140.dll, regardless of the Visual C++ runtime installed; trying this on another machine to see if it repros) -- but I imagine it might reveal information more useful than just total instructions executed.

@AndyAyersMS do you have any suggestions for what tooling might be useful here? Thank you!

Edit: @jakobbotsch shared a pin tool with me that can profile JIT functions. Trying it now...

amanasifkhalid · 2023-10-18T23:14:22Z

@AndyAyersMS So it looks like about 30% of the instruction diff on Windows x86 comes from calls to Compiler::bbNewBasicBlock, and another 20% comes from Compiler::impImportLeave. I think the bbNewBasicBlock increase comes from the fact that it now takes two function calls (e.g. we call the version that takes in a jump kind and jump target, which calls the version which takes no arguments and actually creates the block) instead of one, and the no-arg bbNewBasicBlock is probably too big to be inlined. I can try refactoring it locally to see if we can reduce this diff, though that amount of code duplication probably won't be acceptable to check in.

As for impImportLeave, I'm guessing the diff comes from setting a new block's jump target to null (in Release builds) during initialization, and then setting it to its actual jump later (whereas we previously didn't touch the jump target at all during initialization). I can try to refactor the methods that create blocks to not bother setting the jump target at all in release builds if the jump target passed in is null. If we can get rid of the diff this way, would we be okay introducing all the additional conditional compilation, plus giving up the invariant that bbJumpDest == null for blocks without jumps in Release builds? We don't currently rely on that invariant outside of asserts, so I don't think this would break anything in Release.

Also, the diff in total instructions executed is only +0.09% for me when replaying the CoreCLR tests collection locally. Not sure why it's so much bigger on Helix machines...

AndyAyersMS · 2023-10-18T23:17:34Z

Thanks for digging in. I'm ok with taking this change as is, since the TP diff is 32 bit and only in coreclr_tests which has some very odd methods.

amanasifkhalid · 2023-10-18T23:24:37Z

superpmi-replay failure is #93527.

jakobbotsch · 2023-10-23T07:11:11Z

I just came across bbTempJumpDest while merging up some stuff, and I really do not think that is a good idea. It introduces a semantic difference between debug/checked and release builds, where some blocks now have successors only in debug/checked builds. Even if that's only for a short while that makes me nervous. In the cases that require this I would much rather see the first BB created with a different kind until they can be linked together (e.g. insert it as BBJ_NONE, switch it to BBJ_COND after).

It does raise another question: if we actively have to fight the checking, then what is the benefit of having it? It seems like it just creates usability problems. I do not recall seeing us having bugs previously around "half baked flow graphs", and I would much rather allow us to build flow graph structures naturally and then only check them once they are completed (e.g. at the end of the phase, as a normal post-phase check).

amanasifkhalid · 2023-10-23T14:25:37Z

It introduces a semantic difference between debug/checked and release builds

I enabled bbTempJumpDest in Debug/Checked builds only since the asserts that check the validity of the jump target won't run in Release builds anyway, though I share some of your discomfort with the differences between the builds. I can remove the conditional compilation around this.

I would much rather see the first BB created with a different kind until they can be linked together (e.g. insert it as BBJ_NONE, switch it to BBJ_COND after).

I initially took this approach, but this introduced subtle bugs in a few places. For example, in Compiler::impImportLeave, sometimes we would create a BBJ_ALWAYS without its jump target existing yet, so we would initialize it as BBJ_NONE. But before converting it to BBJ_ALWAYS, we would create and insert a new block after it, and the method that inserts the new block would run some conditional logic based on the fact that the previous block "falls through" (even though we intend for it to be a BBJ_ALWAYS). So temporarily setting the block to BBJ_NONE didn't work in a few exceptional cases.

I'm also cognizant of the fact that we plan to either significantly restrict our usage of BBJ_NONE before block reordering or remove it altogether (see here), so I'm hesitant to introduce more usages of BBJ_NONE that may have to be removed in the near future. I guess we could introduce another jump kind (like some sort of BBJ_UNINIT), but that only reaffirms your next point...

It does raise another question: if we actively have to fight the checking, then what is the benefit of having it?

I certainly agree with you that these weird cases aren't elegant with my recent changes. I take solace in the fact that these cases are the exception, and not the norm -- if I recall correctly, we're only using bbTempJumpDest in a few places. As for potential bugs around our flow graph code, the only inconsistency I found was our nulling of bbJumpDest for blocks that are converted from a kind with jumps to a kind without; sometimes we'd set bbJumpDest to null, and other times we wouldn't. This seemed like it could be problematic because we have a few places in the JIT where we check if the jump target is null (rather than if the jump kind is valid) in conditional statements (not just asserts), but this didn't seem to cause issues before. With the upcoming churn from the block reordering changes, perhaps these checks will prove useful. But maybe the benefit we get from these checks during large, infrequent refactors don't justify the long-term usability challenges.

@AndyAyersMS what do you think?

AndyAyersMS · 2023-10-23T15:21:38Z

One of the hopes here is that we can eventually automate most of the ref count and pred list maintenance. So it seems preferable to not to create something with implicitly incorrect flow and then fix it up; instead we should have some transient "incomplete" state with obviously incorrect flow that must resolved.

I guess we could introduce another jump kind (like some sort of BBJ_UNINIT), but that only reaffirms your next point...

Jakob would you find this more palatable? It would not be debug-only, would not be allowed at end of phase, and most of the code that checks block kinds should already have default/error cases for unexpected kinds, so perhaps this would be more robust?

jakobbotsch · 2023-10-23T15:46:38Z

I certainly agree with you that these weird cases aren't elegant with my recent changes. I take solace in the fact that these cases are the exception, and not the norm -- if I recall correctly, we're only using bbTempJumpDest in a few places.

Maybe I was just unlucky, but when merging I ended up in one of these cases.

I like the BBJ_UNINIT idea, it definitely seems more robust (and better than utilizing BBJ_NONE). Though would it help in that impImportLeave case you identified @amanasifkhalid?

I do think a more ideal solution would be to separate the notion of "building" from the notion of "adding". Building new blocks should not have any checks, while the checking/state computations can be made when adding the result of the builder to the flow graph. Not sure how easy it is to design such a graph builder class that can support all of our patterns, though.

amanasifkhalid · 2023-10-23T15:53:41Z

Though would it help in that impImportLeave case you identified @amanasifkhalid?

I think so. I imagine we'd only use BBJ_UNINIT for blocks we intend to have a jump, so we could have BasicBlock::bbFallsThrough return false for it.

I do think a more ideal solution would be to separate the notion of "building" from the notion of "adding".

Maybe we could have Compiler::bbNewBasicBlock set the jump kind and target without any checks (though we might have to refactor bbNewBasicBlock to be a member of BasicBlock so it can access its private members). Then if we just want to modify the jump kind/target, we'll have to use BasicBlock::SetJumpKindAndTarget, which does the checks. How does that sound?

Edit: On second thought, I don't think BBJ_UNINIT will work for the edge cases I ran into. Sometimes we're creating a BBJ_ALWAYS that we don't know the jump target for yet, and other times we're creating a BBJ_COND, so we don't have enough information to determine if BBJ_UNINIT should fall through or not. We could add two versions of BBJ_UNINIT, but I think that starts to get messy. I'm thinking of doing something like Jakob's suggestion by removing the checks in bbNewBasicBlock, and just doing the checks any time we read or write to the jump kind or target after initialization. So if we never initialize the jump target for a block that needs one, we'll assert as soon as we try to use the jump target.

...and remove BasicBlock::bbTempJumpDest, per discussion in #93415. We still assert the jump target is set appropriately whenever it is read/written, and in the majority of cases, we still initialize blocks with their jump kind and target set simultaneously. This change improves usability for the few edge cases (like in Compiler::impImportLeave) where a block's jump target isn't known at initialization.

...and remove BasicBlock::bbTempJumpDest, per discussion in dotnet#93415. We still assert the jump target is set appropriately whenever it is read/written, and in the majority of cases, we still initialize blocks with their jump kind and target set simultaneously. This change improves usability for the few edge cases (like in Compiler::impImportLeave) where a block's jump target isn't known at initialization.

ghost assigned amanasifkhalid Oct 12, 2023

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 12, 2023

This was referenced Oct 13, 2023

Test_EventSource_EtwManifestGeneration* tests failing in CI #48798

Closed

System.Net.Quic.Tests.MsQuicRemoteExecutorTests.SslKeyLogFile_IsCreatedAndFilled fails #93404

Closed

amanasifkhalid closed this Oct 13, 2023

Set jump kind and target during block creation

a94c0a6

amanasifkhalid reopened this Oct 16, 2023

amanasifkhalid force-pushed the bbjump branch from 49846ea to a94c0a6 Compare October 16, 2023 17:43

amanasifkhalid added 3 commits October 16, 2023 13:47

Fix ambiguous call

d558827

Fix SetJumpKind() name

014bd3d

Missed a SetJumpDest() call to convert

966cea0

amanasifkhalid marked this pull request as ready for review October 16, 2023 21:56

amanasifkhalid requested a review from AndyAyersMS October 17, 2023 14:58

amanasifkhalid mentioned this pull request Oct 17, 2023

JIT: Flow Graph Modernization and Improved Block Layout #93020

Closed

51 tasks

AndyAyersMS reviewed Oct 17, 2023

View reviewed changes

amanasifkhalid added 2 commits October 17, 2023 19:10

Add temporary jump target

bdc156d

Fix assert

b5a6d0a

AndyAyersMS reviewed Oct 18, 2023

View reviewed changes

Fix merge conflict

8975aac

build-analysis bot mentioned this pull request Oct 18, 2023

Build crashes in System.Runtime.Serialization.SerializationGuard #92007

Closed

build-analysis bot mentioned this pull request Oct 18, 2023

[mono][tvos] OOM in System.IO.Tests.MemoryStreamTests #92467

Closed

Remove unrelated files

539ac85

AndyAyersMS approved these changes Oct 18, 2023

View reviewed changes

amanasifkhalid merged commit 3a3e1c7 into dotnet:main Oct 18, 2023
127 of 129 checks passed

amanasifkhalid deleted the bbjump branch October 18, 2023 23:26

amanasifkhalid mentioned this pull request Oct 24, 2023

JIT: Allow initializing blocks without jump targets #93928

Merged

ghost locked as resolved and limited conversation to collaborators Nov 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JIT: Set bbJumpKind and bbJumpDest during block initialization #93415

JIT: Set bbJumpKind and bbJumpDest during block initialization #93415

amanasifkhalid commented Oct 12, 2023

ghost commented Oct 12, 2023

amanasifkhalid commented Oct 12, 2023

amanasifkhalid commented Oct 16, 2023 •

edited

Loading

amanasifkhalid commented Oct 16, 2023

amanasifkhalid commented Oct 17, 2023 •

edited

Loading

AndyAyersMS left a comment

AndyAyersMS Oct 17, 2023

amanasifkhalid Oct 17, 2023

EgorBo commented Oct 17, 2023

AndyAyersMS commented Oct 17, 2023

amanasifkhalid commented Oct 17, 2023

AndyAyersMS commented Oct 17, 2023 •

edited

Loading

AndyAyersMS left a comment

amanasifkhalid commented Oct 18, 2023 •

edited

Loading

amanasifkhalid commented Oct 18, 2023 •

edited

Loading

AndyAyersMS commented Oct 18, 2023

amanasifkhalid commented Oct 18, 2023

jakobbotsch commented Oct 23, 2023

amanasifkhalid commented Oct 23, 2023 •

edited

Loading

AndyAyersMS commented Oct 23, 2023

jakobbotsch commented Oct 23, 2023 •

edited

Loading

amanasifkhalid commented Oct 23, 2023 •

edited

Loading

JIT: Set bbJumpKind and bbJumpDest during block initialization #93415

JIT: Set bbJumpKind and bbJumpDest during block initialization #93415

Conversation

amanasifkhalid commented Oct 12, 2023

ghost commented Oct 12, 2023

amanasifkhalid commented Oct 12, 2023

amanasifkhalid commented Oct 16, 2023 • edited Loading

amanasifkhalid commented Oct 16, 2023

amanasifkhalid commented Oct 17, 2023 • edited Loading

AndyAyersMS left a comment

Choose a reason for hiding this comment

AndyAyersMS Oct 17, 2023

Choose a reason for hiding this comment

amanasifkhalid Oct 17, 2023

Choose a reason for hiding this comment

EgorBo commented Oct 17, 2023

AndyAyersMS commented Oct 17, 2023

amanasifkhalid commented Oct 17, 2023

AndyAyersMS commented Oct 17, 2023 • edited Loading

AndyAyersMS left a comment

Choose a reason for hiding this comment

amanasifkhalid commented Oct 18, 2023 • edited Loading

amanasifkhalid commented Oct 18, 2023 • edited Loading

AndyAyersMS commented Oct 18, 2023

amanasifkhalid commented Oct 18, 2023

jakobbotsch commented Oct 23, 2023

amanasifkhalid commented Oct 23, 2023 • edited Loading

AndyAyersMS commented Oct 23, 2023

jakobbotsch commented Oct 23, 2023 • edited Loading

amanasifkhalid commented Oct 23, 2023 • edited Loading

amanasifkhalid commented Oct 16, 2023 •

edited

Loading

amanasifkhalid commented Oct 17, 2023 •

edited

Loading

AndyAyersMS commented Oct 17, 2023 •

edited

Loading

amanasifkhalid commented Oct 18, 2023 •

edited

Loading

amanasifkhalid commented Oct 18, 2023 •

edited

Loading

amanasifkhalid commented Oct 23, 2023 •

edited

Loading

jakobbotsch commented Oct 23, 2023 •

edited

Loading

amanasifkhalid commented Oct 23, 2023 •

edited

Loading