JIT: Set bbJumpKind and bbJumpDest during block initialization #93415
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
Followup to #93152. This refactor enforces new invariants on BasicBlock's bbJumpKind and bbJumpDest. In particular, whenever bbJumpKind is a kind that must have a jump target, bbJumpDest must be set, else bbJumpDest must be null. This means bbJumpKind and bbJumpDest must be simultaneously initialized/updated when creating/converting a jump block; previously, we initialized blocks with their jump kind specified, and later set their jump targets accordingly.
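To make the invariant concrete, here is a minimal sketch, not the JIT's actual code. `Block`, `JumpKind`, `KindHasJump`, `SetJumpKindAndTarget` and `CheckInvariant` are hypothetical, simplified stand-ins chosen for illustration; the real BasicBlock has many more jump kinds and fields.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the JIT's types.
enum class JumpKind { Return, Always, Cond };

// True for kinds that must carry a jump target (assumed subset).
inline bool KindHasJump(JumpKind kind)
{
    return kind == JumpKind::Always || kind == JumpKind::Cond;
}

class Block
{
    JumpKind m_kind = JumpKind::Return;
    Block*   m_dest = nullptr;

public:
    Block() = default;

    // Kind and target are supplied together at initialization.
    Block(JumpKind kind, Block* dest) { SetJumpKindAndTarget(kind, dest); }

    // Kind and target are always updated as a pair, so the invariant
    // "target is set if and only if the kind needs one" never breaks.
    void SetJumpKindAndTarget(JumpKind kind, Block* dest)
    {
        assert(KindHasJump(kind) == (dest != nullptr));
        m_kind = kind;
        m_dest = dest;
    }

    JumpKind GetJumpKind() const { return m_kind; }
    Block*   GetJumpDest() const { return m_dest; }

    // Side-effect-free form of the invariant, usable in checks.
    bool CheckInvariant() const { return KindHasJump(m_kind) == (m_dest != nullptr); }
};
```

Because there is no way to set the kind without also stating the target, converting a block (for example from a jump to a non-jump kind) has to clear the target in the same call.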
I'm waiting to rebase on top of #93377. Then I'll open for review.
No asmdiffs. TP diffs on Linux x64 and Windows x86. The regressions for the latter seem a bit dramatic to be attributed to the additional
superpmi-replay failure is #93527
cc @dotnet/jit-contrib , @AndyAyersMS PTAL.
Overall this looks good.
For throughput, it would be good to do the next level of investigation. I can't remember if we do these lab TP runs with PGO-enhanced release builds, but if we do, it's also possible that the PGO data we have becomes stale with your changes. You could try building non-PGO baseline and diff builds locally and running your own TP diffs.
There are some pin-based tools you can use to pin down where the extra time is going. Let me see if I can dig up a pointer to this for you (or maybe someone else on @dotnet/jit-contrib can do this quicker than I can).
Thought for possible follow-up: I wonder if there is some better way to set a "temporary" jump target that ensures that later on we actually have remembered to change it.
callBlock->SetJumpKind(BBJ_CALLFINALLY DEBUG_ARG(this)); // convert the BBJ_LEAVE to BBJ_CALLFINALLY

assert(callBlock->HasJump());
fgRemoveRefPred(callBlock->GetJumpDest(), callBlock);
Changes to impImportLeave always make me nervous.
Likewise -- this was the trickiest of my changes to get right. Most of my debug time was spent on one failing method in the CoreCLR tests superpmi collection that exercised an unusual path. The bug in that case was caused by initializing new blocks with jumps as BBJ_NONE while they didn't yet have a jump target, which triggered a different path in fgNewBBInRegion; changing this approach to use the actual jump kind and a temporary jump target fixed that.
We disable native PGO for these runs, so it should be OK. It's interesting that TP is actually improved for the MSVC ones.
I would look at the Windows x86 results in particular; those should be the easiest to investigate. Note that the "coreclr tests" collection includes a lot of unusual code, so it may just be one particular code shape that's causing this.
How about some public static BasicBlock member for this that we can pass to
One idea to explore is to just allocate some special block (call it However we'd need to be careful that this block doesn't complicate life elsewhere.
bbTempJumpDest seems like a nice option here. Thanks.
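For readers following the placeholder-target idea, a minimal sketch under stated assumptions: only the name bbTempJumpDest comes from the thread. The `Block` layout, `HasTempJumpDest`, `ResolveTempJumpDest` and `AllTempTargetsResolved` are hypothetical helpers invented here to show how a shared sentinel block could stand in for an unknown target and how a later check could confirm it was replaced.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified block: only the jump target matters here.
struct Block
{
    Block* jumpDest = nullptr;

    // Shared placeholder target. The name mirrors bbTempJumpDest from the
    // discussion; the layout is an assumption, not the JIT's code.
    static Block tempJumpDest;

    bool HasTempJumpDest() const { return jumpDest == &tempJumpDest; }

    // Swap the placeholder for the real target once it is known.
    void ResolveTempJumpDest(Block* dest)
    {
        assert(HasTempJumpDest());
        assert(dest != nullptr && dest != &tempJumpDest);
        jumpDest = dest;
    }
};

Block Block::tempJumpDest{};

// Post-phase style check: no block may still point at the placeholder.
inline bool AllTempTargetsResolved(const std::vector<Block*>& blocks)
{
    for (const Block* block : blocks)
    {
        if (block->HasTempJumpDest())
        {
            return false;
        }
    }
    return true;
}
```

The sentinel keeps the "jump kinds always have a non-null target" rule intact while still making a forgotten fix-up detectable, at the cost of a special block that other code must never treat as a real successor.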
Any insight into the TP impact yet?
Not much yet. I've identified the method contexts in the CoreCLR tests collection with the largest TP diffs when targeting x86, but I don't know of any way to find the actual methods/tests corresponding to those contexts. I see under @AndyAyersMS do you have any suggestions for what tooling might be useful here? Thank you! Edit: @jakobbotsch shared a pin tool with me that can profile JIT functions. Trying it now...
@AndyAyersMS So it looks like about 30% of the instruction diff on Windows x86 comes from calls to As for Also, the diff in total instructions executed is only +0.09% for me when replaying the CoreCLR tests collection locally. Not sure why it's so much bigger on Helix machines...
Thanks for digging in. I'm ok with taking this change as is, since the TP diff is 32 bit and only in
superpmi-replay failure is #93527.
I just came across It does raise another question: if we actively have to fight the checking, then what is the benefit of having it? It seems like it just creates usability problems. I do not recall seeing us having bugs previously around "half baked flow graphs", and I would much rather allow us to build flow graph structures naturally and then only check them once they are completed (e.g. at the end of the phase, as a normal post-phase check).
I enabled
I initially took this approach, but this introduced subtle bugs in a few places. For example, in I'm also cognizant of the fact that we plan to either significantly restrict our usage of
I certainly agree with you that these weird cases aren't elegant with my recent changes. I take solace in the fact that these cases are the exception, and not the norm -- if I recall correctly, we're only using @AndyAyersMS what do you think?
One of the hopes here is that we can eventually automate most of the ref count and pred list maintenance. So it seems preferable not to create something with implicitly incorrect flow and then fix it up; instead we should have some transient "incomplete" state with obviously incorrect flow that must be resolved.
Jakob, would you find this more palatable? It would not be debug-only, would not be allowed at end of phase, and most of the code that checks block kinds should already have default/error cases for unexpected kinds, so perhaps this would be more robust?
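A minimal sketch of such a transient state, assuming a dedicated jump kind is used for it. The names `JumpKind::Incomplete`, `Complete`, `SuccessorCount` and `PostPhaseCheck` are hypothetical and exist only for this illustration; the thread does not settle on a name or shape.

```cpp
#include <cassert>
#include <vector>

// Hypothetical kinds; Incomplete marks a block whose flow is not final.
enum class JumpKind { Return, Always, Incomplete };

struct Block
{
    JumpKind kind = JumpKind::Incomplete; // new blocks start out incomplete
    Block*   dest = nullptr;

    // Resolve the transient state exactly once, with a real kind.
    void Complete(JumpKind newKind, Block* newDest)
    {
        assert(kind == JumpKind::Incomplete);
        assert(newKind != JumpKind::Incomplete);
        kind = newKind;
        dest = newDest;
    }
};

// Code that switches on the kind reports unexpected kinds through its
// default case, so an unresolved block is rejected rather than misread.
inline int SuccessorCount(const Block& block)
{
    switch (block.kind)
    {
        case JumpKind::Return:
            return 0;
        case JumpKind::Always:
            return 1;
        default:
            return -1; // unexpected kind, including Incomplete
    }
}

// Not debug-only: the end-of-phase check fails if any block is unresolved.
inline bool PostPhaseCheck(const std::vector<Block>& blocks)
{
    for (const Block& block : blocks)
    {
        if (block.kind == JumpKind::Incomplete)
        {
            return false;
        }
    }
    return true;
}
```

Unlike a placeholder target, the incomplete state is visible in the kind itself, so any consumer that enumerates kinds notices it without needing to know about a special block.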
Maybe I was just unlucky, but when merging I ended up in one of these cases. I like the I do think a more ideal solution would be to separate the notion of "building" from the notion of "adding". Building new blocks should not have any checks, while the checking/state computations can be made when adding the result of the builder to the flow graph. Not sure how easy it is to design such a graph builder class that can support all of our patterns, though.
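The building/adding split could look roughly like the sketch below. This is an assumption-laden illustration, not a proposal from the thread: `BlockBuilder`, `FlowGraph`, `Add` and `refCount` are hypothetical names, and a real builder would need to cover far more patterns than a single jump target.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, simplified kinds and block.
enum class JumpKind { Return, Always };

struct Block
{
    JumpKind kind     = JumpKind::Return;
    Block*   dest     = nullptr;
    unsigned refCount = 0; // incoming jumps, maintained by FlowGraph::Add
};

// "Building": an unchecked description that may be filled in any order.
struct BlockBuilder
{
    JumpKind kind = JumpKind::Return;
    Block*   dest = nullptr;

    BlockBuilder& Kind(JumpKind k) { kind = k; return *this; }
    BlockBuilder& Dest(Block* d)   { dest = d; return *this; }
};

// "Adding": validation and derived-state updates happen only here.
class FlowGraph
{
    std::vector<std::unique_ptr<Block>> m_blocks;

public:
    // Returns the new block, or nullptr (adding nothing) if the finished
    // description is inconsistent.
    Block* Add(const BlockBuilder& builder)
    {
        const bool needsDest = (builder.kind == JumpKind::Always);
        if (needsDest != (builder.dest != nullptr))
        {
            return nullptr;
        }

        m_blocks.push_back(std::make_unique<Block>());
        Block* block = m_blocks.back().get();
        block->kind  = builder.kind;
        block->dest  = builder.dest;
        if (block->dest != nullptr)
        {
            block->dest->refCount++; // ref count maintained automatically
        }
        return block;
    }

    std::size_t Size() const { return m_blocks.size(); }
};
```

A short usage example: a builder with kind `Always` and no target can exist freely, and is rejected only when passed to `Add`; once `Dest` is supplied, `Add` accepts it and bumps the target's `refCount`.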
I think so. I imagine we'd only use
Maybe we could have Edit: On second thought, I don't think
...and remove BasicBlock::bbTempJumpDest, per discussion in #93415. We still assert the jump target is set appropriately whenever it is read/written, and in the majority of cases, we still initialize blocks with their jump kind and target set simultaneously. This change improves usability for the few edge cases (like in Compiler::impImportLeave) where a block's jump target isn't known at initialization.
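A minimal sketch of the relaxed scheme this commit message describes, assuming checks live in the accessors rather than in initialization. `Block`, `JumpKind`, `KindHasJump` and `HasInitializedTarget` are hypothetical, simplified names; only the behavior (assert on read/write, allow a target to be filled in later) is taken from the text.

```cpp
#include <cassert>

// Hypothetical, simplified kinds; not the JIT's real list.
enum class JumpKind { Return, Always };

class Block
{
    JumpKind m_kind;
    Block*   m_dest;

public:
    // The common case passes kind and target together; the target may be
    // omitted in the rare case where it is not known yet.
    explicit Block(JumpKind kind, Block* dest = nullptr) : m_kind(kind), m_dest(dest) {}

    bool KindHasJump() const { return m_kind == JumpKind::Always; }

    // True once the stored target is consistent with the kind.
    bool HasInitializedTarget() const { return KindHasJump() == (m_dest != nullptr); }

    // Reads assert consistency, so a forgotten target is caught on first use.
    Block* GetJumpDest() const
    {
        assert(HasInitializedTarget());
        return m_dest;
    }

    // Writes assert that this kind takes a target and that one is given.
    void SetJumpDest(Block* dest)
    {
        assert(KindHasJump());
        assert(dest != nullptr);
        m_dest = dest;
    }
};
```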