Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jit disasm is confusing for R2R helpers with different entry points #69635

Closed
am11 opened this issue May 21, 2022 · 8 comments · Fixed by #76898
Closed

jit disasm is confusing for R2R helpers with different entry points #69635

am11 opened this issue May 21, 2022 · 8 comments · Fixed by #76898
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@am11
Copy link
Member

am11 commented May 21, 2022

Repro:

static class Foo
{
    public static int Bar { get; } = 1;
    static Foo() { } // even the empty cctor will force it to emit the helper
                     // call twice in each static property's getter.
}

crossgen'd with:

crossgen2 test.dll @refs.rsp --out test.r2r.dll \
    --codegenopt NgenDisasm='*' --codegenopt JitDiffableDasm=1 --parallelism 1 --optimize

emits the following asm for get_Bar():

; Assembly listing for method Foo:get_Bar():int
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;# V00 OutArgs      [V00    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 8

G_M39382_IG01:
       push     rax
                                                ;; size=1 bbWeight=1    PerfScore 1.00
G_M39382_IG02:
       call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
       call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
       mov      eax, dword ptr [rax+52]
                                                ;; size=15 bbWeight=1    PerfScore 8.00
G_M39382_IG03:
       add      rsp, 8
       ret      
                                                ;; size=5 bbWeight=1    PerfScore 1.25

; Total bytes of code 21, prolog size 1, PerfScore 12.35, instruction count 6, allocated bytes for code 21 (MethodHash=a7166629) for method Foo:get_Bar():int
; ============================================================

CORINFO_HELP_READYTORUN_STATIC_BASE is emitted twice. This does not happen in the absence of .cctor.

using runtime / CG2 build from commit: 810a7f9.

category:implementation
theme:ready-to-run
skill-level:beginner
cost:small
impact:small

@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 21, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 21, 2022
@ghost
Copy link

ghost commented May 21, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Code:

static class Foo
{
    public static int Bar { get; } = 1;
    static Foo() { } // even the empty cctor will force it to emit the
                     // helper twice for each static property's getter.
}

crossgen'd with:

crossgen2 test.dll @refs.rsp --out test.r2r.dll \
    --codegenopt NgenDisasm='*' --codegenopt JitDiffableDasm=1 --parallelism 1 --optimize

generates the following code for get_Bar():

; Assembly listing for method Foo:get_Bar():int
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;# V00 OutArgs      [V00    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 8

G_M39382_IG01:
       push     rax
                                                ;; size=1 bbWeight=1    PerfScore 1.00
G_M39382_IG02:
       call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
       call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
       mov      eax, dword ptr [rax+52]
                                                ;; size=15 bbWeight=1    PerfScore 8.00
G_M39382_IG03:
       add      rsp, 8
       ret      
                                                ;; size=5 bbWeight=1    PerfScore 1.25

; Total bytes of code 21, prolog size 1, PerfScore 12.35, instruction count 6, allocated bytes for code 21 (MethodHash=a7166629) for method Foo:get_Bar():int
; ============================================================

CORINFO_HELP_READYTORUN_STATIC_BASE is emitted twice. This does not happen in the absence of .cctor.

using runtime / CG2 build from commit: 810a7f9.

Author: am11
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

@teo-tsirpanis teo-tsirpanis added area-ReadyToRun-coreclr and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels May 21, 2022
@jakobbotsch
Copy link
Member

jakobbotsch commented May 22, 2022

Under the hood crossgen2 creates two different entry points for this helper that, to the JIT, is named the same thing. Internally in R2R the first helper call is a CctorTrigger helper call, while the second is a GetNonGCStaticBase helper call. It is more visible if you use r2rdump (which is broken right now, opened #69655):

int Program+Foo.get_Bar()
Id: 2
StartAddress: 0x000017B0
Size: 24 bytes
UnwindRVA: 0x000016F4
Version:            1
Flags:              0x03 EHANDLER UHANDLER
SizeOfProlog:       0x0004
CountOfUnwindCodes: 1
FrameRegister:      None
FrameOffset:        0x0
PersonalityRVA:     0x17E8
UnwindCode[0]: CodeOffset 0x0004 FrameOffset 0x0000 NextOffset 0x0 Op 40

Debug Info
    Bounds:
    Native Offset: 0x0, Prolog, Source Types: StackEmpty
    Native Offset: 0xA, IL Offset: 0x0000, Source Types: StackEmpty
    Native Offset: 0x13, Epilog, Source Types: StackEmpty

17b0: 48 83 ec 28               sub     rsp, 40
                                UWOP_ALLOC_SMALL 40                                0

17b4: ff 15 e6 18 00 00         call    qword ptr [0x30a0]    // Program+Foo (CCTOR_TRIGGER)
17ba: ff 15 e8 18 00 00         call    qword ptr [0x30a8]    // Program+Foo (STATIC_BASE_NON_GC)
17c0: 8b 40 34                  mov     eax, dword ptr [rax + 52]
17c3: 48 83 c4 28               add     rsp, 40
17c7: c3                        ret

Since these have different entry points the JIT does not e.g. CSE them.

@am11
Copy link
Member Author

am11 commented May 23, 2022

@jakobbotsch, thank you for the explanation and fixing r2rdump.

  • Do we want to continue to call CCTOR_TRIGGER on each static property's get method in this case?
  • Do we want to disambiguate CORINFO_HELP_READYTORUN_STATIC_BASE_NON_GC and CORINFO_HELP_READYTORUN_CCTOR_TRIGGER in disassmembly? (the command I used was extracted from godbolt, so it might be interesting for user if this distinction is visible in the output: https://godbolt.org/z/dcxbsTcn5)

@trylek
Copy link
Member

trylek commented Jun 15, 2022

I think there are several aspects to consider:

  • In general, AFAIK CCTOR_TRIGGER needs to be called by all static methods to guarantee the static constructor has run for the type (as opposed to instance methods where the execution of static constructor is guaranteed by the execution of the instance constructor).

  • We may be able to elide calls to the static constructor if we can prove that the type has empty static constructor. I guess that is what @am11 is alluding to. At the first glance that should be easy but sometimes these optimizations have subtle catches that are hard to see so it would require some experimentation to be sure if it's reasonably easy to do.

  • R2RDump uses its own logic for decoding import cells that lets it disambiguate between the above two calls; according to my understanding the above disassembly quoted by Adeel comes from a different implementation in JIT. Disambiguating the two calls in the JIT dump sounds like a good idea to me but I guess it's up to @dotnet/jit-contrib to make the call.

@trylek trylek removed the untriaged New issue has not been triaged by the area owner label Jun 15, 2022
@trylek trylek added this to the 7.0.0 milestone Jun 15, 2022
@jakobbotsch
Copy link
Member

jakobbotsch commented Jul 15, 2022

  • Do we want to disambiguate CORINFO_HELP_READYTORUN_STATIC_BASE_NON_GC and CORINFO_HELP_READYTORUN_CCTOR_TRIGGER in disassmembly? (the command I used was extracted from godbolt, so it might be interesting for user if this distinction is visible in the output: https://godbolt.org/z/dcxbsTcn5)

The only thing that disambiguates these two calls to the JIT is the fact that crossgen2 returned different entry points for them. This is an R2R-only thing, for runtime JIT the entry point is uniquely identified from the helper name.
Since I don't think we want to introduce new JIT/R2R helpers just for cosmetic reasons the only disambiguating thing I think we could do is have the JIT print the entry point addresses if we notice the same helper being used with different entry points.

@jakobbotsch jakobbotsch changed the title CORINFO_HELP_READYTORUN_STATIC_BASE is emitted twice in some cases jit disasm is confusing for R2R helpers with different entry points Jul 15, 2022
@jakobbotsch jakobbotsch added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-ReadyToRun-coreclr labels Jul 15, 2022
@jakobbotsch jakobbotsch modified the milestones: 7.0.0, 8.0.0 Jul 15, 2022
@ghost
Copy link

ghost commented Jul 15, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Repro:

static class Foo
{
    public static int Bar { get; } = 1;
    static Foo() { } // even the empty cctor will force it to emit the helper
                     // call twice in each static property's getter.
}

crossgen'd with:

crossgen2 test.dll @refs.rsp --out test.r2r.dll \
    --codegenopt NgenDisasm='*' --codegenopt JitDiffableDasm=1 --parallelism 1 --optimize

emits the following asm for get_Bar():

; Assembly listing for method Foo:get_Bar():int
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;# V00 OutArgs      [V00    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 8

G_M39382_IG01:
       push     rax
                                                ;; size=1 bbWeight=1    PerfScore 1.00
G_M39382_IG02:
       call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
       call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
       mov      eax, dword ptr [rax+52]
                                                ;; size=15 bbWeight=1    PerfScore 8.00
G_M39382_IG03:
       add      rsp, 8
       ret      
                                                ;; size=5 bbWeight=1    PerfScore 1.25

; Total bytes of code 21, prolog size 1, PerfScore 12.35, instruction count 6, allocated bytes for code 21 (MethodHash=a7166629) for method Foo:get_Bar():int
; ============================================================

CORINFO_HELP_READYTORUN_STATIC_BASE is emitted twice. This does not happen in the absence of .cctor.

using runtime / CG2 build from commit: 810a7f9.

Author: am11
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: 7.0.0

@am11
Copy link
Member Author

am11 commented Jul 15, 2022

Sounds reasonable. I have just noticed that with NativeAOT, it only calls the helper once:

# dotnet7 publish -o dist --use-current-runtime -c Release -p:PublishAot=true
$ gdb dist/nativelib1.so -batch -ex 'set disassembly-flavor intel' -ex 'disassemble nativelib1_Foo__get_Bar'
Dump of assembler code for function nativelib1_Foo__get_Bar:
   0x0000000000141f80 <+0>:	push   rax
   0x0000000000141f81 <+1>:	call   0xcb03c <__GetNonGCStaticBase_nativelib1_Foo>
   0x0000000000141f86 <+6>:	mov    eax,DWORD PTR [rax]
   0x0000000000141f88 <+8>:	add    rsp,0x8
   0x0000000000141f8c <+12>:	ret    
End of assembler dump.

@jakobbotsch
Copy link
Member

NAOT might have evaluated the static constructor at compile time since it is trivial.

@BrianBohe BrianBohe linked a pull request Dec 10, 2022 that will close this issue
@ghost ghost locked as resolved and limited conversation to collaborators Jan 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants