Guarded devirtualization: multiple type checks #86551

Merged: EgorBo merged 18 commits into dotnet:main from multiple-gdv-naot on May 26, 2023

EgorBo
Member

@EgorBo EgorBo commented May 21, 2023

Contributes to #86769 and #86235
This PR builds the basic infrastructure to enable multiple guarded devirtualization (GDV) checks for all runtimes, but for now it is only enabled for NativeAOT.

Demo:

using System.Runtime.CompilerServices;

public interface IValue
{
    int GetValue();
}

public class MyClass1 : IValue
{
    public int GetValue() => 10;
}

public class MyClass2 : IValue
{
    public int GetValue() => 50;
}

public class MyClass3 : IValue
{
    public int GetValue() => 100;
}

public class Program
{

    public static void Main(string[] args)
    {
        Test(new MyClass1());
        Test(new MyClass2());
        Test(new MyClass3());
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static int Test(IValue value) => value.GetValue();
}

NativeAOT codegen for Test (without Static PGO):

; Assembly listing for method Program:Test(IValue):int
; Method Program:Test(IValue):int
G_M7592_IG01:  
       sub      rsp, 40
G_M7592_IG02:  
       mov      rax, qword ptr [rcx]
       lea      r10, [(reloc 0x4000000000421b48)] ;; MyClass1
       cmp      rax, r10
       jne      SHORT G_M7592_IG04
G_M7592_IG03:  
       mov      eax, 10
       jmp      SHORT G_M7592_IG08
G_M7592_IG04:  
       lea      r10, [(reloc 0x4000000000421b58)] ;; MyClass2
       cmp      rax, r10
       jne      SHORT G_M7592_IG06
G_M7592_IG05:  
       mov      eax, 50
       jmp      SHORT G_M7592_IG08
G_M7592_IG06: 
       lea      r10, [(reloc 0x4000000000421b68)] ;; MyClass3
       cmp      rax, r10
       jne      SHORT G_M7592_IG09
G_M7592_IG07:  
       mov      eax, 100
G_M7592_IG08:  
       add      rsp, 40
       ret     
G_M7592_IG09:  
       lea      r10, [(reloc 0x4000000000421af8)]
       call     [r10]IValue:GetValue():int:this
       jmp      SHORT G_M7592_IG08
; Total bytes of code: 79
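
Read as source, the listing above loads the object's type handle once, compares it against each known implementer in turn, uses the inlined constant on a match, and keeps the interface call as the fallback. Below is a toy C++ model of that shape. `MethodTable`, `Object`, `testGuarded`, and the `kMyClassN` tables are hypothetical stand-ins invented for this sketch, not runtime or JIT code:

```cpp
#include <cassert>

struct Object;

// Hypothetical stand-in for a runtime type handle: one table per class,
// holding the slot used by the slow (indirect) call path.
struct MethodTable {
    int (*getValue)(const Object*);
};

// Every object starts with a pointer to its type's table, which is what
// `mov rax, qword ptr [rcx]` loads in the listing.
struct Object {
    const MethodTable* type;
};

static int getValue1(const Object*) { return 10; }
static int getValue2(const Object*) { return 50; }
static int getValue3(const Object*) { return 100; }

static const MethodTable kMyClass1{getValue1};
static const MethodTable kMyClass2{getValue2};
static const MethodTable kMyClass3{getValue3};

// Guarded devirtualization with three guesses: each guard is a pointer
// compare, each hit uses the inlined body, and a miss on every guard
// falls back to the indirect call.
int testGuarded(const Object* value) {
    assert(value != nullptr && value->type != nullptr);
    const MethodTable* type = value->type;
    if (type == &kMyClass1) return 10;   // inlined MyClass1.GetValue
    if (type == &kMyClass2) return 50;   // inlined MyClass2.GetValue
    if (type == &kMyClass3) return 100;  // inlined MyClass3.GetValue
    return type->getValue(value);        // fallback: indirect call
}
```

The fallback stays in this version because the guards are treated as guesses; a type that matches none of them still gets the correct result through the indirect call.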

@EgorBo EgorBo marked this pull request as draft May 21, 2023 13:27
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 21, 2023
@ghost ghost assigned EgorBo May 21, 2023
@ghost

ghost commented May 21, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR builds the basic infrastructure to enable multiple GDV checks (for all runtimes), but for now it's only enabled for NativeAOT, where for dual-morphic interface calls we can devirtualize the fallback too, e.g.:

public interface IValue
{
    int GetValue();
}

public class MyClass1 : IValue
{
    public int GetValue() => 10;
}

public class MyClass2 : IValue
{
    public int GetValue() => 100;
}


static int Test(IValue val) => val.GetValue();

Old NativeAOT codegen for Test():

; Assembly listing for method Program:Test(IValue):int
       sub      rsp, 40
       lea      r10, [(reloc 0x4000000000420060)]
       call     [r10]IValue:GetValue():int:this  ;; interface call, not devirtualized
       nop      
       add      rsp, 40
       ret      
; Total bytes of code 20

New codegen:

; Assembly listing for method Program:Test(IValue):int
       lea      rax, [(reloc 0x4000000000420fe0)]
       mov      edx, 100
       mov      r8d, 10
       cmp      qword ptr [rcx], rax
       mov      eax, r8d
       cmovne   eax, edx
       ret      
; Total bytes of code 28

(cmovne selects 10 or 100 based on the type check of the val object)
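
As a rough illustration of why no call remains in the new codegen: when the interface is known to have exactly two implementers, one guard decides between both inlined bodies. The C++ below is a toy model under that assumption; `MethodTable`, `Object`, `testDualMorphic`, and the two tables are hypothetical names for this sketch only:

```cpp
#include <cassert>

// Hypothetical stand-in for a runtime type handle; only identity matters here.
struct MethodTable {};

struct Object {
    const MethodTable* type;  // first field, like the method table pointer
};

static const MethodTable kMyClass1{};
static const MethodTable kMyClass2{};

// Assumption: whole-program analysis found exactly two implementers, so a
// single guard picks between both inlined bodies and no indirect call is
// left. A compiler is free to lower this select to cmp + cmov.
int testDualMorphic(const Object* val) {
    assert(val != nullptr);
    return val->type == &kMyClass1 ? 10 : 100;
}
```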

It should not be difficult to enable this for the JIT and for multiple type checks (though chaining will need some work).

Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@MichalStrehovsky
Member

Out of curiosity - what will be the interaction between this and PGO data? If PGO says the likely class is X and whole program analysis says the possibilities are X and Y, would we trust PGO or the overapproximation from whole program analysis?

It's possible this question doesn't make much difference for 2 cases, but if PGO says X and whole program analysis says X, Y, Z, U, V, W, maybe we can trust PGO.

@EgorBo
Member Author

EgorBo commented May 22, 2023

> Out of curiosity - what will be the interaction between this and PGO data? If PGO says the likely class is X and whole program analysis says the possibilities are X and Y, would we trust PGO or the overapproximation from whole program analysis?
>
> It's possible this question doesn't make much difference for 2 cases, but if PGO says X and whole program analysis says X, Y, Z, U, V, W, maybe we can trust PGO.

I think we can consult PGO here, yes. Presumably it's a rare case for AOT right now, though, because Static PGO is quite complicated to set up and the process is not well documented or tested.

@EgorBo
Member Author

EgorBo commented May 22, 2023

/azp list

@azure-pipelines

This comment was marked as resolved.

@EgorBo
Member Author

EgorBo commented May 22, 2023

/azp run runtime-extra-platforms

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Member

> I think we can consult PGO here, yes. Presumably it's a rare case for AOT right now, though, because Static PGO is quite complicated to set up and the process is not well documented or tested.

We have PGO data that we ship with the product - one of the things it's trained on is ASP.NET. So this could be relevant for ASP.NET scenarios and doesn't require the complicated setup there because it comes with the compiler.

@EgorBo
Member Author

EgorBo commented May 23, 2023

> We have PGO data that we ship with the product - one of the things it's trained on is ASP.NET. So this could be relevant for ASP.NET scenarios and doesn't require the complicated setup there because it comes with the compiler.

I plan to work on this in iterations when I have time, so it's definitely something I'll consider. For now I changed the logic to rely on PGO data if it exists and otherwise fall back to the exact classes list.

Also, I completely rewrote the PR, so it now has full-fledged multiple-candidate support, controlled via JitGuardedDevirtualizationMaxTypeChecks (up to 5 type checks; this can be extended, but presumably that makes little sense, since e.g. a 5th check ends up expensive because of the four failed checks before it).
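
The selection rule described in this comment (PGO first, exact classes otherwise, capped by the knob) could be sketched roughly as follows. This is not the JIT's code: `Candidate`, `pickCandidates`, and `kMaxTypeChecksLimit` are hypothetical names, and two behaviors are assumptions of the sketch rather than facts from the PR: an exact list longer than the cap yields no guesses, and exact classes without profile data split the likelihood evenly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one guess; not a JIT data structure.
struct Candidate {
    std::string className;
    unsigned likelihood;  // percent, 0..100
};

// Upper bound on JitGuardedDevirtualizationMaxTypeChecks mentioned above.
constexpr std::size_t kMaxTypeChecksLimit = 5;

// Prefer PGO-observed classes when present; otherwise use the exact class
// list from whole-program analysis.
std::vector<Candidate> pickCandidates(std::vector<Candidate> pgoClasses,
                                      const std::vector<std::string>& exactClasses,
                                      std::size_t maxTypeChecks) {
    assert(maxTypeChecks >= 1);
    const std::size_t cap = std::min(maxTypeChecks, kMaxTypeChecksLimit);

    if (!pgoClasses.empty()) {
        // Most likely class first, so the first guard is the hottest one.
        std::stable_sort(pgoClasses.begin(), pgoClasses.end(),
                         [](const Candidate& a, const Candidate& b) {
                             return a.likelihood > b.likelihood;
                         });
        if (pgoClasses.size() > cap) {
            pgoClasses.erase(pgoClasses.begin() + static_cast<std::ptrdiff_t>(cap),
                             pgoClasses.end());
        }
        return pgoClasses;
    }

    std::vector<Candidate> result;
    // Assumption: give up when the exact list does not fit under the cap.
    if (exactClasses.empty() || exactClasses.size() > cap) return result;

    // Assumption: no profile data, so every exact class is equally likely.
    const unsigned share = 100u / static_cast<unsigned>(exactClasses.size());
    for (const std::string& name : exactClasses) {
        result.push_back(Candidate{name, share});
    }
    return result;
}
```

Sorting by likelihood matters because every failed guard is paid for by all later candidates, which is also the reason given above for not raising the limit beyond 5.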

@EgorBo EgorBo changed the title Guarded devirtualization: devirtualize dual-morphic interface calls on NativeAOT Guarded devirtualization: multiple type checks May 23, 2023
@EgorBo
Member Author

EgorBo commented May 23, 2023

/azp run runtime-extra-platforms

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


elseBlock->setBBProfileWeight(newWeight);
elseBlock->inheritWeightPercentage(currBlock, elseLikelihood);
Member

@AndyAyersMS AndyAyersMS May 25, 2023

For the multi-guess exact case this likelihood should end up at zero, right? We think we know all the possibilities.

Member Author

Yes, but the logic is shared with the non-exact case, where the fallback may have a non-zero likelihood (Dynamic PGO).

I plan to enable "no fallback needed" separately. I am thinking of something like: "we're importing the last type check and the call has the GTF_M_EXACT_GDV flag, so convert the last check/then pair into the elseBlock".
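
The arithmetic behind this exchange can be illustrated with a small sketch: each check block receives whatever weight the earlier guards did not claim, and the fallback gets the remainder, which is zero once the guesses cover 100%. This is only an illustration, not the JIT's weight update; `ChainWeights` and `distributeWeights` are hypothetical names, and the even split used for the exact case in the usage note is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type for this sketch; weights are plain doubles here.
struct ChainWeights {
    std::vector<double> checkBlocks;  // weight reaching each type check
    std::vector<double> thenBlocks;   // weight taking each devirtualized call
    double fallback;                  // weight reaching the indirect call
};

// likelihoods[i] is the percent of all calls at this site where candidate i
// matches. The guards run in order, so each check sees only the remainder.
ChainWeights distributeWeights(double callSiteWeight,
                               const std::vector<unsigned>& likelihoods) {
    ChainWeights w{{}, {}, 0.0};
    unsigned remaining = 100;
    for (unsigned likelihood : likelihoods) {
        assert(likelihood <= remaining);
        w.checkBlocks.push_back(callSiteWeight * remaining / 100.0);
        w.thenBlocks.push_back(callSiteWeight * likelihood / 100.0);
        remaining -= likelihood;
    }
    w.fallback = callSiteWeight * remaining / 100.0;
    return w;
}
```

With a call-site weight of 1000, guesses of 60% and 30% (a Dynamic PGO style input) leave a fallback weight of 100, while two exact classes at 50% each leave a fallback weight of 0, matching the expectation in the review comment.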

@EgorBo
Member Author

EgorBo commented May 25, 2023

Thanks! I still plan to address some of your comments as part of this PR, so I will ping you again once I'm done 🙂 Meanwhile, I filed #86769 to record the ideas.

@EgorBo
Member Author

EgorBo commented May 26, 2023

/azp run runtime-extra-platforms, runtime-coreclr pgo, runtime-coreclr pgostress

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@EgorBo
Member Author

EgorBo commented May 26, 2023

@AndyAyersMS I've addressed your feedback (#86551 (comment) and #86551 (comment)), so this should be ready now. The outerloop jobs also look good.

Member

@AndyAyersMS AndyAyersMS left a comment

Still LGTM.

IIRC 3 is a common value for the number of checks in these sorts of things. Once you get past that (assuming you are testing candidates in decreasing order of likelihood), the cost of that many (possibly mispredicted) branches starts to overwhelm the benefit.
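
One way to make the ordering argument concrete is to count the expected number of guards executed per call. The sketch below does only that; it ignores branch misprediction cost, and `expectedChecks` is a hypothetical helper, not something from the JIT:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expected number of type checks executed per call when guards are tested
// in the given order. likelihoods[i] is the probability (0..1) that
// candidate i matches; the remainder falls through every guard.
double expectedChecks(const std::vector<double>& likelihoods) {
    double expected = 0.0;
    double remaining = 1.0;
    for (std::size_t i = 0; i < likelihoods.size(); ++i) {
        assert(likelihoods[i] >= 0.0 && likelihoods[i] <= 1.0);
        // A hit on candidate i has executed i + 1 checks.
        expected += static_cast<double>(i + 1) * likelihoods[i];
        remaining -= likelihoods[i];
    }
    // Calls that miss every guard paid for all of them before the fallback.
    expected += static_cast<double>(likelihoods.size()) * remaining;
    return expected;
}
```

For likelihoods 0.5, 0.25, 0.25 tested in that order the result is 1.75 checks per call; testing the same candidates in the reverse order gives 2.25, which is why the most likely class goes first and why each extra low-likelihood guard adds cost for little gain.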

@EgorBo
Member Author

EgorBo commented May 26, 2023

Outerloop failures are #84911, #76454

@EgorBo EgorBo merged commit 182b013 into dotnet:main May 26, 2023
@EgorBo EgorBo deleted the multiple-gdv-naot branch May 26, 2023 18:47
@jakobbotsch
Member

The MinOpts TP impact of this change seems quite significant, is it expected? Also lots of misses, so hard to know how accurate those diffs are, is it possible to recollect the diffs on the new collection now?

@kunalspathak
Member

> The MinOpts TP impact of this change seems quite significant, is it expected?

Agree. Just noticed the TP numbers.

@EgorBo
Member Author

EgorBo commented May 31, 2023

> The MinOpts TP impact of this change seems quite significant, is it expected? Also lots of misses, so hard to know how accurate those diffs are, is it possible to recollect the diffs on the new collection now?

> Also lots of misses

Yes, because I call a new JIT-EE API (getExactClasses) for each virtual call, which is a no-op outside NativeAOT. To remove the misses I ran locally with an `if (isNativeAot)` kind of check; I just didn't want to check it in.

The 0.2% MinOpts regression comes from unexpected calls to impMarkInlineCandidate in MinOpts; I'll check where they're coming from. Non-MinOpts diffs are around noise.

@EgorBo
Member Author

EgorBo commented May 31, 2023

Mitigated via #86949
