A valuable technique when developing the JIT is generating assembly code output with a baseline compiler as well as with a compiler that has changes -- the "diff compiler" -- and examining the generated code differences between the two. The tools described here automate that process.
This guide assumes that you have built CoreCLR. See the CoreCLR GitHub repo for directions on building.
- jit-diff - Driver tool that implements a common developer workflow. Allows specifying a configuration file to store common defaults, and implements a directory scheme for "installing" tools for use later.
- jit-dasm - Produce `*.dasm` from a compiler for an assembly or set of assemblies via prejitting. Used by jit-diff to do its work.
- jit-dasm-pmi - Like `jit-dasm` but allows you to look at jitted code instead of prejitted code.
- jit-analyze - Compare and analyze `*.dasm` files from baseline/diff. Produces a report on diffs, total size regression/improvement, and size regression/improvement by file and method.
- jit-tp-analyze - Compare trace files with per-function instruction counts from baseline/diff. See jit-tp-analyze.md.
- dotnet - The 2.1 dotnet CLI is used to build the tools. It is also used to determine the current processor architecture "RID". Install it from here.
- git - The jit-analyze tool uses `git diff` to check for textual differences since this is consistent across platforms, and fast. It is also used to determine if the current directory is a dotnet/coreclr repo root, to provide for default arguments.
A `bootstrap.{cmd,sh}` script is provided in the jitutils root directory which will validate all tool dependencies,
build the repo, publish the resulting binaries to a common "bin" directory, and place them on the path.
This can be run to set up the developer in one shot.
To build jitutils without using the bootstrap script:
- Run `dotnet restore` in the root (once).
- Run `build.{cmd,sh}`.
By default the script just builds the tools and does not publish them in a separate directory. To publish the utilities add the '-p' flag which publishes each utility to the ./bin directory in the root of the repo. Additionally, to download the default set of framework assemblies that can be used for generating asm diffs, add '-f'.
$ ./build.sh -h
build.sh [-b <BUILD TYPE>] [-f] [-h] [-p] [-t <TARGET>]
-b <BUILD TYPE> : Build type, can be Debug or Release.
-h : Show this message.
-p : Publish utilities.
By default, assembly code output (aka, "dasm") is generated by running crossgen with a specified JIT to compile a specified set of assemblies, by setting the following JIT environment variables to generate the output:
- `DOTNET_JitDisasm`
- `DOTNET_JitUnwindDump`
- `DOTNET_JitEHDump`
- `DOTNET_JitDisasmDiffable`
- optionally, `DOTNET_JitGCDump`
Generating "diffs" involves generating assembly code output for both a baseline and a "diff" JIT, and comparing the results.
Passing the `--pmi` option to jit-diff will instead use reflection to jit each method in the assembly, setting these options:
- `DOTNET_JitDisasm`
- `DOTNET_JitUnwindDump`
- `DOTNET_JitEHDump`
- `DOTNET_JitDisasmDiffable`
- `DOTNET_JitDisasmAssemblies`
- optionally, `DOTNET_JitGCDump`
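As a rough illustration of the two variable sets above, the following Python sketch builds the environment a driver could hand to a crossgen or PMI process. The function name `dasm_environment` is hypothetical, and the values assigned (`*` to select every method, `1` to turn the diffable switch on) are assumptions made for this sketch; the text above only names the variables.

```python
import os


def dasm_environment(pmi=False, gcinfo=False, assembly_name=None, base_env=None):
    """Return a copy of the environment with the JIT disassembly variables set.

    The variable names come from the lists above. The values ("*" and "1")
    are assumptions made for this sketch, not taken from the tools.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DOTNET_JitDisasm"] = "*"          # assumed: "*" selects every method
    env["DOTNET_JitUnwindDump"] = "*"
    env["DOTNET_JitEHDump"] = "*"
    env["DOTNET_JitDisasmDiffable"] = "1"  # assumed: "1" enables the switch
    if pmi:
        # The PMI list adds one variable naming the assembly to disassemble.
        if assembly_name is None:
            raise ValueError("a PMI run needs the assembly name to disassemble")
        env["DOTNET_JitDisasmAssemblies"] = assembly_name
    if gcinfo:
        env["DOTNET_JitGCDump"] = "*"
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the compiler by hand.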
jit-diff has built-in knowledge of how to generate asm diffs for the following sets of assemblies:
- System.Private.CoreLib.dll. Use `-c` or `--corelib`.
- A set of about 130 .NET Core frameworks assemblies (including System.Private.CoreLib.dll). Use `-f` or `--frameworks`.
- The entire dotnet/coreclr test tree. Use `--tests`.
- Of the test tree, only the benchmarks. Use `--benchmarks`.
- An arbitrary assembly via `--assembly`.
`--corelib` is the default. Only one of `--corelib` or `--frameworks` can be specified. Only one of `--tests` or `--benchmarks` can be specified.
If `--tests` or `--benchmarks` is specified, you may also specify `--test_root` so the tool knows where to find the test tree you wish to use, or use the computed default for `--test_root`.
To generate diffs for everything jit-diff knows how to diff, use both `--frameworks` and `--tests`.
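The selection rules above can be sketched as a small validation function. This is not jit-diff's own code. The name `select_assembly_sets` is hypothetical, and the choice to apply the `--corelib` default only when nothing else is selected is an assumption of this sketch.

```python
def select_assembly_sets(corelib=False, frameworks=False, tests=False,
                         benchmarks=False, assembly=None):
    """Apply the mutual-exclusion rules and return the selected set names."""
    if corelib and frameworks:
        raise ValueError("only one of --corelib or --frameworks can be specified")
    if tests and benchmarks:
        raise ValueError("only one of --tests or --benchmarks can be specified")

    selected = []
    if frameworks:
        selected.append("frameworks")
    elif corelib:
        selected.append("corelib")
    if tests:
        selected.append("tests")
    elif benchmarks:
        selected.append("benchmarks")
    if assembly is not None:
        selected.append("assembly:" + assembly)

    # Assumption: the --corelib default only applies when nothing was chosen.
    if not selected:
        selected.append("corelib")
    return selected
```

For example, `select_assembly_sets(frameworks=True, tests=True)` corresponds to the "everything" case described above.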
`jit-dasm` and `jit-dasm-pmi` provide complementary views of the impact of a jit change on generated code. For a robust assessment of the impact of a jit change you may need to run diffs both ways.
The jit will produce different code when jitting than when prejitting and will produce code for somewhat different sets of methods.
- Prejitted code interacts with the runtime differently, and has restrictions on cross assembly inlining.
- Generic methods and methods defined in generic types may not get prejitted.
- Methods using SIMD (`Vector<T>`) and methods using hardware intrinsics will not be prejitted.
- Methods using generic types or methods from other assemblies may not get prejitted.
- Methods that are prejitted generally won't be jitted.
`jit-dasm-pmi` uses heuristics to try to cover generic types and methods; these give a basic but simplistic view of possible instantiations.
First, you must build the dotnet/coreclr repo to produce a crossgen and a JIT (e.g., clrjit.dll). You also need to have a baseline crossgen / JIT available. One way to do this is to have a separate clone of the dotnet/coreclr repo that is identical to your working / "diff" clone, except that it has no changes made to it. For example, you might have these two directories:
- `c:\coreclr` - main development directory; clone of dotnet/coreclr. This is where you work.
- `c:\coreclr_base` - a "baseline" clone of dotnet/coreclr that matches the source of your main development directory except for your experimental changes to the JIT.
Build both of these directories with the same architecture and build flavor (e.g., x64 checked), producing the compilers to compare.
Also, if you want to generate diffs using the assemblies in the test tree, build the tests in the "diff" tree (e.g., in the example above, `c:\coreclr`).
Ensure the jitutils tools are built, and jit-diff, jit-analyze, and jit-dasm are on the path.
jit-diff has four top-level commands, as shown by the help message:
$ jit-diff --help
usage: jit-diff <command> [<args>]
diff Run asm diff.
list List defaults and available tools in config.json.
install Install tool in config.json.
uninstall Uninstall tool from config.json.
The "jit-diff diff" command has this help message:
usage: jit-diff diff [-b [arg]] [-d [arg]] [--crossgen <arg>] [-o <arg>] [--noanalyze] [-s]
[-t <arg>] [-c] [-f] [--benchmarks] [--tests] [--gcinfo] [-v] [--core_root <arg>]
[--test_root <arg>] [--base_root <arg>] [--diff_root <arg>] [--arch <arg>]
[--build <arg>] [--altjit <arg>]
-b, --base [arg] The base compiler directory or tag. Will use crossgen or clrjit from this
directory.
-d, --diff [arg] The diff compiler directory or tag. Will use crossgen or clrjit from this
directory.
--crossgen <arg> The crossgen or crossgen2 compiler exe. When this is specified,
will use clrjit from the --base and --diff directories with this crossgen.
-o, --output <arg> The output path.
--noanalyze Do not analyze resulting base, diff dasm directories. (By default, the
directories are analyzed for diffs.)
-s, --sequential Run sequentially; don't do parallel compiles.
-t, --tag <arg> Name of root in output directory. Allows for many sets of output.
-c, --corelib Diff System.Private.CoreLib.dll.
-f, --frameworks Diff frameworks.
--benchmarks Diff core benchmarks.
--tests Diff all tests.
--gcinfo Add GC info to the disasm output.
-v, --verbose Enable verbose output.
--core_root <arg> Path to test CORE_ROOT.
--test_root <arg> Path to test tree. Use with --benchmarks or --tests.
--base_root <arg> Path to root of base dotnet/coreclr repo.
--diff_root <arg> Path to root of diff dotnet/coreclr repo.
--arch <arg> Architecture to diff (x86, x64).
--build <arg> Build flavor to diff (Checked, Debug).
--altjit <arg> If set, the name of the altjit to use (e.g., clrjit_win_arm64_x64.dll).
--pmi Generate diffs via jitting instead of running crossgen
--assembly <arg> Look at diffs for methods in the specified assembly
Examples:
jit-diff diff --output c:\diffs --corelib --core_root c:\coreclr\bin\tests\windows.x64.Release\Tests\Core_Root --base c:\coreclr_base\bin\Product\windows.x64.Checked --diff c:\coreclr\bin\Product\windows.x64.Checked
Generate diffs of System.Private.CoreLib.dll by specifying baseline and
diff compiler directories explicitly.
jit-diff diff --output c:\diffs --base c:\coreclr_base\bin\Product\windows.x64.Checked --diff
If run within the c:\coreclr git clone of dotnet/coreclr, does the same
as the previous example, using defaults.
jit-diff diff --output c:\diffs --base --base_root c:\coreclr_base --diff
Does the same as the previous example, using --base_root to find the base
directory (if run from c:\coreclr tree).
jit-diff diff --base --diff
Does the same as the previous example (if run from c:\coreclr tree), but uses
default c:\coreclr\bin\diffs output directory, and `base_root` must be specified
in the config.json file in the directory pointed to by the JIT_UTILS_ROOT
environment variable.
jit-diff diff --diff
Only generates asm using the diff JIT -- does not generate asm from a baseline compiler --
using all computed defaults.
jit-diff diff --diff --arch x86
Generate diffs, but for x86, even if there is an x64 compiler available.
jit-diff diff --diff --build Debug
Generate diffs, but using a Debug build, even if there is a Checked build available.
The "jit-diff list" command has this help message:
$ jit-diff list --help
usage: jit-diff list [-v]
-v, --verbose Enable verbose output
The "jit-diff install" command has this help message:
$ jit-diff install --help
usage: jit-diff install [-j <arg>] [-n <arg>] [-b <arg>] [-v]
-j, --job <arg> Name of the job.
-n, --number <arg> Job number.
-b, --branch <arg> Name of branch.
-v, --verbose Enable verbose output
The "jit-diff uninstall" command has this help message:
$ jit-diff uninstall --help
usage: jit-diff uninstall [-t <arg>]
-t, --tag <arg> Name of tool tag in config file.
The tool needs to know:
- Which base and diff JIT and crossgen or corerun to use.
- Which assemblies to generate dasm for.
- Where to put the generated dasm.
- Whether you want diffs for prejitted code (default) or jitted code (via `--pmi`).
These can all be specified explicitly. For example:
c:\coreclr> jit-diff diff --output c:\diffs --corelib --core_root c:\coreclr\bin\tests\windows.x64.release\Tests\Core_Root --base e:\coreclr2\bin\Product\windows.x64.checked --diff c:\coreclr\bin\Product\windows.x64.checked --crossgen c:\coreclr\bin\Product\windows.x64.release
Explanation:
- `--output c:\diffs` -- specify the root directory where diffs will be placed.
- `--corelib` -- generate diffs using System.Private.CoreLib.dll.
- `--core_root` -- specify the `CORE_ROOT` directory (the "test layout"). Used to specify to crossgen where the platform assemblies are. Also, used as the directory where framework assemblies such as System.Runtime.dll can be found for the purpose of using them to generate dasm.
- `--base` -- specify the directory in which a baseline JIT can be found.
- `--diff` -- specify the directory in which a diff (experimental) JIT can be found.
- `--crossgen` -- specify the crossgen.exe or crossgen2.exe to use. Note that this must match the build flavor of `--core_root`.
You create the `CORE_ROOT` directory "layout" by running the runtest script.
On Windows, this can be created by running the following in the dotnet/coreclr repo root.
c:\coreclr> tests\runtest.cmd
or
c:\coreclr> tests\runtest.cmd GenerateLayoutOnly
On non-Windows, consult the test instructions here.
Note that you can pass `--testDir=NONE` to runtest.sh to get the same effect as passing `GenerateLayoutOnly` to runtest.cmd on Windows.
The above jit-diff command will generate both baseline and diff asm code into the specified output directory.
It will automatically create a uniquely named subdirectory (if the `--tag` switch isn't specified to override the default name), and within that subdirectory will be "base" and "diff" directories, containing the generated asm. The default subdirectory name looks like `dasmset_12`, with a unique number for every run of jit-diff.
jit-analyze will be run to compare the generated output if both baseline and diff asm are generated.
You can also run a recursive textual comparison tool like windiff on Windows to visually compare the diffs, e.g.:
c:\coreclr> windiff c:\diffs\dasmset_12\base c:\diffs\dasmset_12\diff
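Because each untagged run lands in a new `dasmset_N` directory, it can be handy to locate the most recent one before pointing a comparison tool at it. The sketch below is one way to do that in Python. The helper name `latest_dasmset` is hypothetical, and it assumes that the highest number is the most recent run.

```python
import re
from pathlib import Path

_DASMSET = re.compile(r"^dasmset_(\d+)$")


def latest_dasmset(output_root):
    """Return (base_dir, diff_dir) for the highest-numbered dasmset_N directory.

    Assumes a higher number means a later run; raises if none is found.
    """
    candidates = []
    for child in Path(output_root).iterdir():
        match = _DASMSET.match(child.name)
        if child.is_dir() and match:
            candidates.append((int(match.group(1)), child))
    if not candidates:
        raise FileNotFoundError(f"no dasmset_N directory under {output_root}")
    # Compare numerically so that dasmset_12 sorts after dasmset_9.
    _, newest = max(candidates, key=lambda item: item[0])
    return newest / "base", newest / "diff"
```

The two returned paths are the directories you would pass to windiff or to jit-analyze.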
As seen above, specifying all required information can be quite verbose. jit-diff can automatically determine most of the arguments using computed defaults. The above diff can be accomplished using simply:
c:\coreclr> jit-diff diff --diff --base --base_root e:\coreclr2
You minimally specify:
- `--diff` with no argument to request the diff compiler be used to generate asm (to a "diff" directory).
- `--base` with no argument to request the baseline compiler be used to generate asm (to a "base" directory).
- `--base_root` to specify the dotnet/coreclr repo root that contains the baseline build.
The defaults are:
- If jit-diff is invoked with the current directory within the dotnet/coreclr repo, then the root of this repo serves to find the diff compiler and `CORE_ROOT` directory. (Note that we have no reasonable default for determining what the baseline toolset or repo is. This is specified with the `--base_root` argument or by providing a full path with the `--base` argument.)
- The default output directory is `<repo_root>\bin\diffs` (in this case, c:\coreclr\bin\diffs).
- The default architecture is x64. If this isn't found, x86 is tried.
- The default diff and baseline JIT build flavor is checked. If this isn't found, debug is tried. (Both baseline and diff must be the same flavor.)
- By default, diffs are done using System.Private.CoreLib.dll. (That is, `--corelib` is the default.)
- By default, a release build is used for `--core_root` and `--test_root`. If not available, it falls back to checked or debug (but gives a warning that release is preferred).
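The architecture and build flavor fallbacks above amount to probing a short list of directories in order. The following Python sketch illustrates the idea. The `bin/Product/<os>.<arch>.<flavor>` layout is inferred from the example paths in this document, while the helper name `find_product_dir` and the choice to try every flavor for one architecture before moving to the next are assumptions of the sketch rather than documented jit-diff behavior.

```python
from pathlib import Path


def find_product_dir(repo_root, os_name="windows",
                     arches=("x64", "x86"), flavors=("Checked", "Debug")):
    """Return (path, arch, flavor) for the first build directory that exists.

    Directories are probed in preference order; None is returned if no
    candidate exists under <repo_root>/bin/Product.
    """
    product = Path(repo_root) / "bin" / "Product"
    for arch in arches:            # x64 first, then x86
        for flavor in flavors:     # checked first, then debug
            candidate = product / f"{os_name}.{arch}.{flavor}"
            if candidate.is_dir():
                return candidate, arch, flavor
    return None
```

Passing a single-element tuple, such as `arches=("x86",)`, mirrors what `--arch x86` does to the default search.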
To instead do diffs over the framework assemblies (not just System.Private.CoreLib.dll), using an x86 debug build, run:
c:\coreclr> jit-diff diff --base --diff --frameworks --arch x86 --build debug --base_root e:\coreclr2
To simplify this more, create a dotnet/coreclr repo clone that you will always use for baselines.
Create a configuration file that specifies your baseline root directory, specifying a `base_root` default. See the document configuring defaults for details.
With a `base_root` default in the config.json file, you can simply run:
c:\coreclr> jit-diff diff --base --diff
to generate diffs using all the defaults.
If you only want to generate asm from the diff compiler, omit the `--base` argument, e.g.:
c:\coreclr> jit-diff diff --diff
Similarly, only pass `--base` to generate baselines (although in this case you also must specify `--base_root` so the tool can find the baseline).
The following command-line arguments are used to adjust the defaults, such as specifying x86 diffs instead of x64 diffs. They are not otherwise required.
- `--base_root`
- `--diff_root`
- `--arch`
- `--build`
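When scripting these invocations, it helps to assemble the argument list in one place. The Python sketch below builds a `jit-diff diff` command line using only the switches described in this section. The function name `jit_diff_args` is hypothetical and the sketch does not run the tool.

```python
def jit_diff_args(base=True, diff=True, base_root=None, diff_root=None,
                  arch=None, build=None):
    """Build an argument list for 'jit-diff diff' that relies on the defaults.

    --base and --diff are emitted with no argument, and an override is added
    only when a value is supplied.
    """
    args = ["jit-diff", "diff"]
    if base:
        args.append("--base")
    if diff:
        args.append("--diff")
    overrides = (("--base_root", base_root), ("--diff_root", diff_root),
                 ("--arch", arch), ("--build", build))
    for flag, value in overrides:
        if value is not None:
            args += [flag, value]
    return args
```

For example, `jit_diff_args(base=False, arch="x86")` yields the arguments for `jit-diff diff --diff --arch x86`, and the list can be handed to `subprocess.run`.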
The various simplified jit-diff invocations above can also be used to invoke diffs for jitted code by adding `--pmi` as an additional argument. For example:
Analyze difference in jit codegen for methods in corelib:
c:\coreclr> jit-diff diff --pmi --diff --base --base_root e:\coreclr2
Or, disassemble the jitted code for all the methods in `mytest.exe`:
c:\coreclr> jit-diff diff --pmi --diff --assembly mytest.exe
Note that this latter run should produce disassembly similar to that from running `mytest.exe` via corerun (with appropriate DOTNET_ flags set) for the methods that are executed during the run. But `jit-diff diff --pmi` will attempt to show code generated for all methods, executed or not. It also works on libraries that are not directly executable on their own. So PMI offers a potentially faster and more comprehensive view of jit codegen.
jit-diff takes an optional `--tag` command-line argument. This tag can be used to label different directories of `*.dasm` in the output directory so multiple runs can be done.
This supports a scenario like the following:
- Build base CoreCLR
- Produce baseline diffs by invoking the tool with '--base'
- Make changes to CoreCLR JIT subdirectory to fix a bug.
- Produce tagged output by invoking `jit-diff diff --diff ... --tag bugfix1`
- Make changes to CoreCLR JIT subdirectory to address review feedback/throughput issue.
- Produce tagged output by invoking `jit-diff diff --diff ... --tag reviewed1`
- Address more review feedback in CoreCLR JIT.
- Produce tagged output by invoking `jit-diff diff --diff ... --tag reviewed_final`
- ...
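Once several tagged runs exist, any baseline can be compared against any tagged diff by running jit-analyze on the two directories. The Python sketch below builds that command line. It assumes each run's output lives in `<output>/<name>/base` or `<output>/<name>/diff`, where `<name>` is a tag or a default `dasmset_N` name, and the helper name `analyze_args` is hypothetical.

```python
from pathlib import Path


def analyze_args(output_root, base_name, diff_name):
    """Build a jit-analyze command comparing two runs under one output root.

    base_name and diff_name are the per-run subdirectory names (a --tag value
    or a default dasmset_N name). The base/diff layout is an assumption.
    """
    root = Path(output_root)
    return [
        "jit-analyze",
        "--base", str(root / base_name / "base"),
        "--diff", str(root / diff_name / "diff"),
        "--recursive",
    ]
```

With the scenario above, `analyze_args(output_root, "dasmset_1", "bugfix1")` would compare the baseline from a first untagged run against the `bugfix1` output; the `dasmset_1` name here is only an example.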
The jitutils suite includes the jit-analyze tool for analyzing diffs produced by the jit-diff/jit-dasm utilities. It is automatically run, by default, when `jit-diff diff --base --diff` is used.
jit-analyze cracks the generated baseline and diff `*.dasm` files and computes the code size difference between the two based on the output produced by the JIT. This data is keyed by file and method name. For instance, two files with different names will not be compared even if passed as the base and diff, since the tool is looking to identify files missing from the base dataset versus the diff dataset.
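To make the keying concrete, here is a minimal Python sketch of a comparison keyed by file and method name. It is not jit-analyze's implementation. The function name `size_deltas` and the dictionary-based input format are assumptions chosen for illustration.

```python
def size_deltas(base_sizes, diff_sizes):
    """Compare two {(file, method): code_size_in_bytes} dictionaries.

    Returns (deltas, only_in_base, only_in_diff). deltas maps each key present
    in both inputs to diff - base and omits entries whose size is unchanged.
    """
    common = base_sizes.keys() & diff_sizes.keys()
    deltas = {
        key: diff_sizes[key] - base_sizes[key]
        for key in common
        if diff_sizes[key] != base_sizes[key]
    }
    # Keys without a counterpart are reported, not compared.
    only_in_base = sorted(base_sizes.keys() - diff_sizes.keys())
    only_in_diff = sorted(diff_sizes.keys() - base_sizes.keys())
    return deltas, only_in_base, only_in_diff
```

Summing `deltas.values()` gives a total in the same sense as the summary's total bytes of diff, where a negative number is an improvement.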
For the simplest case, just point the tool at a base and diff directory produced by jit-diff and it will summarize code size differences across the whole diff. This is what the jit-diff command lines in the previous section do.
On a significant set of diffs it will produce output like the following:
$ jit-analyze --base ~/Work/dotnet/output/base --diff ~/Work/dotnet/output/diff --recursive
Found files with textual diffs.
Summary:
(Note: Lower is better)
Total bytes of diff: -4124
diff is an improvement.
Top file regressions by size (bytes):
193 : Microsoft.CodeAnalysis.dasm
154 : System.Dynamic.Runtime.dasm
60 : System.IO.Compression.dasm
43 : System.Net.Security.dasm
43 : System.Xml.ReaderWriter.dasm
Top file improvements by size (bytes):
-1804 : mscorlib.dasm
-1532 : Microsoft.CodeAnalysis.CSharp.dasm
-726 : System.Xml.XmlDocument.dasm
-284 : System.Linq.Expressions.dasm
-239 : System.Net.Http.dasm
21 total files with size differences.
Top method regressions by size (bytes):
328 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.DocumentationCommentXmlTokens:.cctor()
266 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.MethodTypeInferrer:Fix(int,byref):bool:this
194 : mscorlib.dasm - System.DefaultBinder:BindToMethod(int,ref,byref,ref,ref,ref,byref):ref:this
187 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParseModifiers(ref):this
163 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceAssemblySymbol:DecodeWellKnownAttribute(byref,int,bool):this
Top method improvements by size (bytes):
-160 : System.Xml.XmlDocument.dasm - System.Xml.XmlTextWriter:AutoComplete(int):this
-124 : System.Xml.XmlDocument.dasm - System.Xml.XmlTextWriter:WriteEndStartTag(bool):this
-110 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.MemberSemanticModel:GetEnclosingBinder(ref,int):ref:this
-95 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.CSharpDataFlowAnalysis:AnalyzeReadWrite():this
-85 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParseForStatement():ref:this
3762 total methods with size differences.
If `--tsv <file_name>` or `--json <file_name>` is passed, all the diff data extracted and analyzed will be written out to the specified file for further analysis.
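If only the console summary is available, the headline numbers can still be pulled out with a little text processing. The Python sketch below parses text shaped like the sample output above. The helper name `parse_summary` is hypothetical, and the patterns are tied to that sample rather than to a documented output format.

```python
import re

# Patterns modelled on the sample summary; the format itself is an assumption.
_TOTAL = re.compile(r"^[ \t]*Total bytes of diff:[ \t]*(-?\d+)", re.MULTILINE)
_FILE_ENTRY = re.compile(r"^[ \t]*(-?\d+)[ \t]*:[ \t]*(\S+\.dasm)[ \t]*$",
                         re.MULTILINE)


def parse_summary(text):
    """Return (total_bytes, {file_name: byte_delta}) from a jit-analyze summary.

    Only per-file lines are collected; per-method lines carry a method name
    after the file name and therefore do not match.
    """
    total = _TOTAL.search(text)
    if total is None:
        raise ValueError("no 'Total bytes of diff' line found")
    per_file = {name: int(delta) for delta, name in _FILE_ENTRY.findall(text)}
    return int(total.group(1)), per_file
```

For structured data, prefer the `--tsv` or `--json` outputs; this sketch deliberately avoids guessing at their schema.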
jit-analyze command line help:
$ jit-analyze --help
usage: jit-analyze [-b <arg>] [-d <arg>] [-r] [-c <arg>] [-w]
[--json <arg>] [--tsv <arg>]
-b, --base <arg> Base file or directory.
-d, --diff <arg> Diff file or directory.
-r, --recursive Search directories recursively.
-c, --count <arg> Count of files and methods (at most) to output
in the summary. (count) improvements and
(count) regressions of each will be included.
(default 5)
-w, --warn Generate warning output for files/methods that
only exists in one dataset or the other (only
in base or only in diff).
--json <arg> Dump analysis data to specified file in JSON
format.
--tsv <arg> Dump analysis data to specified file in
tab-separated format.
See the document configuring defaults for details on setting up a set of default configurations.
This is a general tool to produce assembly output via prejitting for compiled MSIL assemblies.
Sample help command line:
$ jit-dasm --help
usage: jit-dasm [--altjit <arg>] [-c <arg>] [-j <arg>] [-o <arg>]
[-f <arg>] [--gcinfo] [-v] [-r] [-p <arg>...] [--]
<assembly>...
--altjit <arg> If set, the name of the altjit to use
(e.g., clrjit_win_arm64_x64.dll).
-c, --crossgen <arg> The crossgen or crossgen2 compiler exe.
-j, --jit <arg> The full path to the jit library.
-o, --output <arg> The output path.
-f, --file <arg> Name of file to take list of assemblies
from. Both a file and assembly list can
be used.
--gcinfo Add GC info to the disasm output.
-v, --verbose Enable verbose output.
-r, --recursive Scan directories recursively.
-p, --platform <arg>... Path to platform assemblies
<assembly>... The list of assemblies or directories to
scan for assemblies.
This is a general tool to produce jitted assembly output for compiled MSIL assemblies.
Sample help command line:
$ jit-dasm-pmi --help
usage: jit-dasm-pmi [--altjit <arg>] [-c <arg>] [-j <arg>] [-o <arg>]
[-f <arg>] [--gcinfo] [-v] [-r] [-p <arg>...] [--]
<assembly>...
--altjit <arg> If set, the name of the altjit to use
(e.g., protononjit.dll).
-c, --corerun <arg> The corerun driver exe.
-j, --jit <arg> The full path to the jit library.
-o, --output <arg> The output path.
-f, --file <arg> Name of file to take list of assemblies
from. Both a file and assembly list can
be used.
--gcinfo Add GC info to the disasm output.
-v, --verbose Enable verbose output.
-r, --recursive Scan directories recursively.
-p, --platform <arg>... Path to platform assemblies
<assembly>... The list of assemblies or directories to
scan for assemblies.