Enable riscv64 R2R in installer #96941

Merged 10 commits on Mar 4, 2024

Member

@am11 am11 commented Jan 13, 2024

Linker (cecil) was running into infinite recursion last week. Testing / trying to figure out what's missing.

@ghost ghost added the community-contribution label Jan 13, 2024
@am11 am11 added the area-ReadyToRun-coreclr and arch-riscv labels and removed the area-Setup and community-contribution labels Jan 13, 2024
@jkotas jkotas changed the title from "Enalbe riscv64 R2R in installer" to "Enable riscv64 R2R in installer" Jan 13, 2024
@am11
Member Author

am11 commented Jan 13, 2024

clr+libs+packs build logs. It gets stuck in this loop (where linker calls cecil):

td = typeReference.Resolve ();

  crossgen2_publish -> /runtime/artifacts/bin/crossgen2_publish/riscv64/Release/crossgen2.dll
  Optimizing assemblies for size. This process might take a while.
  Stack overflow.
     at Mono.Cecil.TypeReference.get_DeclaringType()
     at Mono.Cecil.TypeReference.get_IsNested()
     at Mono.Cecil.MetadataResolver.GetTypeDefinition(Mono.Cecil.ModuleDefinition, Mono.Cecil.TypeReference)
     at Mono.Cecil.MetadataResolver.GetType(Mono.Cecil.ModuleDefinition, Mono.Cecil.TypeReference)
     at Mono.Cecil.MetadataResolver.Resolve(Mono.Cecil.TypeReference)
     at Mono.Cecil.ModuleDefinition.Resolve(Mono.Cecil.TypeReference)
     at Mono.Cecil.ExportedType.Resolve()
     at Mono.Cecil.MetadataResolver.GetType(Mono.Cecil.ModuleDefinition, Mono.Cecil.TypeReference)
     at Mono.Cecil.MetadataResolver.Resolve(Mono.Cecil.TypeReference)
     at Mono.Cecil.ModuleDefinition.Resolve(Mono.Cecil.TypeReference)
     at Mono.Cecil.ExportedType.Resolve()
...
     at Mono.Cecil.MetadataResolver.GetType(Mono.Cecil.ModuleDefinition, Mono.Cecil.TypeReference)
     at Mono.Cecil.MetadataResolver.Resolve(Mono.Cecil.TypeReference)
     at Mono.Cecil.ModuleDefinition.Resolve(Mono.Cecil.TypeReference)
     at Mono.Cecil.TypeReference.Resolve()
     at Mono.Linker.LinkContext.Resolve(Mono.Cecil.TypeReference)
     at Mono.Linker.Steps.MarkStep.MarkType(Mono.Cecil.TypeReference, Mono.Linker.DependencyInfo, System.Nullable`1<Mono.Linker.MessageOrigin>)
     at Mono.Linker.Steps.MarkStep.MarkType(Mono.Cecil.TypeReference, Mono.Linker.DependencyInfo, System.Nullable`1<Mono.Linker.MessageOrigin>)
     at Mono.Linker.Steps.MarkStep.ProcessMarkedPending()
     at Mono.Linker.Steps.MarkStep.Initialize()
     at Mono.Linker.Steps.MarkStep.Process(Mono.Linker.LinkContext)
     at Mono.Linker.Pipeline.ProcessStep(Mono.Linker.LinkContext, Mono.Linker.Steps.IStep)
     at Mono.Linker.Pipeline.Process(Mono.Linker.LinkContext)
     at Mono.Linker.Driver.Run(Mono.Linker.ILogger)
     at Mono.Linker.Driver.Main(System.String[])

@am11
Member Author

am11 commented Jan 22, 2024

Updating a few targets with @akoeplinger's help in SDK and installer fixed the issue with resolving the references.


Now R2R publishing is running into a codegen issue:

/runtime/src/coreclr/jit/lclvars.cpp:5350
Assertion failed 'codeGen->isFramePointerUsed()' in 'ILCompiler.AllowDuplicates:CompareAsEqual():this' during 'Linear scan register alloc' (IL size 2; hash 0x5cf1352e; FullOpts)

/runtime/src/coreclr/jit/lclvars.cpp:5350
Assertion failed 'codeGen->isFramePointerUsed()' in 'ILCompiler.DependencyAnalysisFramework.DependencyNode:Equals(System.Object):ubyte:this' during 'Linear scan register alloc' (IL size 10; hash 0xbf653aae; FullOpts)

/runtime/src/coreclr/jit/lclvars.cpp:5350
Assertion failed 'codeGen->isFramePointerUsed()' in 'ILCompiler.DependencyAnalysisFramework.DependencyNode:get_Marked():ubyte:this' during 'Linear scan register alloc' (IL size 15; hash 0xa73f8792; FullOpts)

Aborted
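
For triage, the assertion lines above can be condensed to one line per distinct failure. This is a minimal sketch, assuming the log keeps the exact `Assertion failed '<condition>' in '<method>' during '<phase>'` shape shown here; `summarize_asserts` is a hypothetical helper name, not something from the runtime repo.

```shell
#!/bin/sh
# Hypothetical helper: print "<phase> | <condition> | <method>" once per
# distinct JIT assertion found in the log file given as $1.
# Assumes the single-quoted fields never contain a single quote themselves.
summarize_asserts() {
    sed -n "s/^Assertion failed '\([^']*\)' in '\([^']*\)' during '\([^']*\)'.*/\3 | \1 | \2/p" "$1" | sort -u
}

# Usage (path is illustrative): summarize_asserts crossgen2-publish.log
```

Lines that do not start with `Assertion failed` (such as the `lclvars.cpp:5350` source locations) are skipped, and repeated hits on the same method collapse into one entry.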

@ashaurtaev, @gbalykov, getting the full command line is a bit tricky (strace, inotify to capture the short-lived .rsp in temp etc.), but since I already collected it, here is a short gist of how to get to the repro point:

# on ubuntu x64 machine 

# prereqs:
# sudo apt install -y curl libssl-dev cmake liblttng-ust-dev libkrb5-dev libz-dev lsb-release wget software-properties-common gnupg; curl -sSL https://apt.llvm.org/llvm.sh | sudo bash -s - 17 all

# build on host
$ git clone https://github.com/am11/runtime -b feature/crossgen2/riscv64 --single-branch --depth 1
$ cd runtime
# copy readymade rootfs from docker
$ docker run --rm -v "$(pwd)":/c mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-22.04-cross-riscv64 sh -c 'cp -r /crossrootfs /c/.tools'
# build in Debug
$ ROOTFS_DIR=$(pwd)/.tools/riscv64  ./build.sh clr+libs+packs -a riscv64 --cross

$ curl -sSL http://sprunge.us/7iPEth -o debug.sh
$ chmod +x debug.sh
$ ./debug.sh
# this will get you to lldb REPL

Then run in lldb-17:

(lldb) r
(lldb) clrstack -f
OS Thread Id: 0x1b337 (38)
        Child SP               IP Call Site
00007F908F7F9368 00007F90CFB3FA2D libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol12984 + 1
00007F908F7F9370 00007F90CFAAF0DB libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol11554 + 283
00007F908F7F93B0 00007F90CF7D7532 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol3192 + 610
00007F908F7FB470 00007F90CF918228 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol6675 + 1192
00007F908F7FB4E0 00007F90CF916334 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol6671 + 436
00007F908F7FB510 00007F90CF919D32 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol6698 + 130
00007F908F7FB540 00007F90CF7A833F libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2269 + 31
00007F908F7FB580 00007F90CF95FA8F libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol7501 + 703
00007F908F7FB5E0 00007F90CF97C655 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol7844 + 485
00007F908F7FB780 00007F90CF95638B libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol7466 + 155
00007F908F7FB7B0 00007F90CF7BAA1F libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2611 + 31
00007F908F7FB7D0 00007F90CF7BA9E9 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2610 + 25
00007F908F7FB7F0 00007F90CF9D4BCC libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol8542 + 76
00007F908F7FB850 00007F90CF7ABCB9 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2318 + 73
00007F908F7FB8C0 00007F90CF7AA9E9 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2282 + 7289
00007F908F7FC230 00007F90CF7AFFCD libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2364 + 4173
00007F908F7FC350 00007F90CF7AE1DC libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2357 + 92
00007F908F7FC380 00007F90CF7ADB25 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2344 + 2277
00007F908F7FC590 00007F90CF7B879E libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2554 + 398
00007F908F7FC5D0 00007F90CF7B1FA1 libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2405 + 113
00007F908F7FC670 00007F90CF7B1C2F libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2400 + 527
00007F908F7FCD10 00007F90CF7C20CE libclrjit_unix_riscv64_x64.so!___lldb_unnamed_symbol2775 + 366
00007F908F7FCE10 00007F911457B539 libjitinterface_x64.so!JitCompileMethod + 185
00007F908F7FCEB0 00007FD1C3CD52B9 
00007F908F7FCED0                  [InlinedCallFrame: 00007f908f7fced0] ILCompiler.ReadyToRun.dll!Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
00007F908F7FCED0                  [InlinedCallFrame: 00007f908f7fced0] ILCompiler.ReadyToRun.dll!Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
00007F908F7FCEB0 00007FD1C3CD52B9 ILCompiler.ReadyToRun.dll!ILStubClass.IL_STUB_PInvoke(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef) + 537
00007F908F7FD040 00007FD1C3CD151F ILCompiler.ReadyToRun.dll!Internal.JitInterface.CorInfoImpl.CompileMethodInternal(ILCompiler.DependencyAnalysis.IMethodNode, Internal.IL.MethodIL) + 399
00007F908F7FD2C0 00007FD1C3CCE01A ILCompiler.ReadyToRun.dll!Internal.JitInterface.CorInfoImpl.CompileMethod(ILCompiler.DependencyAnalysis.ReadyToRun.MethodWithGCInfo, ILCompiler.Logger) + 3386
00007F908F7FD780 00007FD1C3CCA822 ILCompiler.ReadyToRun.dll!ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass50_0.<ComputeDependencyNodeDependencies>g__CompileOneMethod|5(ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>, Int32) + 1570
00007F908F7FDA50 00007FD1C3CCA191 ILCompiler.ReadyToRun.dll!ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass50_0.<ComputeDependencyNodeDependencies>g__CompileOnThread|4(Int32) + 321
00007F908F7FDAC0 00007FD1C3CC9FED ILCompiler.ReadyToRun.dll!ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass50_0.<ComputeDependencyNodeDependencies>g__CompilationThread|3(System.Object) + 205
00007F908F7FDB10 00007FD24129C8F7 libcoreclr.so!___lldb_unnamed_symbol10221 + 124
00007F908F7FDB30 00007FD2410D47C6 libcoreclr.so!___lldb_unnamed_symbol4878 + 246
00007F908F7FDBC0 00007FD2410EA212 libcoreclr.so!___lldb_unnamed_symbol5108 + 146
00007F908F7FDC10 00007FD2410A3405 libcoreclr.so!___lldb_unnamed_symbol4256 + 309
00007F908F7FDC50                  [DebuggerU2MCatchHandlerFrame: 00007f908f7fdc50] 
00007F908F7FDD20 00007FD2410A39CD libcoreclr.so!___lldb_unnamed_symbol4257 + 45
00007F908F7FDD50 00007FD2410EA2E8 libcoreclr.so!___lldb_unnamed_symbol5109 + 184
00007F908F7FDDB0 00007FD24140F04E libcoreclr.so!___lldb_unnamed_symbol16024 + 510
00007F908F7FDE60 00007FD241654AC3 libc.so.6!___lldb_unnamed_symbol3481 + 755
00007F908F7FDF00 00007FD2416E5A04 libc.so.6!__clone + 68
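
For the short-lived .rsp files mentioned at the start of this comment, a polling copy is one low-tech alternative to strace or inotify. The sketch below is not the method used to collect the repro; `snapshot_rsp` and `watch_rsp` are hypothetical names, and the watched directory (for example `${TMPDIR:-/tmp}`) is an assumption about where the build drops its response files. Polling can miss a file that lives for less than one interval, and the glob does not descend into subdirectories.

```shell
#!/bin/sh
# Hypothetical helper: copy every *.rsp currently in directory $1 into
# directory $2. Files already captured are left untouched.
snapshot_rsp() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for f in "$src"/*.rsp; do
        [ -e "$f" ] || continue              # glob matched nothing
        name=$(basename "$f")
        # The source may vanish between the check and the copy; ignore that.
        [ -e "$dst/$name" ] || cp "$f" "$dst/$name" 2>/dev/null || true
    done
}

# Hypothetical helper: poll $1 into $2 for as long as process $3 is alive.
# Fractional sleep works with GNU coreutils but is not guaranteed by POSIX.
watch_rsp() {
    while kill -0 "$3" 2>/dev/null; do
        snapshot_rsp "$1" "$2"
        sleep 0.1
    done
    snapshot_rsp "$1" "$2"                   # one last pass after exit
}

# Usage (illustrative): start the build in the background, then
#   watch_rsp "${TMPDIR:-/tmp}" ./captured-rsp "$!"
```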

@gbalykov
Member

gbalykov commented Jan 22, 2024

@am11 please note that crossgen2 for RISC-V is still under development and #95188 added just initial support (current state in main branch for reference #95188 (comment)).

@am11
Member Author

am11 commented Jan 22, 2024

@gbalykov, thanks, I'm aware of that PR, which is why I pinged you guys. :) This is just a datapoint for R2R publishing of a real-world app (crossgen2 itself). Most (if not all) of the moving pieces from runtime->sdk->installer->runtime are set to test regular R2R publishing with dotnet-publish.
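
For readers unfamiliar with what "regular R2R publishing" means here, this is roughly the command line an end user would run. It is a sketch only: `r2r_publish_cmd` is a hypothetical helper that prints the command rather than running it, and the `Release` configuration and `linux-riscv64` RID defaults are assumptions for illustration. Whether the publish succeeds depends on the SDK in use knowing the riscv64 packs.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a dotnet publish command line
# that requests ReadyToRun compilation.
#   $1 = runtime identifier (assumed default: linux-riscv64)
#   $2 = build configuration (assumed default: Release)
r2r_publish_cmd() {
    rid=${1:-linux-riscv64}
    config=${2:-Release}
    printf 'dotnet publish -c %s -r %s -p:PublishReadyToRun=true\n' "$config" "$rid"
}

# Usage: run from a project directory, e.g.
#   eval "$(r2r_publish_cmd)"
```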

@am11
Member Author

am11 commented Jan 30, 2024

SIGABRT and other issues are fixed by #97368, yay! With the current state of this PR, the packs subset succeeds; the crossgen2 package itself gets published with PublishReadyToRun=true.

We now need to wait for the preview 1 update at its scheduled time (estimated mid-Feb). Once it is updated in the main branch, I'll revert the global.json changes made here, merge main, and mark the PR ready for review.

@am11 am11 marked this pull request as ready for review February 28, 2024 11:44
@am11
Member Author

am11 commented Feb 28, 2024

#98476 was merged; this is now ready. cc @dotnet/runtime-infrastructure

@am11
Member Author

am11 commented Feb 28, 2024

Failures are known according to build analysis.

@jkotas
Member

jkotas commented Feb 28, 2024

@gbalykov Could you please review and sign off on this change?

@@ -69,5 +69,11 @@
<UnofficialBuildRID Include="linux-musl-ppc64le">
<Platform>ppc64le</Platform>
</UnofficialBuildRID>
<UnofficialBuildRID Include="linux-riscv64">
<Platform>riscv64</Platform>
Member

Am I right that things like this and the ones in eng/targetingpacks.targets will help with build issues like the one in #97791 (There is no application host available for the specified RuntimeIdentifier 'linux-riscv64')?

Also, after the SDK update to 9.0.100-preview.1 (#98476), a similar issue started to occur during the coreclr tests build:

/home/runtime/src/tests/Common/test_dependencies_fs/test_dependencies.fsproj : error NU1101: Unable to find package Microsoft.NETCore.App.Runtime.linux-riscv64. No packages exist with this id in source(s): /home/runtime/.dotnet/sdk/9.0.100-preview.1.24101.2/FSharp/library-packs, dotnet-eng, dotnet-libraries, dotnet-libraries-transport, dotnet-public, dotnet-tools, dotnet9, dotnet9-transport [/home/runtime/src/tests/build.proj]

Might also be related to #96978, even though Crossgen2 should currently be disabled?

Member Author

Yup, it will help those.

@@ -11,8 +11,6 @@
<PublishReadyToRun Condition="'$(TargetOS)' == 'netbsd' or '$(TargetOS)' == 'illumos' or '$(TargetOS)' == 'solaris' or '$(TargetOS)' == 'haiku'">false</PublishReadyToRun>
<!-- Disable crossgen on FreeBSD when cross building from Linux. -->
<PublishReadyToRun Condition="'$(TargetOS)'=='freebsd' and '$(CrossBuild)'=='true'">false</PublishReadyToRun>
<!-- Disable crossgen on riscv64. -->
Member

Not sure about removal of this right now, because currently crossgen2 is still not fully functional. Crossgen2 itself is compiled with crossgen2 with no issues, but then it crashes during launch.

Member Author

We don't use this crossgen2 build in tests yet (#80154). So the tests will continue to use dotnet crossgen2.dll until (at least) preview 2 when arm NativeAOT changes are in.

In the NuGet package for end users, we will have the PublishReadyToRun variant with this PR, which makes sense since the whole riscv64 project is in progress?

Member

Ok, I think this is fine. We usually use crossgen2.dll directly, so we'll be able to continue doing so. And having crossgened crossgen2 in nuget is not much of a difference, since plain crossgen2 in nuget can also show issues on certain dlls.

Member

@gbalykov gbalykov left a comment

Looks good to me

@akoeplinger akoeplinger merged commit 23b4030 into dotnet:main Mar 4, 2024
188 checks passed
@am11 am11 deleted the feature/crossgen2/riscv64 branch March 4, 2024 13:51
@github-actions github-actions bot locked and limited conversation to collaborators Apr 4, 2024
Labels
arch-riscv Related to the RISC-V architecture area-ReadyToRun-coreclr
8 participants