-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CodeGen for defaulted collection expression elements (Span via InlineArray) #74184
Comments
Why would you assume that? The user has explicitly asked the compiler to assign a value here, why should the compiler omit that? Consider a case where the user is say getting an array back from a pool and wants ensure parts of it are initialized. In that case the default value certainly cannot be omitted. |
In general you are absolutely right. I mean the spefific Span lowering via inline array. Inline Array structs are already default initialized after construction. So these assignments are noops which can be removed / optimized away. |
What if the user has disabled locals init behavior and are using the defaults to force initialization? |
Right, then it can't be optimized out. |
@jaredpar could you explain the situation when it can't be optimized out? The inline array is constructed by the compiler - not by the user, and the compiler emits the In my eyes the assignment of constant default values can be omitted always for inline array to span construction. This code can be using System;
foreach (var item in (Span<int>)[0, 1, 0]){
Console.WriteLine(item);
} lowered to: using System;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Permissions;
internal class Program
{
[SkipLocalsInit]
private static void Main(string[] args)
{
y__InlineArray3<int> buffer = default(y__InlineArray3<int>);
//not needed
//PrivateImplementationDetails.InlineArrayElementRef<y__InlineArray3<int>, int>(ref buffer, 0) = 0;
PrivateImplementationDetails.InlineArrayElementRef<y__InlineArray3<int>, int>(ref buffer, 1) = 1;
//not needed
//PrivateImplementationDetails.InlineArrayElementRef<y__InlineArray3<int>, int>(ref buffer, 2) = 0;
Span<int> s = PrivateImplementationDetails.InlineArrayAsSpan<y__InlineArray3<int>, int>(ref buffer, 3);
foreach (var item in s){
Console.WriteLine(item);
}
}
}
[CompilerGenerated]
internal sealed class PrivateImplementationDetails
{
internal static Span<TElement> InlineArrayAsSpan<TBuffer, TElement>(ref TBuffer buffer, int length)
{
return MemoryMarshal.CreateSpan(ref Unsafe.As<TBuffer, TElement>(ref buffer), length);
}
internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
{
return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
}
}
[StructLayout(LayoutKind.Auto)]
[InlineArray(3)]
internal struct y__InlineArray3<T>
{
[CompilerGenerated]
private T _element0;
} |
I've listed a few already in the issue. Which one is unclear?
Optimizations need to be treated like features. The burden is to demonstrate that it is safe, not intuit that it's safe. For example explicitly listing the scenarios in which the optimization will occur and why specifically it's legal in that scenario. For example in the Span scenario you have to demonstrate that the memory which is not written will be guaranteed to have the default value already. Also: optimizations are not free. They are a significant source of regressions. That means there is also the burden of demonstrating this optimization is valuable / worth the risk trade off. I've seen no evidence of that in this issue. |
That doesn't matter because the inline-array-struct is explicitly defaulted. If it wouldn't I agree.
That is a very different case. I talk only about span-collection-expression lowering via inline-array, nothing else. I added a draft-PR already which seems to fail only for some code gen builder tests, where a temporary inline array is created for the collection builder. For heap / regular arrays this is already implemented, see: EmitArrayInitializer. I think we should do that for inline array, too. Especially because they are used only because of performance. |
I feel like you haven't fully responded to my last comment
Lacking these two items this issue and accompanying PRs cannot move forward. |
I mean this use case: using System;
using System.IO;
Stream someStream = null;
Span<byte> smallBuffer = [0,0,0,0,0,0,0,0,0,0];
someStream.Read(smallBuffer); As a user I expect that an inline array is constructed without any assigments. But as you see, today the compiler emits: using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Permissions;
[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints)]
[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification = true)]
[assembly: AssemblyVersion("0.0.0.0")]
[module: UnverifiableCode]
[module: RefSafetyRules(11)]
[CompilerGenerated]
internal class Program
{
private static void <Main>$(string[] args)
{
<>y__InlineArray9<byte> buffer = default(<>y__InlineArray9<byte>);
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 0) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 1) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 2) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 3) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 4) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 5) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 6) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 7) = 0;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray9<byte>, byte>(ref buffer, 8) = 0;
Span<byte> buffer2 = <PrivateImplementationDetails>.InlineArrayAsSpan<<>y__InlineArray9<byte>, byte>(ref buffer, 9);
((Stream)null).Read(buffer2);
}
}
[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
internal static Span<TElement> InlineArrayAsSpan<TBuffer, TElement>(ref TBuffer buffer, int length)
{
return MemoryMarshal.CreateSpan(ref Unsafe.As<TBuffer, TElement>(ref buffer), length);
}
internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
{
return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
}
}
[StructLayout(LayoutKind.Auto)]
[InlineArray(9)]
internal struct <>y__InlineArray9<T>
{
[CompilerGenerated]
private T _element0;
} And I want: using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Permissions;
[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints)]
[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification = true)]
[assembly: AssemblyVersion("0.0.0.0")]
[module: UnverifiableCode]
[module: RefSafetyRules(11)]
[CompilerGenerated]
internal class Program
{
private static void <Main>$(string[] args)
{
<>y__InlineArray9<byte> buffer = default(<>y__InlineArray9<byte>);
Span<byte> buffer2 = <PrivateImplementationDetails>.InlineArrayAsSpan<<>y__InlineArray9<byte>, byte>(ref buffer, 9);
((Stream)null).Read(buffer2);
}
}
[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
internal static Span<TElement> InlineArrayAsSpan<TBuffer, TElement>(ref TBuffer buffer, int length)
{
return MemoryMarshal.CreateSpan(ref Unsafe.As<TBuffer, TElement>(ref buffer), length);
}
internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
{
return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
}
}
[StructLayout(LayoutKind.Auto)]
[InlineArray(9)]
internal struct <>y__InlineArray9<T>
{
[CompilerGenerated]
private T _element0;
}
In my eyes it is obvious that those assignments are unnecessary. The inline array is already defaulted (though in this case I don't care about initialization at all). |
Because we do: default(<>y__InlineArray9<byte>) |
If I would use stackalloc the generated code looks much better, but it coulde issues with inlining or in loops... using System;
using System.IO;
Stream someStream = null;
Span<byte> smallBuffer = stackalloc byte[9];
someStream.Read(smallBuffer); using an inline array under the hood seems to be the better solution because it is safe. |
That is not real data. Data is profiles showing that the optimization actually improves performance. Depending on intuition like this can, and does, lead to real world regressions. Further it really helps to see data around impact. For example how often does this pattern happen in the real world? An optimization might be demonstrably better but if no real world code hits it then what is the point in doing it?
There needs to be more specifics than a single case. It needs to be much more detailed about when an optimization happens. For example this optimization cannot happen when init locals is disable. Further it needs to touch on what other implications this has. Consider for example #73269 where we want to be smarter about temp re-use in the compiler (leads to stack size savings in real world scenarios). This optimization directly conflicts with that one because this optimization is brittle in the face of temporary re-use as it would allow uninitialized data to leak through. Items like that need to be discussed. |
I actually did a benchmark for this specific use case and observed that such unnecessary assignments to defaulted struct-fields are optimized away by the runtime (for Release builds) as long as the struct construction is in the same method / stack-frame (without inlining - which is the case here). See for example: using System;
var test = new Test();
test.A = null;
test.B = default;
test.B = default;
Console.WriteLine(test.A);
Console.WriteLine(test.B);
struct Test{
public string A;
public int B;
public int C;
} This is compiled to: Program..ctor()
L0000: ret
Program.<Main>$(System.String[])
L0000: push ebp
L0001: mov ebp, esp
L0003: xor ecx, ecx
L0005: call dword ptr [0x13756f58]
L000b: xor ecx, ecx
L000d: call dword ptr [0x13756ee0]
L0013: pop ebp
L0014: ret Which is quite perfect! So the real benefit woule be (only) less work for the runtime optimization and binary-size reduction. Maybe strange is that such an optimization is not implemented for locally constructed arrays or classes. E.g.: using System;
var test = new Test();
test.A = null;
test.B = default;
test.B = default;
Console.WriteLine(test.A);
Console.WriteLine(test.B);
class Test{
public string A;
public int B;
public int C;
} is compiled to: ; Core CLR 8.0.624.26715 on x86
Program..ctor()
L0000: ret
Program.<Main>$(System.String[])
L0000: push ebp
L0001: mov ebp, esp
L0003: push esi
L0004: mov ecx, 0x277aca2c
L0009: call 0x06ec300c
L000e: mov esi, eax
L0010: xor ecx, ecx
L0012: mov [esi+4], ecx
L0015: mov [esi+8], ecx
L0018: mov [esi+8], ecx
L001b: mov ecx, [esi+4]
L001e: call dword ptr [0x13756f58]
L0024: mov ecx, [esi+8]
L0027: call dword ptr [0x13756ee0]
L002d: pop esi
L002e: pop ebp
L002f: ret
Test..ctor()
L0000: ret I guess the runtime assumes always other references to the heap storage. So I would understand if you say - well it is optimized enough by the runtime. |
The benchmark code I used was: using System;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;
var config = DefaultConfig.Instance;
var summary = BenchmarkRunner.Run <TestInlineArray>(config, args);
[DisassemblyDiagnoser]
//[RPlotExporter]
[SimpleJob(runtimeMoniker: BenchmarkDotNet.Jobs.RuntimeMoniker.Net80)]
//[SimpleJob(runtimeMoniker: BenchmarkDotNet.Jobs.RuntimeMoniker.Mono)]
public class TestInlineArray
{
[Benchmark]
public void CreateEmptySpanWithCollectionExpression()
{
Span<byte> bytes = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
UseSpan(bytes);
}
[Benchmark]
public void CreateEmptySpanWithInlineArrayLikeTheCompilerDoesDirectAndWithDisabledGitOptimization()
{
var array = CreateInlineArraySoTheJitDoesNotSeeItIsDefaulted();
array[0] = 0;
array[1] = 0;
array[2] = 0;
array[3] = 0;
array[4] = 0;
array[5] = 0;
array[6] = 0;
array[7] = 0;
array[8] = 0;
array[9] = 0;
array[10] = 0;
array[11] = 0;
array[12] = 0;
array[13] = 0;
array[14] = 0;
array[15] = 0;
array[16] = 0;
array[17] = 0;
array[18] = 0;
array[19] = 0;
array[20] = 0;
array[21] = 0;
array[22] = 0;
array[23] = 0;
array[24] = 0;
array[25] = 0;
array[26] = 0;
array[27] = 0;
array[28] = 0;
array[29] = 0;
UseSpan(array);
[MethodImpl(MethodImplOptions.NoInlining)]
InlineArrayX CreateInlineArraySoTheJitDoesNotSeeItIsDefaulted()
{
return new();
}
}
[Benchmark]
public void CreateEmptySpanWithInlineArrayDirect()
{
var array = new InlineArrayX();
Span<byte> bytes = array;
UseSpan(bytes);
}
[Benchmark]
public void CreateEmptySpanWithStackAlloc()
{
Span<byte> bytes = stackalloc byte[30];
UseSpan(bytes);
}
private void UseSpan(Span<byte> span)
{
foreach (var s in span)
{
if (s != 0)
{
throw new Exception();
}
}
}
[InlineArray(30)]
private struct InlineArrayX
{
public byte Data;
}
}
The result was:
|
Thansk fro the benchmark. Closing for now because it seems like the runtime is handling these cases fairly good today. Can re-open if we find this is not the case or find real world cases where this is a problem. |
Version Used:
Steps to Reproduce:
Compile the following code:
Expected Behavior:
I would assume the assignment of elements with a default value could be omitted.
Actual Behavior:
The text was updated successfully, but these errors were encountered: