Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure the scalar Reciprocal*Estimate APIs use AVX512 where possible #101800

Merged
merged 8 commits into from
May 4, 2024

Conversation

tannergooding
Copy link
Member

@tannergooding tannergooding commented May 2, 2024

This resolves #98732, resolves #101721, and resolves #101731

The general issue was that the TensorPrimitives APIs used AVX512 or SSE, depending on availability while the scalar APIs only used SSE. -- This was not reproducing everywhere as it only reproduces on AVX512 capable hardware, which not all CI machines are.

There is a subtle difference between the handling of denormalized values between the AVX512 and SSE instructions where AVX512 treats them negatives while SSE treats them as flushed to zero, which in turn causes a difference in result. This difference is fine and expected since this is explicitly an Estimate API and doesn't impact calls within a single process.

@stephentoub
Copy link
Member

stephentoub commented May 2, 2024

This resolves #98732, resolves #101721, and resolves #101731

Some of the failures linked in those issues were for other than the reciprocal estimates, e.g.

    System.Numerics.Tensors.Tests.SingleGenericTensorPrimitives.SpanSpanDestination_SpecialValues(tensorPrimitivesMethod: SpanSpanDestinationDelegate { Method = Void MaxNumber[Single](System.ReadOnlySpan`1[System.Single], System.ReadOnlySpan`1[System.Single], System.Span`1[System.Single]), Target = null }, expectedMethod: Func`3 { Method = Single MaxNumber(Single, Single), Target = null }, tolerance: null) [FAIL]
      Assert.All() Failure: 253 out of 256 items in the collection did not pass.
      [3]:   Item:  4
             Error: Assert.All() Failure: 18 out of 24 items in the collection did not pass.
                    [3]:  Item:  NaN
                          Error: Assert.Equal() Failure: Values differ
                                 Expected: 0.112421386
                                 Actual:   NaN
...

so this might not fully resolve it.

// isMagnitude - true if the intrinsic compares using the absolute value of the inputs
// isNumber - true if the intrinsic propagates the number; false for NaN
//
GenTree* Compiler::impEstimateIntrinsic(CORINFO_METHOD_HANDLE method,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC. @dotnet/jit-contrib, this did end up being a runtime issue of sorts.

There's a general assumption made by R2R that the managed implementation (for Corelib in particular) will be consistent across all paths. That was strictly not the case for the *Estimate APIs where they are non-deterministic across hardware by design. -- Put another way, R2R essentially assumes that Isa.IsSupported is equivalent to compOpportunisticallyDependsOn(Isa) for corelib, but it assumes its equivalent to compExactlyDependsOn(Isa) for other libraries.

As such, the only fix we have today is to move it into the JIT where we can explicitly use compExactlyDependsOn(Isa) instead and force things to be annotated correctly. This has a minor added bonus in that it improves codegen around the method slightly for more complex scenarios and makes it more consistent with the handling of other math APIs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be solved via ExactlyDependsOn attributes on C# side?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not from what I saw. The analyzer will actually warn in the case that you try to do something like that, because of the expectation that the paths are semantically equivalent.

It's detailed more in depth here with some explanations of what is safe vs not safe for corelib:

- Any use of a platform intrinsic in the codebase MUST be wrapped with a call to an associated IsSupported property. This wrapping MUST be done within the same function that uses the hardware intrinsic, OR the function which uses the platform intrinsic must have the `CompExactlyDependsOn` attribute used to indicate that this function will unconditionally call platform intrinsics of from some type.

Doing it in the JIT, which is how all the other APIs that map down to 1 instruction are handled, ended up being the simpler approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding so does this PR solve the non-determinism? R2R still may prejit functions with these math functions, right?

Copy link
Member Author

@tannergooding tannergooding May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so does this PR solve the non-determinism

No, the non-determinism is intentional for these APIs. They are explicitly estimating APIs that are allowed to differ in result based on the hardware they're running on. What this does fix is the non-determinism within a single process.

R2R still may prejit functions with these math functions, right?

The use of compExactlyDependsOn basically tells R2R that this method requires Avx512F and if it isn't present the functionality may differ. That will mark methods that call it as requiring themselves to be jitted.

If we made these recursive and therefore mustExpand, then R2R could leave a call to the method and only have the method jitted instead. But, that's a more complicated change since ReciprocalSqrtEstimate requires us to emit a call to System.Math.Sqrt in the worst case and requires adding handling for that to platforms that don't currently have hardware intrinsic support (like Arm32).

This is a relatively uncommon API (basically just used in specialized high-perf scenarios), so fixing the failing CI legs seemed more important than worrying about the handful of APIs that would be disqualified from R2R (none of which are in corelib or other core libraries, System.Numerics.Tensors is the only place in our own code).

@tannergooding tannergooding merged commit 64d537a into dotnet:main May 4, 2024
161 of 166 checks passed
@tannergooding tannergooding deleted the fix-98732 branch May 4, 2024 01:29
@EgorBo
Copy link
Member

EgorBo commented May 9, 2024

michaelgsharp pushed a commit to michaelgsharp/runtime that referenced this pull request May 9, 2024
…otnet#101800)

* Ensure the scalar Reciprocal*Estimate APIs use AVX512 where possible

* Ensure that TensorPrimitives.Reciprocal APIs consistently use AVX512

* Move the implementation of ReciprocalEstimate and ReciprocalSqrtEstimate to the JIT so R2R works

* Ensure the relevant math intrinsics are marked betterToExpand

* Apply formatting patch

* Fixing the method header for impEstimateIntrinsic

* Don't use the AVX512 paths for TensorPrimitives.Reciprocal*Estimate APIs on .NET 8

* Workaround an issue where the TensorPrimitives net8 tests are not running on net8
Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this pull request May 30, 2024
…otnet#101800)

* Ensure the scalar Reciprocal*Estimate APIs use AVX512 where possible

* Ensure that TensorPrimitives.Reciprocal APIs consistently use AVX512

* Move the implementation of ReciprocalEstimate and ReciprocalSqrtEstimate to the JIT so R2R works

* Ensure the relevant math intrinsics are marked betterToExpand

* Apply formatting patch

* Fixing the method header for impEstimateIntrinsic

* Don't use the AVX512 paths for TensorPrimitives.Reciprocal*Estimate APIs on .NET 8

* Workaround an issue where the TensorPrimitives net8 tests are not running on net8
@github-actions github-actions bot locked and limited conversation to collaborators Jun 9, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.