Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for llvm.vector.reduce.* intrinsics #2198

Merged
merged 9 commits into from
Nov 14, 2023

Conversation

bwlodarcz
Copy link
Contributor

@bwlodarcz bwlodarcz commented Nov 6, 2023

A set of llvm.vector.reduce.* intrinsics doesn't have straight forward operation equivalent on the SPIRV side. The commit needed be split on three sections: Integer, Floating-point and Comparison reductions.
On every category at first we need to extract each value from vector.
For Integer reductions we perform an operation on each pair of vector elements and repeat until there is only one value.
This way emitted code has the least number of dependencies between each operation and maximally utilizes instruction parallelism.
For FP reductions because of non associativity (due to rounding errors) we needed to emit sequential code in which every next operation is based on previous result. This solution is slow but correct. Potential way of improvement Floating-point is to use reassoc flag from -ffast-math.
For Comparison reductions we can utilize similar algorithm like in Integer section with the difference that now we need to emit Comp operation and OpSelect on which result is based like in cpp ternary operation. The rest is the same.

@bwlodarcz bwlodarcz force-pushed the llvm_vector_reduce branch 3 times, most recently from 42175d3 to a89ef46 Compare November 9, 2023 12:34
lib/SPIRV/SPIRVWriter.cpp Outdated Show resolved Hide resolved
lib/SPIRV/SPIRVWriter.cpp Outdated Show resolved Hide resolved
@@ -4023,6 +4024,90 @@ SPIRVValue *LLVMToSPIRVBase::transIntrinsicInst(IntrinsicInst *II,
transValue(II->getArgOperand(0), BB),
transValue(II->getArgOperand(1), BB), BB);
}
case Intrinsic::vector_reduce_add:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some details in a comment about the implementation here.

Thanks


target triple = "spir64-unknown-unknown"

; CHECK-SPIRV-DAG: TypeInt [[#I8TYPE:]] 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do appreciate the thoroughness of this test. But I am not sure if it is really necessary to check for all data types here. If the implementation passes/fails for a particular data type, is there a reason why it will have a different result for some other data type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO there is. The major necessity here comes from the fact that translator during translation is a stateful machine. This actually requires from the test to contain at least some kind of combination of types to be sure that at least in some ways this stateful behavior is covered by tests. The second reason is to be 100% sure that the types aren't changing of the implementation behavior. And third (imo. weakest) is that the api calls which are used by implementation are also tested for correctness in that context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about correctness as we are not actually running anything and creating results in the LIT tests. My argument is this. LIT tests that are being added here test specifically the sequence of SPIR-V instructions generated by 'your' implementation and if your implementation breaks one test that checks SPIR-V code generation for one type, it is going to break for all types.

Also, I am ok to have this discussion offline and not block this PR.

Thanks

@@ -0,0 +1,552 @@
; RUN: llvm-as %s -o %t.bc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the amount of effort that has already gone into writing this test, I will categorize this comment as 'feel free to ignore' and 'non-blocking'. Do we want to test the reverse translated LLVM code here?

Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO. no. These are covered by appropriate instruction tests written in the past - such effort will be redundant when such tests are in place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I do not think we have instruction level tests anywhere. Please feel free to point me to such tests.

Thanks

asudarsa
asudarsa previously approved these changes Nov 10, 2023
Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits. Overall change looks good. If possible, it will be a good idea to verify the functionality using E2E tests though it is not required for this PR to be approved.
Also, for future reference, I see two places where the complexity of the testing can be reduced: (1) I think it's sufficient to test for a single data type. If the implementation failed in one data type, i suspect it's going to fail for all (2) I also think testing for add/and/mul/or/xor can be represented using a single test as the implementation is same for all these ops. Similarly for smax/smin/umax/umin.

Good job on writing the tests.

Thanks

@asudarsa asudarsa dismissed their stale review November 10, 2023 04:15

This PR is still in Draft state. Sorry, i jumped the gun by approving. Will wait for it to move out of 'draft' state.

A set of llvm.vector.reduce.* intrinsics doesn't have straight forward
operation equivalent on the SPIRV side. The easiest solution to this
problem is to use scalar operation on each pair of vector elements
and repeat until there is only one value.
@bwlodarcz bwlodarcz marked this pull request as ready for review November 13, 2023 12:58
Copy link
Contributor

@MrSidims MrSidims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integer add/mul, or/and/xor and float add/mul are LGTM. Tomorrow will take a look at min/max

lib/SPIRV/SPIRVWriter.cpp Show resolved Hide resolved
@MrSidims MrSidims merged commit fe088cd into KhronosGroup:main Nov 14, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants