[SPIR-V Extension] fpbuiltin-max-error support #2056
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
…port_for_SPV_INTEL_fp_max_error_spec_extension
Draft mode until the SPIR-V Headers changes are checked in. Thanks
@MrSidims, @vmaksimo, @maksimsab, @LU-JOHN, @jgstarIntel Thanks
I think we've talked about this, but I can't remember the reasoning. Can you tell me why we need to translate this to metadata and not an attribute at the call site when translating from SPIR-V back to LLVM IR? The metadata is subject to being dropped, which would change the semantics of the call. If we must use metadata, the standard "fpmath" metadata has the same meaning. We decided not to use that with the llvm.fpbuiltin intrinsics because of the potential loss of semantics if the metadata is dropped. This is only really acceptable for operations that default to correctly rounded implementations.
Hi @vmustya, can you please provide your feedback here? I remember you mentioned during the 'spec' discussions that we use metadata here (instead of attributes). Can you please let us know if metadata can be replaced by attributes here? Thanks
The attributes look good enough to me. I've only voted against non-standard custom intrinsics.
Adding @GarveyJoe and @shuoniu-intel for more comments on this. Thanks
Since Victor is OK with using an attribute, I definitely think that's the right thing to do. We can't guarantee that metadata would not be lost before it is needed.
I have several high level questions to start with.
Hi @MrSidims
lib/SPIRV/SPIRVWriter.cpp
                                 SPIRVInstruction *I) {
      const bool AllowFPMaxError =
          BM->isAllowedToUseExtension(ExtensionID::SPV_INTEL_fp_max_error);
      bool IsLLVMFPBuiltin =
llvm.fpbuiltin.* is a set of new LLVM builtins introduced in the open-source SYCL compiler (intel/llvm#8134). They are not yet upstreamed to the LLVM.org compiler.
Support for translating these builtins currently relies on matching the name of the called function (as done here).
This name matching will be replaced with matching on the actual Intrinsic ID if and when the front-end compiler change is available in LLVM.org.
Can you please take a look at this and comment if this is an agreeable solution?
Thanks
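For readers following along, name-based matching of this kind can be sketched roughly as below. This is a minimal standalone illustration, not the code in this PR: the helper name `getFPBuiltinOpName` is hypothetical, and the assumed name layout `llvm.fpbuiltin.<op>.<type>` is inferred only from the `llvm.fpbuiltin.sin.f32` example in the PR description.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the translator): if Name looks like
// "llvm.fpbuiltin.<op>.<type>", return "<op>"; otherwise return nullopt.
std::optional<std::string> getFPBuiltinOpName(std::string_view Name) {
  constexpr std::string_view Prefix = "llvm.fpbuiltin.";
  // substr clamps to the string length, so short names are handled safely.
  if (Name.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;
  Name.remove_prefix(Prefix.size());
  // Drop the type suffix, e.g. ".f32"; keep everything if there is no dot.
  const std::string_view Op = Name.substr(0, Name.find('.'));
  if (Op.empty())
    return std::nullopt;
  return std::string(Op);
}
```

With this sketch, `getFPBuiltinOpName("llvm.fpbuiltin.sin.f32")` yields `"sin"`, while a name without the prefix yields no value. Matching on an intrinsic ID would make this string handling unnecessary.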
Perhaps it's worth starting an RFC thread on discourse.llvm.org first, to see if the intrinsics have a chance of getting accepted into llvm main.
Hi @svenvh
Thanks for the comment. There is already a thread open here: https://discourse.llvm.org/t/rfc-floating-point-accuracy-control/66018. Discussion is ongoing.
Thanks for pointing to the thread. I've skimmed over it, but it seems there hasn't been strong consensus yet and the discussion seems to have stalled. With that in mind, I'd probably prefer to avoid the llvm. prefix for the new "intrinsics", as they aren't LLVM intrinsics really.
Hi @svenvh
Thanks for looking at this. The builtin naming actually comes from intel/llvm#8134, and we expect it to be added to LLVM.org (llorg) at some point. The current PR has no control over the naming.
…asic math functions (add/sub/mul/div/rem) Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Overall LGTM
      break;
    SPIRVType *STy = transType(II->getType());
    std::vector<SPIRVValue *> Ops(1, transValue(II->getArgOperand(0), BB));
    auto ExtOp = StringSwitch<SPIRVWord>(OpName)
Is this supposed to work for the ESIMD SYCL programming model? I'm asking because not all backends support OpenCL extended instructions.
We do expect the backends to support at least the subset of instructions shown here.
Thanks
@vmustya just checking your opinion
Currently the IGC VC backend only supports the native_* subset of OpenCL extended instructions.
@asudarsa to consider changing math ext instructions to native_*
@andykaylor I'm talking about the OpenCL builtins described here: https://registry.khronos.org/SPIR-V/specs/1.0/OpenCL.ExtendedInstructionSet.100.mobile.html, native vs. non-native. AFAIK the IGC scalar compiler supports all of the builtins, while the vector compiler supports only the 'native' ones. I just want to ensure that we are on the same page and understand the consequences of merging an implementation that goes through the non-native builtins.
Performance-wise, this is what the spec says:
"The function may map to one or more native device instructions and will typically have better performance compared to the non native corresponding functions. Support for denormal values is implementation-defined for this function"
I can neither confirm nor deny this statement for Intel's or others' hardware.
@MrSidims My concern is for the case where we're trying to restrict accuracy beyond the normal SYCL requirements. For example, the cos() function normally only requires 4 ulp accuracy, but I might want to call it with a 1 ulp accuracy requirement. My understanding of the native_ OCL instructions is that native instructions may be used regardless of their accuracy. So if we're trying to require 1 ulp accuracy, using the native_ instructions isn't appropriate.
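To make the "4 ulp vs. 1 ulp" distinction concrete, here is a rough standalone sketch of how a result could be checked against a max-error bound. It is an illustration only: the helper names `ulpError` and `withinMaxError` are hypothetical, the ulp is taken as the spacing of floats at the reference value (one of several definitions in use), and finite, normal, nonzero values are assumed.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: error of Result against a higher-precision Reference,
// in units of the float spacing (ulp) at the reference value.
// Assumes finite, normal, nonzero inputs.
double ulpError(float Result, double Reference) {
  const float Rounded = static_cast<float>(Reference);
  // Next representable float away from zero, used to measure the spacing.
  const float Away =
      std::copysign(std::numeric_limits<float>::infinity(), Rounded);
  const float Next = std::nextafter(Rounded, Away);
  const double Ulp =
      std::fabs(static_cast<double>(Next) - static_cast<double>(Rounded));
  return std::fabs(static_cast<double>(Result) - Reference) / Ulp;
}

// True if Result satisfies a bound such as "fpbuiltin-max-error"="2.5".
bool withinMaxError(float Result, double Reference, double MaxErrorUlp) {
  return ulpError(Result, Reference) <= MaxErrorUlp;
}
```

Under this definition, a result that is three float steps away from the reference passes a 4 ulp bound but fails a 2.5 ulp or 1 ulp bound, which is why an implementation with unspecified accuracy cannot be used to honor a tighter request.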
@andykaylor thanks for the explanation! I just wanted to ensure that we understand that we are sacrificing portability (at least temporarily) and have a rationale for it.
@asudarsa please resolve the conflict and I'll merge the PR.
@MrSidims Yes, sacrificing portability when the accuracy controls are used is expected. I expect that the accuracy controls will only be used by advanced users who are trying to fine-tune their implementations. I hope that if the feature is successful more vendors will add support for it and the portability problem will be resolved.
LGTM
Would be nice if @vmustya @vmaksimo @jgstarIntel @LU-JOHN and @maksimsab also took a look
LGTM, just one comment
    SPIRVWord ID;
    if (Instruction *I = dyn_cast<Instruction>(V))
      if (BV->hasDecorate(DecorationFPMaxErrorDecorationINTEL, 0, &ID)) {
        auto Literals =
            BV->getDecorationLiterals(DecorationFPMaxErrorDecorationINTEL);
        assert(Literals.size() == 1 &&
               "FP Max Error decoration shall have 1 operand");
        auto F = convertSPIRVWordToFloat(Literals[0]);
        if (CallInst *CI = dyn_cast<CallInst>(I)) {
          // Add attribute
          auto A = llvm::Attribute::get(*Context, "fpbuiltin-max-error",
                                        std::to_string(F));
          CI->addFnAttr(A);
        } else {
          // Add metadata
          MDNode *N =
              MDNode::get(*Context, MDString::get(*Context, std::to_string(F)));
          I->setMetadata("fpbuiltin-max-error", N);
        }
        return true;
      }
Just a minor suggestion: wrap this in a separate function so that the code in transDecoration() is more consistent.
Signed-off-by: Sudarsanam, Arvind <arvind.sudarsanam@intel.com>
There is only a minor formatting issue which I will fix just before final merge. I think we can proceed with reviews/approvals. Thanks
This change adds SPIR-V translator support for the SPIR-V extension documented here: KhronosGroup/SPIRV-Registry#193.
This extension adds one decoration to represent maximum error for FP operations and adds the related Capability.
SPIRV Headers support for representing this in SPIR-V: KhronosGroup/SPIRV-Headers#363
intel/llvm#8134 added a new call-site attribute associated with FP builtin intrinsics. This attribute is named 'fpbuiltin-max-error'.
The following example shows how this extension is supported in the translator. The input LLVM IR uses new LLVM builtin calls to represent FP operations. An attribute named 'fpbuiltin-max-error' represents the maximum error allowed in the FP operation.
Example
Input LLVM:
%t6 = call float @llvm.fpbuiltin.sin.f32(float %f1) #2
attributes #2 = { "fpbuiltin-max-error"="2.5" }
This is translated into a SPIR-V instruction (for add/sub/mul/div/rem) or an OpenCL extended instruction (for other operations). A decoration representing the max-error is attached to the SPIR-V instruction.
SPIR-V code:
4 Decorate 97 FPMaxErrorDecorationINTEL 1075838976
6 ExtInst 2 97 1 sin 88
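The literal 1075838976 in the decoration above is the 32-bit pattern of the float 2.5 (0x40200000). The following standalone sketch shows one way to do that conversion and its inverse. It assumes a 32-bit IEEE 754 `float`; the helper names are hypothetical and are not the translator's own functions (the review snippet uses `convertSPIRVWordToFloat`, whose implementation is not shown here).

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: reinterpret a 32-bit float as a 32-bit word.
std::uint32_t floatToWord(float F) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch expects a 32-bit float");
  std::uint32_t W;
  std::memcpy(&W, &F, sizeof W); // bit copy, no numeric conversion
  return W;
}

// Hypothetical helper: the inverse of floatToWord.
float wordToFloat(std::uint32_t W) {
  float F;
  std::memcpy(&F, &W, sizeof F);
  return F;
}

// Formats the decoded value the way std::to_string(float) does, which is
// where a six-decimal string such as "2.500000" comes from.
std::string maxErrorString(std::uint32_t W) {
  return std::to_string(wordToFloat(W));
}
```

Here `floatToWord(2.5f)` gives 1075838976, and `maxErrorString(1075838976u)` gives "2.500000", matching the form of the strings in the translated LLVM code below.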
No new mechanism is added for translating this SPIR-V back to LLVM; existing support is used. The decoration is translated back into information attached to the LLVM instruction, which can be readily consumed by backends.
Based on input from @andykaylor, we emit an attribute when the FP operation is translated back to a call to a builtin function and emit named metadata otherwise.
Translated LLVM code for basic math functions (add/sub/mul/div/rem):
%t6 = fmul float %f1, %f2, !fpbuiltin-max-error !7
!7 = !{!"2.500000"}
Translated LLVM code for other math functions:
%t6 = call spir_func float @_Z3sinf(float %f1) #3
attributes #3 = { "fpbuiltin-max-error"="4.000000" }
Thanks