[SPIR-V Extension] fpbuiltin-max-error support #2056

Merged: 19 commits, Sep 5, 2023

@asudarsa asudarsa commented Jun 19, 2023

This change adds SPIR-V translator support for the SPIR-V extension documented here: KhronosGroup/SPIRV-Registry#193.
This extension adds one decoration to represent the maximum error for FP operations, plus the related Capability.
SPIRV-Headers support for representing this in SPIR-V: KhronosGroup/SPIRV-Headers#363

intel/llvm#8134 added a new call-site attribute associated with FP builtin intrinsics. This attribute is named 'fpbuiltin-max-error'.
The following example shows how this extension is supported in the translator. The input LLVM IR uses the new LLVM builtin calls to represent FP operations. An attribute named 'fpbuiltin-max-error' represents the maximum error allowed in the FP operation.
Example
Input LLVM:
%t6 = call float @llvm.fpbuiltin.sin.f32(float %f1) #2
attributes #2 = { "fpbuiltin-max-error"="2.5" }

This is translated into a SPIR-V instruction (for add/sub/mul/div/rem) or an OpenCL extended instruction (for other operations). A decoration representing the max error is attached to the SPIR-V instruction.

SPIR-V code:
4 Decorate 97 FPMaxErrorDecorationINTEL 1075838976
6 ExtInst 2 97 1 sin 88
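
The decoration literal 1075838976 above is the IEEE-754 single-precision bit pattern of 2.5 (0x40200000), stored as a 32-bit SPIR-V word. A minimal sketch of that encoding follows; `floatToWord` is a hypothetical helper name used for illustration, not a function from the translator.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the translator's API): copy the bits of a 32-bit
// float into an unsigned 32-bit word without changing them.
std::uint32_t floatToWord(float F) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  std::uint32_t W;
  std::memcpy(&W, &F, sizeof(W)); // well-defined alternative to a pointer cast
  return W;
}
```

Calling `floatToWord(2.5f)` gives the word shown in the Decorate line; `std::memcpy` is used rather than a cast so the sketch avoids aliasing problems.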

No new support is added for translating this SPIR-V back to LLVM; existing support is used. The decoration is translated back into named metadata associated with the LLVM instruction. This can be readily consumed by backends.

Based on input from @andykaylor, we emit attributes when the FP operation is translated back to a call to a builtin function and emit metadata otherwise.

Translated LLVM code for basic math functions (add/sub/mul/div/rem):
%t6 = fmul float %f1, %f2, !fpbuiltin-max-error !7
!7 = !{!"2.500000"}

Translated LLVM code for other math functions:
%t6 = call spir_func float @_Z3sinf(float %f1) #3
attributes #3 = { "fpbuiltin-max-error"="4.000000" }
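
The strings "2.500000" and "4.000000" in the translated IR have the six-fractional-digit form that `std::to_string` produces for a float, which is how the review snippet later in this thread formats the value. A rough sketch of the reverse path, from decoration literal to string, is below. `wordToFloat` and `maxErrorString` are hypothetical names standing in for the translator's own helpers.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for the translator's word-to-float conversion:
// reinterpret a 32-bit SPIR-V literal as an IEEE-754 single-precision value.
float wordToFloat(std::uint32_t W) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  float F;
  std::memcpy(&F, &W, sizeof(F));
  return F;
}

// Format the error bound as text. std::to_string uses "%f"-style formatting,
// so the result always carries six digits after the decimal point.
std::string maxErrorString(std::uint32_t Literal) {
  return std::to_string(wordToFloat(Literal));
}
```

Because the value travels as text, a consumer has to parse it back into a number before comparing error bounds.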

Thanks

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
…port_for_SPV_INTEL_fp_max_error_spec_extension
…port_for_SPV_INTEL_fp_max_error_spec_extension
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa marked this pull request as draft June 19, 2023 01:44
@asudarsa
Contributor Author

Draft mode till SPIR-V Header changes get checked in.

Thanks

@asudarsa
Contributor Author

@MrSidims, @vmaksimo, @maksimsab, @LU-JOHN, @jgstarIntel
Please take a look when convenient.

Thanks

@andykaylor

I think we've talked about this, but I can't remember the reasoning. Can you tell me why we need to translate this to metadata and not an attribute at the call site when translating from SPIR-V back to LLVM IR? The metadata is subject to being dropped, which would change the semantics of the call.

If we must use metadata, the standard "fpmath" metadata has the same meaning. We decided not to use that with the llvm.fpbuiltin intrinsics because of the potential loss of semantics if the metadata is dropped. This is only really acceptable for operations that default to correctly rounded implementations.

@asudarsa
Contributor Author

Hi @vmustya

Can you please provide your feedback here? I remember you mentioned that we use metadata here (instead of attributes) during the 'spec' discussions. Can you please let us know if metadata can be replaced by attributes here?

Thanks

@vmustya
Contributor

vmustya commented Jun 21, 2023

The attributes look good enough to me. I've only voted against non-standard custom intrinsics.

@asudarsa
Contributor Author

Adding @GarveyJoe and @shuoniu-intel for more comments on this. Thanks

@andykaylor

Since Victor is OK with using an attribute, I definitely think that's the right thing to do. We can't guarantee that metadata would not be lost before it is needed.

@MrSidims
Contributor

MrSidims commented Jul 3, 2023

I have several high-level questions to start with.

  0. The intrinsics do not exist in llvm.org. Do we really want this patch present here and not in intel/llvm, where these intrinsics are declared? If we want the patch here, shouldn't we guard their translation with an option, as is done for genx intrinsics? We can discuss it during the WG call.
  1. Are these intrinsics coming from some high-level user/library API or from optimizations? If they come only from high-level APIs, shouldn't we explore the SPIR-V friendly LLVM IR mechanism for this feature? See the following paragraphs: https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/docs/SPIRVRepresentationInLLVM.rst#id16 and https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/docs/SPIRVRepresentationInLLVM.rst#id9 . Using it, we can get rid of the intrinsics completely in case we don't plan to upstream them or the LLVM community doesn't want them. If the intrinsics can come from transformations, can't we live with just the attribute/metadata in the IR given to the SPIR-V translator, since we generate this output anyway?
  2. What happens if the attribute/metadata comes attached to an ocl_sin builtin call? Shouldn't we handle that as well?
  3. "We can't guarantee that metadata would not be lost before it is needed."
    The metadata is applied to an external function call. It's hard to imagine losing it unless we link two modules and then inline (which I assume can happen later in some BE). But won't the attribute be lost in this case as well?

@asudarsa
Contributor Author

asudarsa commented Jul 6, 2023

Hi @MrSidims
Thanks much for the feedback.
Please find answers inlined below.

  0. The intrinsics do not exist in llvm.org. Do we really want this patch present here and not in intel/llvm where these intrinsics are declared? If we want the patch here, shouldn't we guard their translation by an option like it's done for genx intrinsics? Can discuss it during WG call.
    ANSWER: This is a good suggestion. One point to note: this PR does not require the intrinsics to exist in llvm.org. We rely on string comparisons to check the names of called builtin functions, so these changes do work with llvm.org. However, I have no objection to moving it to intel/llvm and cherry-picking it to Khronos once the intrinsics are available in llvm.org.
  1. Are these intrinsics coming from some high-level user/library API or from optimizations? If it's coming only from high-level APIs shouldn't we explore SPIR-V friendly LLVM IR mechanism capabilities for this feature, see the following paragraphs: https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/docs/SPIRVRepresentationInLLVM.rst#id16 and https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/docs/SPIRVRepresentationInLLVM.rst#id9 , using it we can get rid of the intrinsics completely in case if we don't have plans to upstream or llvm community doesn't want them. If the intrinsics can come from transformations, can't we live with just attribute/metadata in the input to SPIR-V translator IR if we generate this output anyway?
    ANSWER: I think these builtins are generated by the compiler. Please see "Add new intrinsics and attributes to control accuracy of FP calls" (intel/llvm#8134).
    I think @andykaylor may be able to answer this better.
  2. What happens if the attribute/metadata came attached to ocl_sin builtin call? Shouldn't we handle it as well?
    ANSWER: Can you please provide an example, if possible? I did not understand this case. Sorry.
  3. We can't guarantee that metadata would not be lost before it is needed.
    The metadata is applied to external function call. It's hard to imagine losing it unless we link two modules with the following inlining (which I assume can happen later in some BE). But won't attribute be lost in this case as well?
    ANSWER: Based on my experiments, I think adding metadata is the better option, as we do end up with non-function FP operations in some cases and we cannot generate attributes for them.

SPIRVInstruction *I) {
const bool AllowFPMaxError =
BM->isAllowedToUseExtension(ExtensionID::SPV_INTEL_fp_max_error);
bool IsLLVMFPBuiltin =
Contributor Author

llvm.fpbuiltin.* is a set of new LLVM builtins introduced in the open-source SYCL compiler (intel/llvm#8134). They are not yet upstreamed to the llvm.org compiler.
Support for translating these builtins currently relies on matching the name of the called function (as done here).
This name matching will be replaced by matching on the actual intrinsic ID if and when the front-end compiler change is available in llvm.org.
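
To illustrate what name-based matching can look like, here is a minimal standalone sketch using plain `std::string` rather than the translator's own code. The helper names are hypothetical, and the `fadd`/`fsub`/`fmul`/`fdiv`/`frem` spellings are an assumption based on the add/sub/mul/div/rem list in the PR description; only `llvm.fpbuiltin.sin.f32` appears verbatim in this thread.

```cpp
#include <cstddef>
#include <string>

// Prefix shared by the builtins discussed in this thread.
static const std::string FPBuiltinPrefix = "llvm.fpbuiltin.";

// True when the callee name starts with the fpbuiltin prefix.
bool isFPBuiltinName(const std::string &Name) {
  return Name.compare(0, FPBuiltinPrefix.size(), FPBuiltinPrefix) == 0;
}

// Extract the operation part of the name, e.g.
// "llvm.fpbuiltin.sin.f32" -> "sin". Returns "" for non-fpbuiltin names.
std::string fpBuiltinOpName(const std::string &Name) {
  if (!isFPBuiltinName(Name))
    return "";
  const std::size_t Begin = FPBuiltinPrefix.size();
  const std::size_t End = Name.find('.', Begin);
  if (End == std::string::npos)
    return Name.substr(Begin);
  return Name.substr(Begin, End - Begin);
}

// Assumed spellings (not confirmed by this thread) of the operations that map
// to core SPIR-V arithmetic instructions instead of extended instructions.
bool isBasicArithmetic(const std::string &Op) {
  return Op == "fadd" || Op == "fsub" || Op == "fmul" || Op == "fdiv" ||
         Op == "frem";
}
```

String matching like this keeps the translator buildable against llvm.org, at the cost of depending on a naming convention rather than an intrinsic ID.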

@svenvh,

Can you please take a look at this and comment if this is an agreeable solution?

Thanks

Member

Perhaps it's worth starting an RFC thread on discourse.llvm.org first, to see if the intrinsics have a chance of getting accepted into llvm main.

Contributor Author

Hi @svenvh

Thanks for the comment. There is already a thread open here: https://discourse.llvm.org/t/rfc-floating-point-accuracy-control/66018. Discussion is ongoing.

Member

Thanks for pointing to the thread. I've skimmed over it, but it seems there hasn't been strong consensus yet and the discussion seems to have stalled. With that in mind, I'd probably prefer to avoid the llvm. prefix for the new "intrinsics", as they aren't LLVM intrinsics really.

Contributor Author

Hi @svenvh

Thanks for looking at this. The builtin naming actually comes from intel/llvm#8134, and we expect it to be added to llvm.org at some point. The current PR does not have control over the naming.

…asic math functions (add/sub/mul/div/rem)

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa marked this pull request as ready for review July 22, 2023 00:55
@asudarsa
Copy link
Contributor Author

Hi @svenvh and @MrSidims

SPIR-V headers and SPIR-V tools changes have been merged. We are now waiting on this PR. Can you please take a look when convenient?

Thanks
Sincerely

@asudarsa asudarsa requested a review from svenvh July 26, 2023 15:14
Contributor

@MrSidims MrSidims left a comment

Overall LGTM

lib/SPIRV/SPIRVReader.cpp (outdated, resolved)
break;
SPIRVType *STy = transType(II->getType());
std::vector<SPIRVValue *> Ops(1, transValue(II->getArgOperand(0), BB));
auto ExtOp = StringSwitch<SPIRVWord>(OpName)
Contributor

@MrSidims MrSidims Jul 27, 2023

Is this supposed to work for the ESIMD SYCL programming model? I'm asking because not all backends support OpenCL extended instructions.

Contributor Author

We do expect the backends to at least support the subset of instructions that are shown here.

Thanks

Contributor

@vmustya just checking your opinion

Contributor

Currently IGC VC backend only supports the native_* subset of OpenCL extended instructions.

Contributor

@asudarsa to consider changing math ext instructions to native_*

Contributor

@andykaylor I'm talking about the OpenCL builtins described here: https://registry.khronos.org/SPIR-V/specs/1.0/OpenCL.ExtendedInstructionSet.100.mobile.html (native vs. non-native). AFAIK the IGC scalar compiler supports all of the builtins, while the vector compiler supports only the 'native' ones. I just want to ensure that we are on the same page and understand the consequences of merging an implementation that goes through non-native builtins.

Contributor

@MrSidims MrSidims Aug 29, 2023

Performance-wise, this is what the spec says:
"The function may map to one or more native device instructions and will typically have better performance compared to the non native corresponding functions. Support for denormal values is implementation-defined for this function."
I can neither confirm nor deny this statement for Intel or other vendors' hardware.

@MrSidims My concern is for the case where we're trying to restrict accuracy beyond the normal SYCL requirements. For example, the cos() function normally only requires 4 ulp accuracy, but I might want to call it with a 1 ulp accuracy requirement. My understanding of the native_ OCL instructions is that native instructions may be used regardless of their accuracy. So if we're trying to require 1 ulp accuracy, using the native_ instructions isn't appropriate.
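
As a rough illustration of what a "1 ulp" versus "4 ulp" requirement means, the sketch below measures the error of a single-precision result against a higher-precision reference, in units of the float spacing at the reference value. This is one common way to define ULP error, not necessarily the definition the extension uses, and all function names here are hypothetical. It assumes a finite reference value.

```cpp
#include <cmath>
#include <limits>

// Distance between the reference (rounded to float) and the next float of
// larger magnitude. Assumes Ref is finite.
double ulpAt(double Ref) {
  const float R = std::fabs(static_cast<float>(Ref));
  const float Next = std::nextafter(R, std::numeric_limits<float>::infinity());
  return static_cast<double>(Next) - static_cast<double>(R);
}

// Absolute error of Result relative to Ref, expressed in float ULPs.
double errorInUlps(float Result, double Ref) {
  return std::fabs(static_cast<double>(Result) - Ref) / ulpAt(Ref);
}

// True when Result satisfies a max-error bound such as 1.0, 2.5, or 4.0.
bool withinMaxError(float Result, double Ref, double MaxUlps) {
  return errorInUlps(Result, Ref) <= MaxUlps;
}
```

Under this definition, a result that is two representable floats away from the reference passes a 2.5 bound but fails a 1.0 bound, which is the kind of tightening described in the comment above.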

Contributor

@andykaylor thanks for the explanation! I just wanted to ensure that we understand that we sacrifice portability (at least temporarily) and have a reason for it.
@asudarsa please resolve the conflict and I'll merge the PR.

@MrSidims Yes, sacrificing portability when the accuracy controls are used is expected. I expect that the accuracy controls will only be used by advanced users who are trying to fine-tune their implementations. I hope that if the feature is successful more vendors will add support for it and the portability problem will be resolved.

lib/SPIRV/SPIRVWriter.cpp (outdated, resolved)
lib/SPIRV/SPIRVWriter.cpp (resolved)
lib/SPIRV/SPIRVWriter.cpp (outdated, resolved)
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa requested a review from MrSidims July 29, 2023 01:27
Contributor

@MrSidims MrSidims left a comment

LGTM
Would be nice if @vmustya @vmaksimo @jgstarIntel @LU-JOHN and @maksimsab also took a look

Contributor

@vmaksimo vmaksimo left a comment

LGTM, just one comment

Comment on lines +3887 to +3907
SPIRVWord ID;
if (Instruction *I = dyn_cast<Instruction>(V))
  if (BV->hasDecorate(DecorationFPMaxErrorDecorationINTEL, 0, &ID)) {
    auto Literals =
        BV->getDecorationLiterals(DecorationFPMaxErrorDecorationINTEL);
    assert(Literals.size() == 1 &&
           "FP Max Error decoration shall have 1 operand");
    auto F = convertSPIRVWordToFloat(Literals[0]);
    if (CallInst *CI = dyn_cast<CallInst>(I)) {
      // Add attribute
      auto A = llvm::Attribute::get(*Context, "fpbuiltin-max-error",
                                    std::to_string(F));
      CI->addFnAttr(A);
    } else {
      // Add metadata
      MDNode *N =
          MDNode::get(*Context, MDString::get(*Context, std::to_string(F)));
      I->setMetadata("fpbuiltin-max-error", N);
    }
    return true;
  }
Contributor

Just a minor suggestion: wrap this in a separate function to keep the code in transDecoration() more consistent.

Signed-off-by: Sudarsanam, Arvind <arvind.sudarsanam@intel.com>
Signed-off-by: Sudarsanam, Arvind <arvind.sudarsanam@intel.com>
@asudarsa asudarsa requested a review from MrSidims August 31, 2023 20:28
@asudarsa
Contributor Author

There is only a minor formatting issue which I will fix just before final merge. I think we can proceed with reviews/approvals.

Thanks

Signed-off-by: Sudarsanam, Arvind <arvind.sudarsanam@intel.com>
@MrSidims MrSidims merged commit c6fe12b into KhronosGroup:main Sep 5, 2023
8 checks passed
6 participants