math_brute_force: stop relying on volatile for IsTininessDetectedBeforeRounding (#2038)

This makes it literally impossible for drivers to constant fold the
IsTininessDetectedBeforeRounding kernel. Sure, drivers arguably should
respect volatile here, but I'm not convinced the spec strictly requires
it, because no side effects are possible here in the first place.

And as far as I know, constant folding is allowed to give different
results than an actual GPU calculation would.

In any case, passing the constants via kernel arguments makes this
detection more reliable and one doesn't have to wonder why the fma test
is failing.

Side note: this was the last bug (known as of today) I had to fix in
order to be able to make a CL CTS submission for Apple Silicon devices.
karolherbst committed Aug 13, 2024
1 parent a406b34 commit 21ee05e
Showing 1 changed file with 20 additions and 3 deletions.
23 changes: 20 additions & 3 deletions test_conformance/math_brute_force/main.cpp
@@ -1043,13 +1043,14 @@ int IsTininessDetectedBeforeRounding(void)
 {
     int error;
     const char *kernelSource =
-        R"(__kernel void IsTininessDetectedBeforeRounding( __global float *out )
+        R"(__kernel void IsTininessDetectedBeforeRounding( __global float *out, float a, float b )
         {
-           volatile float a = 0x1.000002p-126f;
-           volatile float b = 0x1.fffffcp-1f;
            out[0] = a * b; // product is 0x1.fffffffffff8p-127
         })";

+    float a = 0x1.000002p-126f;
+    float b = 0x1.fffffcp-1f;
+
     clProgramWrapper query;
     clKernelWrapper kernel;
     error =
@@ -1073,6 +1074,22 @@ int IsTininessDetectedBeforeRounding(void)
         return error;
     }

+    if ((error = clSetKernelArg(kernel, 1, sizeof(a), &a)))
+    {
+        vlog_error("Error: Unable to set kernel arg to detect how tininess is "
+                   "detected for the device. Err = %d",
+                   error);
+        return error;
+    }
+
+    if ((error = clSetKernelArg(kernel, 2, sizeof(b), &b)))
+    {
+        vlog_error("Error: Unable to set kernel arg to detect how tininess is "
+                   "detected for the device. Err = %d",
+                   error);
+        return error;
+    }
+
     size_t dim = 1;
     if ((error = clEnqueueNDRangeKernel(gQueue, kernel, 1, NULL, &dim, NULL, 0,
                                         NULL, NULL)))
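
For readers wondering why these two operands were chosen, the host-only sketch below may help. It is not part of the commit or of the CTS: `TininessProbe` and `probeTininess` are hypothetical names introduced purely for illustration. It evaluates the same product in double precision, where it is exact, and shows that the value is below `FLT_MIN` before rounding to float but lands exactly on `FLT_MIN` after rounding. Hexadecimal float literals require C++17 (or a compiler extension).

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical helper type (not from the CTS): summarizes the kernel's product.
struct TininessProbe
{
    double exact;            // a * b computed without rounding error
    float rounded;           // exact product rounded to float
    bool tinyBeforeRounding; // |exact| is below the smallest normal float
    bool tinyAfterRounding;  // |rounded| is below the smallest normal float
};

// Hypothetical helper: reproduces the kernel's operands on the host.
inline TininessProbe probeTininess()
{
    const float a = 0x1.000002p-126f;
    const float b = 0x1.fffffcp-1f;

    // Both operands have 24-bit significands, so their product needs at most
    // 48 bits and is represented exactly in a double (53-bit significand).
    const double exact = static_cast<double>(a) * static_cast<double>(b);

    // Converting back to float rounds to nearest; the value sits just below
    // FLT_MIN and rounds up onto it.
    const float rounded = static_cast<float>(exact);

    return { exact, rounded, std::fabs(exact) < FLT_MIN,
             std::fabs(rounded) < FLT_MIN };
}
```

Calling `probeTininess()` reports the product as tiny before rounding but not after, which is the distinction the kernel is built to expose. How a given device reacts to that input is exactly what cannot be learned if the compiler folds the multiplication at build time.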