Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

correctly rounded divide test for half is not using a correctly rounded reference #1996

Open
b-sumner opened this issue Jul 1, 2024 · 9 comments
Labels
mobica-backlog Issue approved by WG for Mobica to work on

Comments

@b-sumner
Copy link
Contributor

b-sumner commented Jul 1, 2024

The new correctly rounded divide test for half precision, located in binary_operator_half.cpp is using an fptr for its reference function and computing the reference like this:

    s[j] = HTF(p[j]);
    s2[j] = HTF(p2[j]);
    r[j] = HFF(func.f_ff(s[j], s2[j]));

Here func.f_ff works out to reference_divide(). So r[j] starts with the double precision rounded result of the divide, rounds it to single precision and then rounds that to half. That's 3 roundings instead of the required single rounding.

Shouldn't this test be disabled until a correct reference is used?

@svenvh
Copy link
Member

svenvh commented Jul 2, 2024

The new correctly rounded divide test for half precision

Which reminds me, we should probably disable the correctly rounded tests for fp16 and fp64; see #1901 .

However, for fp16 I believe the divide_cr and divide tests are identical, since fp16 division must be correctly rounded (at least for the full profile). So if I'm not mistaken this issue also affects the regular fp16 divide test?

svenvh added a commit to svenvh/OpenCL-CTS that referenced this issue Jul 2, 2024
Skip the correctly rounded divide (divide_cr) and sqrt (sqrt_cr)
tests for fp16 and fp64.

The corresponding build option to enable correctly rounded divide and
sqrt is named `-cl-fp32-correctly-rounded-divide-sqrt` and the
description refers only to "single precision floating-point", so this
option should not apply to fp16 or fp64.

The specification states that fp16 and fp64 divide and sqrt must be
correctly rounded for the full profile, without needing any additional
build options.  This is already tested by the regular divide and sqrt
tests.  For the embedded profile the ULP requirement is non-zero, but
there is no build option to request a correctly rounded implementation
anyway.

Fixes KhronosGroup#1901 .
Relates to KhronosGroup#1996 .

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
@rjodinchr
Copy link
Contributor

rjodinchr commented Jul 3, 2024

When running the new fp16 divide test I get this kind of error:

ERROR: divide: 0.500000 ulp error at {0x1.428p-13 (0x090a), -0x1.ep+6 (0xd780)}
Expected: -0x1.6p-20  (half 0x8016)
Actual: -0x1.5p-20 (half 0x8015) at index: 61714

As the specification states that fp16 precision for divide is <= 1 ulp, I'm not sure if this is a real error, or a bug somewhere in the CTS.

@rjodinchr
Copy link
Contributor

rjodinchr commented Jul 3, 2024

Alright, it feels like the the half_ulp is set to 0.0f here. I'll open another issue and will make a PR for it.
My mistake, I was looking at the embedded profile. For the full profile, the division needs to be correctly rounded.

@svenvh
Copy link
Member

svenvh commented Jul 3, 2024

Alright, it feels like the the half_ulp is set to 0.0f here. I'll open another issue and will make a PR for it. My mistake, I was looking at the embedded profile. For the full profile, the division needs to be correctly rounded.

For the embedded profile we need to account for the <= 1 ulp, so we'll still need a separate issue I think (edit: we actually have #1685 for this already).

@svenvh
Copy link
Member

svenvh commented Jul 10, 2024

@b-sumner are you seeing any failures related to the excess roundings? If so, could you share some of the problematic input values?

@jlewis-austin
Copy link
Contributor

I haven't studied it closely, but this suggests some theorems for safe double rounding that could be helpful: https://dl.acm.org/doi/abs/10.1145/221332.221334

svenvh added a commit to svenvh/OpenCL-CTS that referenced this issue Jul 30, 2024
Skip the correctly rounded divide (divide_cr) and sqrt (sqrt_cr)
tests for fp16 and fp64.

The corresponding build option to enable correctly rounded divide and
sqrt is named `-cl-fp32-correctly-rounded-divide-sqrt` and the
description refers only to "single precision floating-point", so this
option should not apply to fp16 or fp64.

The specification states that fp16 and fp64 divide and sqrt must be
correctly rounded for the full profile, without needing any additional
build options.  This is already tested by the regular divide and sqrt
tests.  For the embedded profile the ULP requirement is non-zero, but
there is no build option to request a correctly rounded implementation
anyway.

Fixes KhronosGroup#1901 .
Relates to KhronosGroup#1996 .

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
@b-sumner
Copy link
Contributor Author

b-sumner commented Aug 2, 2024

I thought I saw another patch about dropping the correctly rounded tests for non-fp32 floating point types, so perhaps this should be closed.

But while I'm here I want to ask a related question about the max ulp error for half division. What I see in function_list.cpp is:

412 { "divide",
413 "/",
414 { (void*)reference_divide },
415 { (void*)reference_dividel },
416 { (void*)reference_relaxed_divide },
417 2.5f,
418 0.0f,
419 0.0f,
420 3.0f,
421 2.5f,
422 INFINITY,
423 FTZ_OFF,
424 RELAXED_ON,
425 binaryOperatorF },

Line 419 sets the half_ulps to 0.0. I would like to know where in the spec this requirement of correct rounding for half division appears or if the specified error is higher but simply not properly reflected in the test.

@svenvh
Copy link
Member

svenvh commented Aug 2, 2024

I would like to know where in the spec this requirement of correct rounding for half division appears

Table 69 in https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#relative-error-as-ulps

@b-sumner
Copy link
Contributor Author

b-sumner commented Aug 2, 2024

Thank you.

bashbaug pushed a commit that referenced this issue Aug 6, 2024
…1997)

Skip the correctly rounded divide (divide_cr) and sqrt (sqrt_cr) tests
for fp16 and fp64.

The corresponding build option to enable correctly rounded divide and
sqrt is named `-cl-fp32-correctly-rounded-divide-sqrt` and the
description refers only to "single precision floating-point", so this
option should not apply to fp16 or fp64.

The specification states that fp16 and fp64 divide and sqrt must be
correctly rounded for the full profile, without needing any additional
build options. This is already tested by the regular divide and sqrt
tests. For the embedded profile the ULP requirement is non-zero, but
there is no build option to request a correctly rounded implementation
anyway.

Fixes #1901 .
Relates to #1996 .

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
@kpet kpet added the mobica-backlog Issue approved by WG for Mobica to work on label Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mobica-backlog Issue approved by WG for Mobica to work on
Projects
None yet
Development

No branches or pull requests

5 participants