Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added cl_khr_fp16 extension support for test_decorate from spirv_new #1770

Merged
merged 12 commits into from
Feb 13, 2024

Conversation

shajder
Copy link
Contributor

@shajder shajder commented Jun 19, 2023

According to issue request #142

-changed naming convention of saturation .spvasm files related to
test_decorate of spirv_new
-restored float to char/uchar saturation tests
-few minor corrections
@shajder shajder marked this pull request as ready for review June 21, 2023 08:56
@shajder
Copy link
Contributor Author

shajder commented Jun 23, 2023

Close/Open to trigger CI

@shajder shajder closed this Jun 23, 2023
@shajder shajder reopened this Jun 23, 2023
\
for (int i = 0; i < num; i++) \
{ \
in[i] = num * genrand<clTi>(seed) - num / 2; \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility with the above code we could overflow half float with the rounding but the float result we calculate would not overflow, so the results don't really match? Do we need to constrain the inputs more?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to comment on this points before I approve?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at this yesterday and it's actually a little worse than I expected: genrand<cl_half>(seed) returns a short, not a floating-point value between zero and one, so almost all of these values overflow and we really only test one value. This needs to be fixed.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes look OK and they can be merged in this state. I've run them on a few of our devices and the new tests are passing.

Recording a few nice-to-have fixes, none of which should hold up merging this PR, more things we could consider doing in the future:

  • Implementation-wise think a lot of the checks for std::is_same<Ti, cl_half>::value could be removed with some clever template speciailizations and the code would likely be easier to read and understand as well.
  • Testing-wise there are only 64K halfs, so we could test all of them exhaustively vs. relying on random sampling.

@bashbaug
Copy link
Contributor

@coldav would you like to give this one more review before we merge these changes?

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above - we need to fix the input data so it is generated properly.

bashbaug
bashbaug previously approved these changes Oct 21, 2023
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've fixed these now - please take a look. My review approval isn't sufficient anymore since I've made some of the changes.

A few things I didn't do:

  • We could be a bit more clever with std::enable_if to remove the runtime checks for std::is_same, but I've left the runtime checks for now.
  • I've kept the random sampling vs. trying to test every half exhaustively.

isnan is currently implemented as a macro, not as a function, so
we can't use std::isnan.
@shajder
Copy link
Contributor Author

shajder commented Oct 21, 2023

  • We could be a bit more clever with std::enable_if to remove the runtime checks for std::is_same, but I've left the runtime checks for now.

@bashbaug it looks to me that std::is_same will be unrolled in compile time without conditional code (at least with modern compilers and some basic optimizations enabled). Am I missing something here ?

@bashbaug
Copy link
Contributor

it looks to me that std::is_same will be unrolled in compile time without conditional code (at least with modern compilers and some basic optimizations enabled).

Ah, good point! Yes, the "runtime" checks should be just fine then.

@Nuullll
Copy link
Contributor

Nuullll commented Dec 20, 2023

Verified with Intel OpenCL CPU implementation. New tests are passing!

Tl val = genrand<Tl>(seed) % hiVal;
if (std::is_same<cl_half, Ti>::value)
{
if (val > 0 && val * 20 < hiVal)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add some comments to explain the * 20 please?

@coldav
Copy link

coldav commented Jan 10, 2024

The test changes in general look good to me, but we see some failures locally, so I'd like to have a few days to check if it relates to the test.

template <typename T> T genrandReal_range(T low, T high, RandomSeed &seed)
{
T t = genrand_real1(seed);
return (1.0 - t) * low + t * high;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Floating point math is really difficult. If this had been t * low + (1.0 - t) * high, I know this could easily have resulted in random values outside of [low, high]. For (1.0 - t) * low + t * high, as long as low < high, I have not seen any values for which that can happen, but if this truly cannot happen, can you add a comment explaining why?

But I am actually not sure a uniform distribution is what we want here. Suppose we have a range of [1, 4]. There are as many floating point values in [1, 2) as there are in [2, 4). This means individual problematic cases in [1, 2) only have half the probability of being tested of those in [2, 4). At the same time, if there is a whole range of values that is problematic, you do want a uniform distribution. So I do not know what is best here.

@coldav
Copy link

coldav commented Jan 15, 2024

The test changes in general look good to me, but we see some failures locally, so I'd like to have a few days to check if it relates to the test.

I think we are happy with it if you want to merge apart from the clang compile issue mentioned by Harald.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the proposed fix for the Clang warning - see discussion above.

test_conformance/spirv_new/test_decorate.cpp Outdated Show resolved Hide resolved
@bashbaug
Copy link
Contributor

@coldav @hvdijk if either of you can approve this PR we can probably merge it (finally) in tomorrow's teleconference. Thanks!

@coldav
Copy link

coldav commented Feb 13, 2024

@hvdijk is away this week, but I've approved it.

Copy link
Contributor

@alycm alycm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing the GitHub approval on @coldav's behalf.

@bashbaug
Copy link
Contributor

Merging as discussed in the February 13th teleconference.

@bashbaug bashbaug merged commit a4b5a30 into KhronosGroup:main Feb 13, 2024
7 checks passed
bashbaug added a commit that referenced this pull request Mar 3, 2024
* allocations: Move results array from stack to heap (#1857)

* allocations: Fix stack overflow

* check format fixes

* Fix windows stack overflow. (#1839)

* thread_dimensions: Avoid combinations of very small LWS and very large GWS (#1856)

Modify the existing condition to include extremely small LWS like
1x1 on large GWS values

* c11_atomics: Reduce the loopcounter for sequential consistency tests (#1853)

Reduce the loop from 1000000 to 500000 since the former value
makes the test run too long and cause system issues on certain
platforms

* Limit individual allocation size using the global memory size (#1835)

Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>

* geometrics: fix Wsign-compare warnings (#1855)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* integer_ops: fix -Wformat warnings (#1860)

The main sources of warnings were:

 * Printing of a `size_t` which requires the `%zu` specifier.

 * Printing of `cl_long`/`cl_ulong` which is now done using the
   `PRI*64` macros to ensure portability across 32 and 64-bit builds.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Replace OBSOLETE_FORAMT with OBSOLETE_FORMAT (#1776)

* Replace OBSOLETE_FORAMT with OBSOLETE_FORMAT

In imageHelpers.cpp and few other places in image tests, OBSOLETE_FORMAT is misspelled as OBSOLETE_FORAMT.
Fix misspelling by replcaing it with OBSOLETE_FORMAT.

Fixes #1769

* Remove code guarded by OBSOLETE_FORMAT

Remove code guarded by OBSOLETE_FORMAT
as suggested by review comments

Fixes #1769

* Fix formating issues for OBSOLETE_FORMAT changes

Fix formatting issues observed in files while removing
code guarded by OBSOLETE_FORMAT

Fixes #1769

* Some more formatting fixes

Some more formatting fixes to get CI clean

Fixes #1769

* Final Formating fixes

Final formatting fixes for #1769

* Enhancement: Thread dimensions user parameters (#1384)

* Fix format in the test scope

* Add user params to limit testing

Add parameters to reduce amount of testing.
Helpful for debugging or for machines with lower performance.

* Restore default value

* Print info only if testing params bigger than 0.

* [NFC] conversions: reenable Wunused-but-set-variable (#1845)

Remove an assigned-to but unused variable.

Reenable the Wunused-but-set-variable warning for the conversions
suite, as it now compiles cleanly with this warning enabled.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Fix bug of conversion from long to double (#1847)

* Fix bug of conversion from long to double

It the input is long type, it should be load as long type, not ulong.

* update long2float

* math_brute_force: fix exp/exp2 rlx ULP calculation (#1848)

Fix the ULP error calculation for the `exp` and `exp2` builtins in
relaxed math mode for the full profile.

Previously, the `ulps` value kept being added to while verifying the
result buffer in a loop.  `ulps` could even become a `NaN` when the
input argument being tested was a `NaN`.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Enable LARGEADDRESSAWARE for 32 bit compilation (#1858)

* Enable LARGEADDRESSAWARE for 32 bit compilation

32-bit executables built with MSVC linker have only 2GB virtual memory
address space by default, which might not be sufficient for some tests.

Enable LARGEADDRESSAWARE linker flag for 32-bit targets to allow tests
to handle addresses larger than 2 gigabytes.

https://learn.microsoft.com/en-us/cpp/build/reference/largeaddressaware-handle-large-addresses?view=msvc-170

Signed-off-by: Guo, Yilong <yilong.guo@intel.com>

* Apply suggestion

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

---------

Signed-off-by: Guo, Yilong <yilong.guo@intel.com>
Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* fix return code when readwrite image is not supported (#1873)

This function (do_test) starts by testing write and read individually.
Both of them can have errors.

When readwrite image is not supported, the function returns
TEST_SKIPPED_ITSELF potentially masking errors leading to the test
returning EXIT_SUCCESS even with errors along the way.

* fix macos builds by avoiding double compilation of function_list.cpp for test_spir (#1866)

* modernize CMakeLists for test_spir

* add the operating system release to the sccache key

* include the math brute force function list vs. building it twice

* fix the license header on the spirv-new tests (#1865)

The source files for the spirv-new tests were using the older Khronos
license instead of the proper Apache license.  Fixed the license in
all source files.

* compiler: fix grammar in error message (#1877)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Updated semaphore tests to use clSemaphoreReImportSyncFdKHR. (#1854)

* Updated semaphore tests to use clSemaphoreReImportSyncFdKHR.

Additionally updated common semaphore code to handle spec updates
that restrict simultaneous importing/exporting of handles.

* Fix build issues on CI

* gcc build issues

* Make clReImportSemaphoreSyncFdKHR a required API
call if cl_khr_external_semaphore_sync_fd is present.

* Implement signal and wait for all semaphore types.

* subgroups: fix for testing too large WG sizes (#1620)

It seemed to be a typo; the comment says that it
tries to fetch local size for a subgroup count with
above max WG size, but it just used the previous
subgroup count.

The test on purpose sets a SG count to be a larger
number than the max work-items in the work group.
Given the minimum SG size is 1 WI, it means that there
can be a maximum of maximum work-group size of SGs (of
1 WI of size). Thus, if we request a number of SGs that
exceeds the local size, the query should fail as expected.

* add SPIR-V version testing (#1861)

* basic SPIR-V 1.3 testing support

* updated script to compile for more SPIR-V versions

* switch to general SPIR-V versions test

* update copyright text and fix license

* improve output while test is running

* check for higher SPIR-V versions first

* fix formatting

* fix the reported platform information for math brute force (#1884)

When the math brute force test printed the platform version it always
printed information for the first platform in the system, which could
be different than the platform for the passed-in device.  Fixed by
querying the platform from the passed-in device instead.

* api tests fix: Use MTdataHolder in test_get_image_info (#1871)

* Minor fixes in mutable dispatch tests. (#1829)

* Minor fixes in mutable dispatch tests.

* Fix size of newWrapper in MutableDispatchSVMArguments.
* Fix errnoneus clCommandNDRangeKernelKHR call.

Signed-off-by: John Kesapides <john.kesapides@arm.com>

* * Set the row_pitch for imageInfo in MutableDispatchImage1DArguments
and MutableDispatchImage2DArguments. The row_pitch is
used by get_image_size() to calculate the size of
the host pointers by generate_random_image_data.

Signed-off-by: John Kesapides <john.kesapides@arm.com>

---------

Signed-off-by: John Kesapides <john.kesapides@arm.com>

* add test for cl_khr_spirv_linkonce_odr (#1226)

* initial version of the test with placeholders for linkonce_odr linkage

* add OpExtension SPV_KHR_linkonce_odr extension

* add check for extension

* switch to actual LinkOnceODR linkage

* fix formatting

* add a test case to ensure a function with linkonce_odr is exported

* add back the extension check

* fix formatting

* undo compiler optimization and actually add the call to function a

* [NFC] subgroups: remove unnecessary extern keywords (#1892)

In C and C++ all functions have external linkage by default.

Also remove the unused `gMTdata` and `test_pipe_functions`
declarations.

Fixes #1137

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Added cl_khr_fp16 extension support for test_decorate from spirv_new (#1770)

* Added cl_khr_fp16 extension support for test_decorate from spirv_new, work in progres

* Complemented test_decorate saturation test to support cl_khr_fp16 extension (issue #142)

* Fixed clang format

* scope of modifications:

-changed naming convention of saturation .spvasm files related to
test_decorate of spirv_new
-restored float to char/uchar saturation tests
-few minor corrections

* fix ranges for half testing

* fix formating

* one more formatting fix

* remove unused function

* use isnan instead of std::isnan

isnan is currently implemented as a macro, not as a function, so
we can't use std::isnan.

* fix Clang warning about inexact conversion

---------

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* add support for custom devices (#1891)

enable the CTS to run on custom devices

---------

Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Guo, Yilong <yilong.guo@intel.com>
Signed-off-by: John Kesapides <john.kesapides@arm.com>
Co-authored-by: Sreelakshmi Haridas Maruthur <sharidas@quicinc.com>
Co-authored-by: Haonan Yang <haonan.yang@intel.com>
Co-authored-by: Ahmed Hesham <117350656+ahesham-arm@users.noreply.github.com>
Co-authored-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Co-authored-by: niranjanjoshi121 <43807392+niranjanjoshi121@users.noreply.github.com>
Co-authored-by: Grzegorz Wawiorko <grzegorz.wawiorko@intel.com>
Co-authored-by: Wenwan Xing <wenwan.xing@intel.com>
Co-authored-by: Yilong Guo <yilong.guo@intel.com>
Co-authored-by: Romaric Jodin <89833130+rjodinchr@users.noreply.github.com>
Co-authored-by: joshqti <127994991+joshqti@users.noreply.github.com>
Co-authored-by: Pekka Jääskeläinen <pekka.jaaskelainen@tuni.fi>
Co-authored-by: imilenkovic00 <155085410+imilenkovic00@users.noreply.github.com>
Co-authored-by: John Kesapides <46718829+JohnKesapidesARM@users.noreply.github.com>
Co-authored-by: Marcin Hajder <marcin.hajder@gmail.com>
Co-authored-by: Aharon Abramson <aharon.abramson@mobileye.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants