From b793218d6d80f2dcf0dd4ac3cca5e1acd62d827e Mon Sep 17 00:00:00 2001
From: tomasz-platek
Date: Thu, 4 Jul 2024 14:58:20 +0200
Subject: [PATCH 1/3] Publish the cl_img_swap_ops extension specification.

---
 extensions/cl_img_swap_ops.asciidoc | 122 ++++++++++++++++++++++++++++
 extensions/extensions.txt           |   2 +
 2 files changed, 124 insertions(+)
 create mode 100644 extensions/cl_img_swap_ops.asciidoc

diff --git a/extensions/cl_img_swap_ops.asciidoc b/extensions/cl_img_swap_ops.asciidoc
new file mode 100644
index 00000000..387fd64b
--- /dev/null
+++ b/extensions/cl_img_swap_ops.asciidoc
@@ -0,0 +1,122 @@
+:data-uri:
+:icons: font
+include::../config/attribs.txt[]
+:source-highlighter: coderay
+
+= cl_img_swap_ops
+
+== Name Strings
+
+`cl_img_swap_ops`
+
+== Contact
+
+Imagination Technologies Developer Forum: +
+https://forums.imgtec.com/
+
+Tomasz Platek, Imagination Technologies (Tomasz.Platek 'at' imgtec.com)
+
+== Contributors
+
+CY Cheng, Imagination Technologies. +
+Tomasz Platek, Imagination Technologies.
+
+== Notice
+
+Copyright (c) 2024 Imagination Technologies Ltd. All Rights Reserved.
+
+== Status
+
+Final Draft
+
+== Version
+
+Built On: {docdate} +
+Version: Major.Minor.Patch
+
+== Dependencies
+
+This extension is written against the OpenCL C Specification Version V3.0.16.
+
+== Overview
+
+This extension adds built-in functions that exercise hardware capabilities of Imagination GPU IP and expose cross work-items swap functions.
+
+== New OpenCL C Feature Names
+
+[source,c]
+----
+__opencl_img_swap
+----
+
+== New OpenCL C Functions
+
+Perform the swap operation:
+
+[source,c]
+----
+gentype img_swap_x(gentype value);
+gentype img_swap_y(gentype value);
+----
+
+== Modifications to the OpenCL C Specification
+
+(Add to Table 16 - Built-in Scalar and Vector Argument Common Functions in Section 6.15.4 - Common Functions) ::
++
+--
+[cols="1,2",options="header"]
+|====
+| Function | Description
+| gentype *img_swap_x*(gentype value)
+  a| `img_swap_x` swaps `values` between work-items in the following way: all work-items are divided into blocks of four consecutive elements. For each block:
+
+* In the first work-item, `img_swap_x` returns `value` passed as an argument in the second work-item.
+* In the second work-item, `img_swap_x` returns `value` passed as an argument in the first work-item.
+* In the third work-item, `img_swap_x` returns `value` passed as an argument in the fourth work-item.
+* In the fourth work-item, `img_swap_x` returns `value` passed as an argument in the third work-item.
+
+Requires that the `__opencl_img_swap` feature macro is defined.
+| gentype *img_swap_y*(gentype value)
+  a| `img_swap_y` swaps `values` between work-items in the following way: all work-items are divided into blocks of four consecutive elements. For each block:
+
+* In the first work-item, `img_swap_y` returns `value` passed as an argument in the third work-item.
+* In the third work-item, `img_swap_y` returns `value` passed as an argument in the first work-item.
+* In the second work-item, `img_swap_y` returns `value` passed as an argument in the fourth work-item.
+* In the fourth work-item, `img_swap_y` returns `value` passed as an argument in the second work-item.
+
+Requires that the `__opencl_img_swap` feature macro is defined.
+|====
+--
+
+== Coding Sample
+
+This coding sample shows how to use the *img_swap_x* function:
+[source]
+----
+__kernel void swap() {
+    int i = get_global_id(0);
+    int res = img_swap_x(i);
+
+    printf("id: %d, res = [ %d ]\n", i, res);
+}
+----
+
+Executing four work-items of this kernel in one work-group gives the following result:
+[source]
+----
+id: 0, res = [ 1 ]
+id: 1, res = [ 0 ]
+id: 2, res = [ 3 ]
+id: 3, res = [ 2 ]
+----
+
+== Version History
+
+[cols="5,15,15,70"]
+[grid="rows"]
+[options="header"]
+|====
+| Version | Date | Author | Changes
+| 1.0.0 | 2024-06-19 | Tomasz Platek | *Initial revision*
+|====
+
diff --git a/extensions/extensions.txt b/extensions/extensions.txt
index 573ec116..59a9e024 100644
--- a/extensions/extensions.txt
+++ b/extensions/extensions.txt
@@ -69,6 +69,8 @@ include::cl_img_generate_mipmap.asciidoc[]
 <<<
 include::cl_img_mem_properties.asciidoc[]
 <<<
+include::cl_img_swap_ops.asciidoc[]
+<<<
 include::cl_img_use_gralloc_ptr.asciidoc[]
 <<<
 include::cl_img_yuv_image.asciidoc[]

From db42ec79ba9490970b4e670c82aa81233c246308 Mon Sep 17 00:00:00 2001
From: tomasz-platek <165791413+tomasz-platek@users.noreply.github.com>
Date: Wed, 10 Jul 2024 09:56:53 +0200
Subject: [PATCH 2/3] Update extensions/cl_img_swap_ops.asciidoc

Listing the initial extension version.

Co-authored-by: Ben Ashbaugh
---
 extensions/cl_img_swap_ops.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/extensions/cl_img_swap_ops.asciidoc b/extensions/cl_img_swap_ops.asciidoc
index 387fd64b..2be11a0a 100644
--- a/extensions/cl_img_swap_ops.asciidoc
+++ b/extensions/cl_img_swap_ops.asciidoc
@@ -32,7 +32,7 @@ Final Draft
 == Version
 
 Built On: {docdate} +
-Version: Major.Minor.Patch
+Version: 1.0.0
 
 == Dependencies
 

From 8397e8551fb5670fd4dc7444d09869e8f8c56a50 Mon Sep 17 00:00:00 2001
From: tomasz-platek <165791413+tomasz-platek@users.noreply.github.com>
Date: Thu, 11 Jul 2024 10:55:12 +0200
Subject: [PATCH 3/3] Update cl_img_swap_ops.asciidoc

Defining behavior as undefined for cases when the number of work-items is not evenly divisible by four and if some work-items in the block of four are inactive, defining 1-dimensional local ID as a base for grouping work-items.
---
 extensions/cl_img_swap_ops.asciidoc | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/extensions/cl_img_swap_ops.asciidoc b/extensions/cl_img_swap_ops.asciidoc
index 2be11a0a..ea957802 100644
--- a/extensions/cl_img_swap_ops.asciidoc
+++ b/extensions/cl_img_swap_ops.asciidoc
@@ -75,6 +75,12 @@ gentype img_swap_y(gentype value);
 * In the third work-item, `img_swap_x` returns `value` passed as an argument in the fourth work-item.
 * In the fourth work-item, `img_swap_x` returns `value` passed as an argument in the third work-item.
 
+The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`).
+
+The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined.
+
+The function must be called in all four work-items of the block; otherwise, the behaviour is undefined.
+
 Requires that the `__opencl_img_swap` feature macro is defined.
 | gentype *img_swap_y*(gentype value)
   a| `img_swap_y` swaps `values` between work-items in the following way: all work-items are divided into blocks of four consecutive elements. For each block:
@@ -84,6 +90,12 @@ Requires that the `__opencl_img_swap` feature macro is defined.
 * In the second work-item, `img_swap_y` returns `value` passed as an argument in the fourth work-item.
 * In the fourth work-item, `img_swap_y` returns `value` passed as an argument in the second work-item.
 
+The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`).
+
+The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined.
+
+The function must be called in all four work-items of the block; otherwise, the behaviour is undefined.
+
 Requires that the `__opencl_img_swap` feature macro is defined.
 |====
 --