Publish the cl_img_swap_ops extension specification. (#1201)
* Publish the cl_img_swap_ops extension specification.

* Update extensions/cl_img_swap_ops.asciidoc

Listing the initial extension version.

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* Update cl_img_swap_ops.asciidoc

Define the behavior as undefined when the number of work-items is not evenly divisible by four or when some work-items in a block of four are inactive. Define the 1-dimensional local ID as the basis for grouping work-items.

---------

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>
tomasz-platek and bashbaug committed Aug 8, 2024
1 parent b648551 commit b6f65dd
Showing 2 changed files with 136 additions and 0 deletions.
134 changes: 134 additions & 0 deletions extensions/cl_img_swap_ops.asciidoc
@@ -0,0 +1,134 @@
:data-uri:
:icons: font
include::../config/attribs.txt[]
:source-highlighter: coderay

= cl_img_swap_ops

== Name Strings

`cl_img_swap_ops`

== Contact

Imagination Technologies Developer Forum: +
https://forums.imgtec.com/

Tomasz Platek, Imagination Technologies (Tomasz.Platek 'at' imgtec.com)

== Contributors

CY Cheng, Imagination Technologies. +
Tomasz Platek, Imagination Technologies.

== Notice

Copyright (c) 2024 Imagination Technologies Ltd. All Rights Reserved.

== Status

Final Draft

== Version

Built On: {docdate} +
Version: 1.0.0

== Dependencies

This extension is written against the OpenCL C Specification Version 3.0.16.

== Overview

This extension adds built-in functions that use hardware capabilities of Imagination GPU IP to swap values between work-items.

== New OpenCL C Feature Names

[source,c]
----
__opencl_img_swap
----

== New OpenCL C Functions

Perform the swap operation:

[source,c]
----
gentype img_swap_x(gentype value);
gentype img_swap_y(gentype value);
----

== Modifications to the OpenCL C Specification

(Add to Table 16 - Built-in Scalar and Vector Argument Common Functions in Section 6.15.4 - Common Functions) ::
+
--
[cols="1,2",options="header"]
|====
| Function | Description
| gentype *img_swap_x*(gentype value)
a| `img_swap_x` swaps `value` between work-items in the following way: the work-items of a work-group are divided into blocks of four consecutive work-items. For each block:

* In the first work-item, `img_swap_x` returns `value` passed as an argument in the second work-item.
* In the second work-item, `img_swap_x` returns `value` passed as an argument in the first work-item.
* In the third work-item, `img_swap_x` returns `value` passed as an argument in the fourth work-item.
* In the fourth work-item, `img_swap_x` returns `value` passed as an argument in the third work-item.

The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`).

The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined.

The function must be called in all four work-items of the block; otherwise, the behaviour is undefined.

Requires that the `__opencl_img_swap` feature macro is defined.
| gentype *img_swap_y*(gentype value)
a| `img_swap_y` swaps `value` between work-items in the following way: the work-items of a work-group are divided into blocks of four consecutive work-items. For each block:

* In the first work-item, `img_swap_y` returns `value` passed as an argument in the third work-item.
* In the third work-item, `img_swap_y` returns `value` passed as an argument in the first work-item.
* In the second work-item, `img_swap_y` returns `value` passed as an argument in the fourth work-item.
* In the fourth work-item, `img_swap_y` returns `value` passed as an argument in the second work-item.

The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`).

The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined.

The function must be called in all four work-items of the block; otherwise, the behaviour is undefined.

Requires that the `__opencl_img_swap` feature macro is defined.
|====
--
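
The pairing rules in the table can be summarized as two bit operations on the 1-dimensional local ID. The following plain C sketch is a host-side reference model and is not part of the extension. The helper names `img_swap_x_partner`, `img_swap_y_partner` and `img_swap_group_size_ok` are hypothetical, and the sketch assumes that blocks start at local linear ID 0, so that a work-item's position within its block is its ID modulo four.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when the work-group size is a multiple
 * of four, the only case in which the swap functions are defined. */
static int img_swap_group_size_ok(size_t lx, size_t ly, size_t lz)
{
    return (lx * ly * lz) % 4 == 0;
}

/* Hypothetical helper: local linear ID of the work-item whose argument
 * img_swap_x returns. Positions 0<->1 and 2<->3 of a block differ only
 * in bit 0, so the partner is the ID with bit 0 flipped. */
static size_t img_swap_x_partner(size_t id, size_t group_size)
{
    /* The model stops here in the cases the extension leaves undefined. */
    assert(group_size % 4 == 0 && id < group_size);
    return id ^ (size_t)1;
}

/* Same for img_swap_y: positions 0<->2 and 1<->3 differ only in bit 1. */
static size_t img_swap_y_partner(size_t id, size_t group_size)
{
    assert(group_size % 4 == 0 && id < group_size);
    return id ^ (size_t)2;
}
```

Under these assumptions, in a work-group of eight work-items the work-item with local linear ID 6 (third in the second block) receives the value of ID 7 from `img_swap_x` and the value of ID 4 from `img_swap_y`.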

== Coding Sample

This coding sample shows how to use the *img_swap_x* function:
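[source,c]
----
__kernel void swap() {
    // Each work-item passes its own ID as the value to swap.
    int i = get_global_id(0);
    // Receive the value passed by the paired work-item in the block of four.
    int res = img_swap_x(i);
    printf("id: %d, res = [ %d ]\n", i, res);
}
----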

Executing four work-items of this kernel in one work-group gives the following result:
[source]
----
id: 0, res = [ 1 ]
id: 1, res = [ 0 ]
id: 2, res = [ 3 ]
id: 3, res = [ 2 ]
----
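
The result above can be reproduced without the hardware by a small host-side emulation in plain C. This is only a sketch of the documented semantics: `emulate_img_swap_x` and `emulate_img_swap_y` are hypothetical names, each array element stands for the `value` argument of one work-item indexed by its local linear ID, and blocks are assumed to start at ID 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side emulation, not part of the extension.
 * in[i] is the value passed by the work-item with local linear ID i and
 * out[i] is the value img_swap_x would return to it. Returns 0 and writes
 * nothing when n is not a multiple of four, because the extension leaves
 * that case undefined. */
static int emulate_img_swap_x(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    if (n % 4 != 0)
        return 0;
    for (size_t i = 0; i < n; i += 4) {
        out[i]     = in[i + 1]; /* first  <- second */
        out[i + 1] = in[i];     /* second <- first  */
        out[i + 2] = in[i + 3]; /* third  <- fourth */
        out[i + 3] = in[i + 2]; /* fourth <- third  */
    }
    return 1;
}

/* Same emulation for img_swap_y. */
static int emulate_img_swap_y(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    if (n % 4 != 0)
        return 0;
    for (size_t i = 0; i < n; i += 4) {
        out[i]     = in[i + 2]; /* first  <- third  */
        out[i + 2] = in[i];     /* third  <- first  */
        out[i + 1] = in[i + 3]; /* second <- fourth */
        out[i + 3] = in[i + 1]; /* fourth <- second */
    }
    return 1;
}
```

With the inputs `{0, 1, 2, 3}`, `emulate_img_swap_x` fills the output with `{1, 0, 3, 2}`, which matches the printed result, and `emulate_img_swap_y` fills it with `{2, 3, 0, 1}`.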

== Version History

[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|====
| Version | Date | Author | Changes
| 1.0.0 | 2024-06-19 | Tomasz Platek | *Initial revision*
|====

2 changes: 2 additions & 0 deletions extensions/extensions.txt
@@ -73,6 +73,8 @@ include::cl_img_matrix_multiply.asciidoc[]
<<<
include::cl_img_mem_properties.asciidoc[]
<<<
include::cl_img_swap_ops.asciidoc[]
<<<
include::cl_img_use_gralloc_ptr.asciidoc[]
<<<
include::cl_img_yuv_image.asciidoc[]