-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Publish the cl_img_swap_ops extension specification. #1201
Merged
Merged
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
:data-uri: | ||
:icons: font | ||
include::../config/attribs.txt[] | ||
:source-highlighter: coderay | ||
|
||
= cl_img_swap_ops | ||
|
||
== Name Strings | ||
|
||
`cl_img_swap_ops` | ||
|
||
== Contact | ||
|
||
Imagination Technologies Developer Forum: + | ||
https://forums.imgtec.com/ | ||
|
||
Tomasz Platek, Imagination Technologies (Tomasz.Platek 'at' imgtec.com) | ||
|
||
== Contributors | ||
|
||
CY Cheng, Imagination Technologies. + | ||
Tomasz Platek, Imagination Technologies. | ||
|
||
== Notice | ||
|
||
Copyright (c) 2024 Imagination Technologies Ltd. All Rights Reserved. | ||
|
||
== Status | ||
|
||
Final Draft | ||
|
||
== Version | ||
|
||
Built On: {docdate} + | ||
Version: 1.0.0 | ||
|
||
== Dependencies | ||
|
||
This extension is written against the OpenCL C Specification Version V3.0.16. | ||
|
||
== Overview | ||
|
||
This extension adds built-in functions that exercise hardware capabilities of Imagination GPU IP and expose cross work-items swap functions. | ||
|
||
== New OpenCL C Feature Names | ||
|
||
[source,c] | ||
---- | ||
__opencl_img_swap | ||
---- | ||
|
||
== New OpenCL C Functions | ||
|
||
Perform the swap operation: | ||
|
||
[source,c] | ||
---- | ||
gentype img_swap_x(gentype value); | ||
gentype img_swap_y(gentype value); | ||
---- | ||
|
||
== Modifications to the OpenCL C Specification | ||
|
||
(Add to Table 16 - Built-in Scalar and Vector Argument Common Functions in Section 6.15.4 - Common Functions) :: | ||
+ | ||
-- | ||
[cols="1,2",options="header"] | ||
|==== | ||
| Function | Description | ||
| gentype *img_swap_x*(gentype value) | ||
a| `img_swap_x` swaps `values` between work-items in the following way: all work-items are divided into blocks of four consecutive elements. For each block: | ||
|
||
* In the first work-item, `img_swap_x` returns `value` passed as an argument in the second work-item. | ||
* In the second work-item, `img_swap_x` returns `value` passed as an argument in the first work-item. | ||
* In the third work-item, `img_swap_x` returns `value` passed as an argument in the fourth work-item. | ||
* In the fourth work-item, `img_swap_x` returns `value` passed as an argument in the third work-item. | ||
|
||
The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`). | ||
|
||
The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined. | ||
|
||
The function must be called in all four work-items of the block; otherwise, the behaviour is undefined. | ||
|
||
Requires that the `__opencl_img_swap` feature macro is defined. | ||
| gentype *img_swap_y*(gentype value) | ||
a| `img_swap_y` swaps `values` between work-items in the following way: all work-items are divided into blocks of four consecutive elements. For each block: | ||
|
||
* In the first work-item, `img_swap_y` returns `value` passed as an argument in the third work-item. | ||
* In the third work-item, `img_swap_y` returns `value` passed as an argument in the first work-item. | ||
* In the second work-item, `img_swap_y` returns `value` passed as an argument in the fourth work-item. | ||
* In the fourth work-item, `img_swap_y` returns `value` passed as an argument in the second work-item. | ||
|
||
The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`). | ||
|
||
The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined. | ||
|
||
The function must be called in all four work-items of the block; otherwise, the behaviour is undefined. | ||
|
||
Requires that the `__opencl_img_swap` feature macro is defined. | ||
|==== | ||
-- | ||
|
||
== Coding Sample | ||
|
||
This coding sample shows how to use the *img_swap_x* function: | ||
[source] | ||
---- | ||
__kernel void swap() { | ||
int i = get_global_id(0); | ||
int res = img_swap_x(i); | ||
|
||
printf("id: %d, res = [ %d ]\n", i, res); | ||
} | ||
---- | ||
|
||
Executing four work-items of this kernel in one work-group gives the following result: | ||
[source] | ||
---- | ||
id: 0, res = [ 1 ] | ||
id: 1, res = [ 0 ] | ||
id: 2, res = [ 3 ] | ||
id: 3, res = [ 2 ] | ||
---- | ||
|
||
== Version History | ||
|
||
[cols="5,15,15,70"] | ||
[grid="rows"] | ||
[options="header"] | ||
|==== | ||
| Version | Date | Author | Changes | ||
| 1.0.0 | 2024-06-19 | Tomasz Platek | *Initial revision* | ||
|==== | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider documenting the behavior if the number of work-items is not evenly divisible by four, or if some work-items in the block of four are inactive (due to flow control).
Also consider documenting what "consecutive" means, especially for multi-dimensional work-groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defined the behavior as undefined for these cases as I don't see any reasonable way of handling them.