Skip to content

Commit

Permalink
Publish the cl_img_memory_management extension specification. (#1202)
Browse files Browse the repository at this point in the history
* Publish the cl_img_memory_management extension specification.

* Update extensions/cl_img_memory_management.asciidoc

Listing the initial extension version.

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* Update cl_img_memory_management.asciidoc

Status changed to Final Draft

* Update cl_img_memory_management.asciidoc

Fix typo (unnecessary "new_alloc" in the enum item name).

---------

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>
  • Loading branch information
tomasz-platek and bashbaug committed Aug 8, 2024
1 parent b6f65dd commit aca3750
Show file tree
Hide file tree
Showing 2 changed files with 249 additions and 0 deletions.
247 changes: 247 additions & 0 deletions extensions/cl_img_memory_management.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,247 @@
:data-uri:
:icons: font
include::../config/attribs.txt[]
:source-highlighter: coderay

= cl_img_memory_management

== Name Strings

`cl_img_memory_management`

== Contact

Imagination Technologies Developer Forum: +
https://forums.imgtec.com/

Tomasz Platek, Imagination Technologies (Tomasz.Platek 'at' imgtec.com)

== Contributors

CY Cheng, Imagination Technologies. +
Tomasz Platek, Imagination Technologies.

== Notice

Copyright (c) 2024 Imagination Technologies Ltd. All Rights Reserved.

== Status

Final Draft

== Version

Built On: {docdate} +
Version: 1.0.0

== Dependencies

This extension is written against the OpenCL C Specification Version V3.0.16.

== Overview

This extension adds built-in functions that expose the low-level memory and cache control instructions of Imagination GPU IP that are not accessible by standard OpenCL C functions.

== New OpenCL C Feature Names

[source,c]
----
__opencl_img_fence
__opencl_img_cache
__opencl_img_load_store
----

== New OpenCL C Functions

Issues a data fence:

[source,c]
----
void img_fence(cache_target_img target);
----

Perform the cache flush/invalidate operation:

[source,c]
----
void img_cache_flush(cache_target_img target);
void img_cache_invalidate(cache_target_img target);
void img_cache_flush_invalidate(cache_target_img target);
----

Load to/store from memory:

[source,c]
----
gentype img_load(gentype *p, cache_coherence_img coherence, L2_cache_policy_img policy, cache_persistence_level_img persistence, bool volatile);
gentype img_load(const gentype *p, cache_coherence_img coherence, L2_cache_policy_img policy, cache_persistence_level_img persistence, bool volatile);
void img_store(gentype *p, gentype value, cache_coherence_img coherence, L2_cache_policy_img policy, cache_persistence_level_img persistence, bool volatile);
----

== Modifications to the OpenCL C Specification

(Add to Table 4 - Other Built-in Data Types in Section 6.3.3. Other Built-in Data Types) ::
+
[cols=",",options="header",]
|====
| Type | Description
| `cache_target_img`
| Target of the cache control functions. Refer to the Low-level Memory and Cache Control Functions section for a detailed description
of the built-in functions that use this enum.

Requires that the `__opencl_img_fence` or the `__opencl_img_cache` feature macro is defined.
| `cache_coherence_img`
| Level of cache coherence. Refer to the Low-level Memory and Cache Control Functions section for a detailed description
of the built-in functions that use this enum.

Requires that the `__opencl_img_load_store` feature macro is defined.
| `L2_cache_policy_img`
| Cache policy for the L2 cache. Refer to the Low-level Memory and Cache Control Functions section for a detailed description
of the built-in functions that use this enum.

Requires that the `__opencl_img_load_store` feature macro is defined.
| `cache_persistence_level_img`
| Level of cache persistence. Refer to the Low-level Memory and Cache Control Functions section for a detailed description
of the built-in functions that use this enum.

Requires that the `__opencl_img_load_store` feature macro is defined.
|====

(Add a new Section 6.15.22, *Low-level Memory and Cache Control Functions*) ::
+
--
The OpenCL C programming language implements the following built-in functions
to perform low-level memory and cache control operations:

[cols="1,2",options="header"]
|====
| Function | Description
| void *img_fence*(cache_target_img target)
a| `img_fence` issues a data fence as far as the specified `target`. For example, using `cache_target_L2_img` issues a data fence for the L1 and L2 caches.

Requires that the `__opencl_img_fence` feature macro is defined.
| void *img_cache_flush*(cache_target_img target)
a| `img_cache_flush` flushes cache, `target` determines how far through the memory hierarchy caches are flushed. For example, using `cache_target_L2_img` flushes the L1 and L2 caches.

Requires that the `__opencl_img_cache` feature macro is defined.
| void *img_cache_invalidate*(cache_target_img target)
a| `img_cache_invalidate` invalidates cache, `target` determines how far through the memory hierarchy caches are invalidated. For example, using `cache_target_L2_img` invalidates the L1 and L2 caches.

Requires that the `__opencl_img_cache` feature macro is defined.
| void *img_cache_flush_invalidate*(cache_target_img target)
a| `img_cache_flush_invalidate` flushes and invalidates cache, `target` determines how far through the memory hierarchy caches are flushed and invalidated. For example, using `cache_target_L2_img` flushes and invalidates the L1 and L2 caches.

Requires that the `__opencl_img_cache` feature macro is defined.
| gentype *img_load*(gentype pass:[*]p, cache_coherence_img coherence, L2_cache_policy_img policy, cache_persistence_level_img persistence, bool volatile) +
gentype *img_load*(const gentype pass:[*]p, cache_coherence_img coherence, L2_cache_policy_img policy, cache_persistence_level_img persistence, bool volatile)
a| `img_load` returns sizeof(gentype) bytes of data from `p`, where `coherence` specifies the level of cache coherence, `policy` specifies the cache policy for the L2 cache, `persistence` specifies the level of cache persistence, and `volatile` specifies volatility.

Requires that the `__opencl_img_load_store` feature macro is defined.
| void *img_store*(gentype pass:[*]p, gentype value, cache_coherence_img coherence, L2_cache_policy_img policy, cache_persistence_level_img persistence, bool volatile)
a| `img_store` writes 'value' to `p`, where `coherence` specifies the level of cache coherence, `policy` specifies the cache policy for the L2 cache, `persistence` specifies the level of cache persistence, and `volatile` specifies volatility.

Requires that the `__opencl_img_load_store` feature macro is defined.
|====
--

=== Cache Target

The enumerated type `cache_target_img` specifies the target of the cache control functions.
The following table lists the enumeration constants:

[cols=",",options="header",]
|====
| Cache Target | Additional Notes
| `cache_target_L1_img`
| Performs the operation on the L1 cache.

Requires that the `__opencl_img_fence` or the `__opencl_img_cache` feature macro is defined.
| `cache_target_L2_img`
| Performs the operation on the L1 and L2 caches.

Requires that the `__opencl_img_fence` or the `__opencl_img_cache` feature macro is defined.
| `cache_target_external_img`
| Performs the operation on the L1, L2, and external caches.

Requires that the `__opencl_img_fence` or the `__opencl_img_cache` feature macro is defined.
|====

=== Cache Persistence
Cache persistence modifies the priority of the request in the cache where low level means that requests are evicted quickly and high level means that requests remain in cache for a long time.
The enumerated type `cache_persistence_level_img` specifies the level of cache persistence.
The following table lists the enumeration constants:

[cols=",",options="header",]
|====
| Cache Persistence | Additional Notes
| `cache_persistence_level_default_img`
| `cache_persistence_level_min_img` is the default persistence level.

Requires that the `__opencl_img_load_store` feature macro is defined.
| `cache_persistence_level_min_img`
| Requires that the `__opencl_img_load_store` feature macro is defined.
| `cache_persistence_level_low_img`
| Requires that the `__opencl_img_load_store` feature macro is defined.
| `cache_persistence_level_high_img`
| Requires that the `__opencl_img_load_store` feature macro is defined.
| `cache_persistence_level_max_img`
| Requires that the `__opencl_img_load_store` feature macro is defined.
|====

[[cache-coherence]]
==== Cache Coherence
The enumerated type `cache_coherence_img` specifies the level of cache coherence.
The following table lists the enumeration constants:

[cols=",",options="header",]
|====
| Cache Coherence | Additional Notes
| `cache_coherence_L1_img`
| Cache coherence is guaranteed at the L1 level. Explicit flush or invalidate may be needed to ensure data coherency at higher levels.

Requires that the `__opencl_img_load_store` feature macro is defined.
| `cache_coherence_L2_img`
| Cache coherence is guaranteed at the L2 level. Explicit flush or invalidate may be needed to ensure data coherency at higher levels.

Requires that the `__opencl_img_load_store` feature macro is defined.
|====

=== L2 Cache Policy
The enumerated type `L2_cache_policy_img` specifies the cache policy for the L2 cache.
The following table lists the enumeration constants:

[cols=",",options="header",]
|====
| L2 Cache Policy | Additional Notes
| `L2_cache_policy_new_alloc_img`
| Allocates a new cache line on a cache miss.

Requires that the `__opencl_img_load_store` feature macro is defined.
| `L2_cache_policy_bypass_img`
| Permits to bypass the cache and access memory directly.

Requires that the `__opencl_img_load_store` feature macro is defined.
|====

== Coding Sample

This coding sample shows how to use the *img_load* and *img_store* functions:
[source]
----
__kernel void test(__global int *in, __global int *out) {
int a = img_load(in, cache_coherence_L1_img, L2_cache_policy_new_alloc_img, cache_persistence_level_min_img, true);
a += 1;
img_store(out, a, cache_coherence_L1_img, L2_cache_policy_new_alloc_img, cache_persistence_level_min_img, true);
}
----

== Version History

[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|====
| Version | Date | Author | Changes
| 1.0.0 | 2024-06-19 | Tomasz Platek | *Initial revision*
|====

2 changes: 2 additions & 0 deletions extensions/extensions.txt
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,8 @@ include::cl_img_generate_mipmap.asciidoc[]
<<<
include::cl_img_matrix_multiply.asciidoc[]
<<<
include::cl_img_memory_management.asciidoc[]
<<<
include::cl_img_mem_properties.asciidoc[]
<<<
include::cl_img_swap_ops.asciidoc[]
Expand Down

0 comments on commit aca3750

Please sign in to comment.