Skip to content

Commit

Permalink
[SYCL][Doc] Add specialization constant-length alloca extension proposal
Browse files Browse the repository at this point in the history
Document extension proposal for specialization constant length private
memory allocations. Users will be able to perform dynamic memory
allocations using specialization constants and a new `private_alloca`
function returning a `private_ptr` that will be automatically freed on
function return.

This is included as an experimental extension as implementation will
shortly follow once the extension is approved.

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
  • Loading branch information
victor-eds committed Feb 8, 2024
1 parent 0eac618 commit 001b54d
Showing 1 changed file with 195 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
= sycl_ext_oneapi_spec_constant_length_alloca

:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en
:dpcpp: pass:[DPC++]

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}


== Notice

[%hardbreaks]
Copyright (C) Codeplay Software Limited. All rights reserved.

Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
permission by Khronos.


== Contact

To report problems with this extension, please open a new issue at:

https://github.com/intel/llvm/issues


== Dependencies

This extension is written against the SYCL 2020 revision 8 specification. All
references below to the "core SYCL specification" or to section numbers in the
SYCL specification refer to that revision.


== Status

This is an experimental extension specification, intended to provide early
access to features and gather community feedback. Interfaces defined in this
specification are implemented in {dpcpp}, but they are not finalized and may
change incompatibly in future versions of {dpcpp} without prior notice.
*Shipping software products should not rely on APIs defined in this
specification.*

== Overview

C++ arrays and `std::array` can be used in SYCL code to represent fixed-size
sequences of objects. However, these constructs have a significant restriction:
the number of elements must be known at compile time. In host-code context,
users can make use of dynamic memory allocations, e.g., `std::vector`, but this
is not the case in SYCL device code.

SYCL specialization constants (SYCL Section 4.9.5.) can be used to represent
constants whose values can be set dynamically during the execution of a SYCL
application, but that will not change when a SYCL kernel function is
invoked. This way, specialization constants could be used to implement SYCL
private arrays whose size is given during the execution of the SYCL
application. There is no possible way of implementing this using `std::array`,
as the size of such container must be known at compile time, so we propose to
define a new `private_alloca` function whose size is specified using SYCL
specialization constants.

[NOTE]
====
This extension only supports SPIR-V backends for now, as it relies on
SPIR-V-specific capabilities,
such as specialization constants.
====

== Specification

=== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
specification. An implementation supporting this extension must predefine the
macro `SYCL_EXT_ONEAPI_SPEC_CONSTANT_LENGTH_ALLOCA` to one of the values defined
in the table below. Applications can test for the existence of this macro to
determine if the implementation supports this feature, or applications can test
the macro's value to determine which of the extension's features the
implementation supports.

[%header,cols="1,5"]
|===
|Value
|Description

|1
|The APIs of this experimental extension are not versioned, so the
feature-test macro always has this value.
|===

=== The `private_alloca` function

[source,c++]
----
namespace sycl::ext::oneapi::experimental {
template <typename ElementType, auto &SpecName,
access::decorated DecorateAddress>
private_ptr<ElementType, DecorateAddress>
private_alloca(kernel_handler &h);
} // namespace sycl::ext::oneapi::experimental
----

This extension adds a new `private_alloca` function that can be used to allocate
a private memory region with capacity for
`h.get_specialization_constant<SpecName>()` elements of type `ElementType`. The
returned pointer will be aligned for `ElementType`.

The underlying memory region is automatically freed when the caller to
`private_alloca` is returned.

In case of private memory exhaustion, the underlying backend must report an
error in the same fashion as if the allocation size were static.

If this function is called from host context or an unsupported backend, an
`exception` with the `errc::feature_not_supported` error code must be thrown.

`ElementType` must be a cv-unqualified trivial type. The return memory is
default initialized.

==== Parameters

`h`:: `sycl::kernel_handler` used to obtain the value of `SpecName`

==== Template Parameters

`ElementType`:: Cv-unqualified trivial type serving as `value_type` of the
returned `sycl::multi_ptr`.
`SpecName`:: `sycl::specialization_id` of integral `value_type`. The allocated
memory region has capacity for `h.get_specialization_constant<SpecName>`
`ElementType` elements. The default value for the specialization constant must
be at least one and the specialization constant must not be set to a value
less than one. Setting the specialization constant to a value less than 1 or
providing a default value less than 1 is undefined behaviour.
`DecorateAddress`:: Whether the returned `sycl::multi_ptr` is decorated or not.

==== Return Value

`sycl::private_ptr` to a region of `h.get_specialization_constant<SpecName>()`
elements of type `ElementType` aligned for such type. The underlying memory
region will be automatically deallocated when the function from which
`private_alloca` is called returns.

== Example usage

This non-normative section shows some example usages of the extension.

[source,c++]
----
constexpr specialization_id<int> size(1);
class Kernel;
SYCL_EXTERNAL void impl(const float *in, float *out, size_t n,
decorated_private_ptr<float> ptr);
void run(queue q, const float *in, float *out, size_t n) {
q.submit([&](handler &h) {
h.set_specialization_constant<size>(n);
h.parallel_for<Kernel>(n, [=](id<1> i, kernel_handler h) {
// Allocate memory for 'n' 'float's
auto ptr = private_alloca<float, size, access::decorated::yes>(h);
// Use pointer in implementation
impl(in, out, h.get_specialization_constant<size>(), ptr);
});
});
----

== Design Constraints

The big design constraint stems from the unknown allocation size at compile
time. C++ does not support variable length arrays and complete type sizes must
be known at compile time. Thus, the free function interface returning a pointer
to private memory is the better way to represent this construct in C++. Lifetime
of the underlying memory region was a concern too, but the current design
automatically freeing the memory when the caller is returned is in line with
similar constructs in other platforms.

== Revision History

[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|========================================
|Rev|Date|Authors|Changes
|1|2024-02-08|Victor Lomüller, Lukas Sommer, Victor Perez, Julian Oppermann, Tadej Ciglaric, Romain Biessy|*Initial draft*
|========================================

0 comments on commit 001b54d

Please sign in to comment.