Commit

Enhancement and bug fixes for 1.9.3 release
Misc infrastructure updates, update sampler support for bindless images, update new format support for images
pbg-intel authored May 3, 2024
1 parent 88819b5 commit 17336b7
Showing 15 changed files with 422 additions and 144 deletions.
9 changes: 5 additions & 4 deletions scripts/Doxyfile
Expand Up @@ -1226,7 +1226,7 @@ HTML_COLORSTYLE_GAMMA = 80
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_TIMESTAMP = YES
# HTML_TIMESTAMP = YES

# If the HTML_DYNAMIC_MENUS tag is set to YES then the generated HTML
# documentation will contain a main index with vertical navigation menus that
Expand Down Expand Up @@ -1662,7 +1662,7 @@ EXTRA_SEARCH_MAPPINGS =
# If the GENERATE_LATEX tag is set to YES, doxygen will generate LaTeX output.
# The default value is: YES.

GENERATE_LATEX = YES
GENERATE_LATEX = NO

# The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put. If a
# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of
Expand Down Expand Up @@ -1854,7 +1854,7 @@ LATEX_BIB_STYLE = plain
# The default value is: NO.
# This tag requires that the tag GENERATE_LATEX is set to YES.

LATEX_TIMESTAMP = YES
# LATEX_TIMESTAMP = YES

# The LATEX_EMOJI_DIRECTORY tag is used to specify the (relative or absolute)
# path from which the emoji images will be read. If a relative path is entered,
Expand Down Expand Up @@ -2142,7 +2142,8 @@ INCLUDE_FILE_PATTERNS =
# recursively expanded use the := operator instead of the = operator.
# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.

PREDEFINED = __cplusplus
PREDEFINED = __cplusplus \
"module=modul3"

# If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then this
# tag can be used to specify a list of macro names that should be expanded. The
106 changes: 105 additions & 1 deletion scripts/core/EXT_EXP_BindlessImages.rst
Expand Up @@ -47,10 +47,17 @@ In this extension, we propose the following additions:
* Provide a new image descriptor and flags for Bindless images.
* Support for creation of images on linearly allocated memory backed by USM.
* Extension API to create an image handle from pitched memory
* Create Bindless sampled images

A "Bindless image" can be created by passing ${x}_image_bindless_exp_desc_t to the pNext member of
${x}_image_desc_t and setting the flags value to ${X}_IMAGE_BINDLESS_EXP_FLAG_BINDLESS.

A "Bindless sampled image" can be created by passing ${x}_image_bindless_exp_desc_t to the pNext member of
${x}_image_desc_t and setting the flags to a combination of ${X}_IMAGE_BINDLESS_EXP_FLAG_BINDLESS and ${X}_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE.
When an image view is created from a bindless sampled image, the sampling modes can be redefined by passing a sampler descriptor in the pNext field of the ${x}_image_bindless_exp_desc_t struct.
An image view created from a bindless sampled image without setting ${X}_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE is an unsampled image.
A sampled image view can be created from a bindless unsampled image by setting ${X}_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE and passing a sampler descriptor in the pNext field of the ${x}_image_bindless_exp_desc_t struct.

This extension is complementary to and may be used in conjunction with the `ZE_extension_image_view <https://spec.oneapi.io/level-zero/latest/core/EXT_ImageView.html#image-view-extension>`_ extension.

Programming example with Bindless images
Expand Down Expand Up @@ -178,4 +185,101 @@ Programming example with pitched memory usage
// Once all operations on the image are complete we need to destroy the image handle and free the memory
${x}ImageDestroy(hImage);
${x}MemFree(hContext, pitchedPtr);
${x}MemFree(hContext, pitchedPtr);
Programming example with Bindless sampled images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. parsed-literal::
// 2D image dimensions
size_t imageWidth = 1024;
size_t imageHeight = 1024;
// Single-precision float image format with one channel
${x}_image_format_t imageFormat = {
ZE_IMAGE_FORMAT_LAYOUT_32, ZE_IMAGE_FORMAT_TYPE_FLOAT,
ZE_IMAGE_FORMAT_SWIZZLE_R, ZE_IMAGE_FORMAT_SWIZZLE_X,
ZE_IMAGE_FORMAT_SWIZZLE_R, ZE_IMAGE_FORMAT_SWIZZLE_X
};
// Define sampler descriptor
${x}_sampler_desc_t samplerDesc = {
ZE_STRUCTURE_TYPE_SAMPLER_DESC,
nullptr,
ZE_SAMPLER_ADDRESS_MODE_CLAMP,
ZE_SAMPLER_FILTER_MODE_LINEAR,
true
};
// Create an image descriptor for bindless image
${x}_image_desc_t imageDesc = {
ZE_STRUCTURE_TYPE_IMAGE_DESC,
nullptr,
0,
ZE_IMAGE_TYPE_2D,
imageFormat,
imageWidth, imageHeight, 0, 0, 0
};
${x}_image_bindless_exp_desc_t bindlessImageDesc = {ZE_STRUCTURE_TYPE_BINDLESS_IMAGE_EXP_DESC};
bindlessImageDesc.flags = ZE_IMAGE_BINDLESS_EXP_FLAG_BINDLESS | ZE_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE;
imageDesc.pNext = &bindlessImageDesc;
bindlessImageDesc.pNext = &samplerDesc;
// Create bindless sampled image
// pass ZE_IMAGE_BINDLESS_EXP_FLAG_BINDLESS and ZE_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE to zeImageCreate()
${x}_image_handle_t hImage;
${x}ImageCreate(hContext, hDevice, &imageDesc, &hImage);
// Create an image view from bindless sampled image
// define sampler descriptor for view
${x}_sampler_desc_t samplerDescForView = {
ZE_STRUCTURE_TYPE_SAMPLER_DESC,
nullptr,
ZE_SAMPLER_ADDRESS_MODE_CLAMP,
ZE_SAMPLER_FILTER_MODE_NEAREST,
true
};
${x}_image_format_t imageViewFormat = {
ZE_IMAGE_FORMAT_LAYOUT_32, ZE_IMAGE_FORMAT_TYPE_UINT,
ZE_IMAGE_FORMAT_SWIZZLE_R, ZE_IMAGE_FORMAT_SWIZZLE_X,
ZE_IMAGE_FORMAT_SWIZZLE_R, ZE_IMAGE_FORMAT_SWIZZLE_X
};
// image descriptor for bindless image view
${x}_image_desc_t imageViewDesc = {
ZE_STRUCTURE_TYPE_IMAGE_DESC,
nullptr,
0,
ZE_IMAGE_TYPE_2D,
imageViewFormat,
128, 128, 0, 0, 0
};
imageViewDesc.pNext = &bindlessImageDesc;
bindlessImageDesc.pNext = &samplerDescForView;
${x}_image_handle_t hImageView;
${x}ImageViewCreateExt(hContext, hDevice, &imageViewDesc, hImage, &hImageView);
// If ZE_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE is not set, an unsampled image view is created
${x}_image_handle_t hUnsampledImageView;
bindlessImageDesc.flags = ZE_IMAGE_BINDLESS_EXP_FLAG_BINDLESS;
bindlessImageDesc.pNext = nullptr;
${x}ImageViewCreateExt(hContext, hDevice, &imageViewDesc, hImage, &hUnsampledImageView);
// Create an image view from bindless unsampled image
${x}_image_handle_t hUnsampledImage;
${x}_image_handle_t hSampledImageView;
bindlessImageDesc.flags = ZE_IMAGE_BINDLESS_EXP_FLAG_BINDLESS;
bindlessImageDesc.pNext = nullptr;
imageDesc.pNext = &bindlessImageDesc;
// create unsampled image
${x}ImageCreate(hContext, hDevice, &imageDesc, &hUnsampledImage);
bindlessImageDesc.flags = ZE_IMAGE_BINDLESS_EXP_FLAG_BINDLESS | ZE_IMAGE_BINDLESS_EXP_FLAG_SAMPLED_IMAGE;
bindlessImageDesc.pNext = &samplerDescForView;
${x}ImageViewCreateExt(hContext, hDevice, &imageDesc, hUnsampledImage, &hSampledImageView);
7 changes: 6 additions & 1 deletion scripts/core/EXT_Exp_ImageView.rst
Expand Up @@ -6,6 +6,11 @@ from templates import helper as th
x=tags['$x']
X=x.upper()
%>

<%!
from parse_specs import _version_compare_gequal
%>

:orphan:

.. _ZE_experimental_image_view:
Expand All @@ -14,7 +19,7 @@ from templates import helper as th
Image View Extension
=========================

%if ver >= 1.5:
%if _version_compare_gequal(ver, "1.5"):
This experimental extension is deprecated and replaced by the :ref:`${th.subt(namespace, tags, X)}_extension_image_view <${th.subt(namespace, tags, X)}_extension_image_view>` standard extension.
%endif

7 changes: 6 additions & 1 deletion scripts/core/EXT_Exp_ImageViewPlanar.rst
Expand Up @@ -6,6 +6,11 @@ from templates import helper as th
x=tags['$x']
X=x.upper()
%>

<%!
from parse_specs import _version_compare_gequal
%>

:orphan:

.. _ZE_experimental_image_view_planar:
Expand All @@ -14,7 +19,7 @@ from templates import helper as th
Image View Planar Extension
=============================

%if ver >= 1.5:
%if _version_compare_gequal(ver, "1.5"):
This experimental extension is deprecated and replaced by the :ref:`${th.subt(namespace, tags, X)}_extension_image_view_planar <${th.subt(namespace, tags, X)}_extension_image_view_planar>` standard extension.
%endif

41 changes: 23 additions & 18 deletions scripts/core/PROG.rst
Expand Up @@ -4,6 +4,11 @@
x=tags['$x']
X=x.upper()
%>

<%!
from parse_specs import _version_compare_less, _version_compare_gequal
%>

.. _core-programming-guide:

========================
Expand Down Expand Up @@ -47,7 +52,7 @@ The following diagram illustrates the relationship between the driver, device an

.. image:: ../images/core_device.png

%if ver >= 1.7:
%if _version_compare_gequal(ver, "1.7"):
Level Zero device model hierarchy is composed of **Root Devices** and **Sub-Devices**: A root-device may contain two or more sub-devices and a sub-device shall belong to a single root-device.
A root-device may not contain a single sub-device, as that would be the same root-device. A root device may also be a device with no sub-devices.

Expand Down Expand Up @@ -620,10 +625,10 @@ External memory handles may be imported from other APIs, or exported for use in
Importing and exporting external memory is an optional feature.
Devices may describe the types of external memory handles they support using ${x}DeviceGetExternalMemoryProperties.

%if ver >= 1.5:
%if _version_compare_gequal(ver, "1.5"):
Importing and exporting external memory is supported for device and host memory allocations and images.
%endif
%if ver < 1.5:
%if _version_compare_less(ver, "1.5"):
Importing and exporting external memory is supported for device memory allocations and images.
%endif

Expand Down Expand Up @@ -1104,10 +1109,10 @@ A kernel timestamp event is a special type of event that records device timestam
.. parsed-literal::
// Get timestamp frequency
%if ver >= 1.1:
%if _version_compare_gequal(ver, "1.1"):
const double timestampFreq = NS_IN_SEC / device_properties.timerResolution;
%endif
%if ver < 1.1:
%if _version_compare_less(ver, "1.1"):
const uint64_t timestampFreq = device_properties.timerResolution;
%endif
const uint64_t timestampMaxValue = ~(-1L << device_properties.kernelTimestampValidBits);
Expand Down Expand Up @@ -1712,7 +1717,7 @@ Environment Variables

The following table documents the supported knobs for overriding default functional behavior.

%if ver < 1.7:
%if _version_compare_less(ver, "1.7"):

+-----------------+-------------------------------------+------------+-----------------------------------------------------------------------------------+
| Category | Name | Values | Description |
Expand All @@ -1726,7 +1731,7 @@ The following table documents the supported knobs for overriding default functio

%endif

%if ver >= 1.7:
%if _version_compare_gequal(ver, "1.7"):

+-----------------+-------------------------------------+-----------------------------------+-----------------------------------------------------------------------------------+
| Category | Name | Values | Description |
Expand Down Expand Up @@ -1766,7 +1771,7 @@ The values are specific to system configuration; e.g., the number of devices and
The values are specific to the order in which devices are reported by the driver; i.e., the first device maps to ordinal 0, the second device to ordinal 1, and so forth.
If the affinity mask is not set, then all devices and sub-devices are reported, which is the default behavior.

%if ver >= 1.7:
%if _version_compare_gequal(ver, "1.7"):
The affinity mask masks the devices as defined by the value set in the ${X}_FLAT_DEVICE_HIERARCHY environment variable, i.e., a Level Zero driver shall
first read ${X}_FLAT_DEVICE_HIERARCHY to determine the device handles to be used by the application and then interpret the values passed in ${X}_AFFINITY_MASK
based on the device model selected.
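
As a rough illustration of how an ordinal-based mask could be interpreted, the Python sketch below parses a mask string and filters a device list. It assumes comma-separated entries in a ``device`` or ``device.sub-device`` notation, uses hypothetical helper names (``parse_affinity_mask``, ``visible``), and ignores both the device hierarchy setting and any renumbering of the reported devices. It is not driver code.

```python
def parse_affinity_mask(mask):
    """Parse a mask such as "0.1, 1" into [(0, 1), (1, None)].

    Assumed notation: "D" selects device D with all of its sub-devices,
    "D.S" selects only sub-device S of device D.
    """
    entries = []
    for token in mask.split(","):
        token = token.strip()
        if not token:
            continue
        device, _, sub = token.partition(".")
        entries.append((int(device), int(sub) if sub else None))
    return entries


def visible(mask, num_devices, subs_per_device):
    """Return {device ordinal: [sub-device ordinals]} selected by the mask.

    An unset or empty mask reports every device and sub-device.
    """
    if not mask or not mask.strip():
        return {d: list(range(subs_per_device)) for d in range(num_devices)}
    selected = {}
    for device, sub in parse_affinity_mask(mask):
        if device >= num_devices:
            continue  # ordinal does not exist on this system
        subs = selected.setdefault(device, [])
        wanted = range(subs_per_device) if sub is None else [sub]
        for s in wanted:
            if s < subs_per_device and s not in subs:
                subs.append(s)
    return selected
```

For a system with two devices of four sub-devices each, ``visible("0, 1", 2, 4)`` selects everything, matching the unset default, while ``visible("1.1, 1.2", 2, 4)`` selects only two sub-devices of the second device.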
Expand All @@ -1776,7 +1781,7 @@ The order of the devices reported by the ${x}DeviceGet is implementation-specifi

The order of the devices reported by the ${x}DeviceGet can be forced to be consistent by setting the ${X}_ENABLE_PCI_ID_DEVICE_ORDER environment variable.

%if ver < 1.7:
%if _version_compare_less(ver, "1.7"):
The following examples demonstrate proper usage for a system configuration of two devices, each with four sub-devices:

- `0, 1`: all devices and sub-devices are reported (same as default)
Expand All @@ -1788,7 +1793,7 @@ The following examples demonstrate proper usage for a system configuration of tw

%endif

%if ver >= 1.7:
%if _version_compare_gequal(ver, "1.7"):
The following examples demonstrate proper usage for a system configuration composed of two physical devices, each of which can be further
sub-divided into four smaller devices. For the purpose of these examples, we will refer to the two physical devices as `parent devices`
and to the smaller sub-devices as `tiles`.
Expand Down Expand Up @@ -2125,10 +2130,10 @@ such as multiple levels of indirection, there are two methods available:

+ If the driver is unable to make all allocations resident, then the call to ${x}CommandQueueExecuteCommandLists will return ${X}_RESULT_ERROR_OUT_OF_DEVICE_MEMORY

%if ver >= 1.6:
%if _version_compare_gequal(ver, "1.6"):
2. Explicit ${x}ContextMakeMemoryResident APIs are included for the application to dynamically change residency as needed.
%endif
%if ver < 1.6:
%if _version_compare_less(ver, "1.6"):
2. Explicit ${x}ContextMakeMemoryResident APIs are included for the application to dynamically change residency as needed. (Windows-only)
%endif

Expand Down Expand Up @@ -2283,18 +2288,18 @@ The following code examples demonstrate how to use the memory IPC APIs:
${x}MemCloseIpcHandle(hContext, dptr);
%if ver >= 1.6:
%if _version_compare_gequal(ver, "1.6"):
5. Finally, return the IPC handle to the driver with ${x}MemPutIpcHandle and
free the device pointer in the sending process. If ${x}MemPutIpcHandle is not called,
any actions performed by that call are eventually done by ${x}MemFree.
%endif
%if ver < 1.6:
%if _version_compare_less(ver, "1.6"):
5. Finally, free the device pointer in the sending process:
%endif

.. parsed-literal::
%if ver >= 1.6:
%if _version_compare_gequal(ver, "1.6"):
${x}MemPutIpcHandle(hContext, hIpc);
%endif
${x}MemFree(hContext, dptr);
Expand Down Expand Up @@ -2384,19 +2389,19 @@ Note, there is no guaranteed address equivalence for the values of ``hEvent`` in
${x}EventDestroy(hEvent);
${x}EventPoolCloseIpcHandle(&hEventPool);
%if ver >= 1.6:
%if _version_compare_gequal(ver, "1.6"):
5. Finally, return the IPC handle to the driver with ${x}EventPoolPutIpcHandle and
free the event pool in the sending process. If ${x}EventPoolPutIpcHandle is not called,
any actions performed by that call are eventually done by ${x}EventPoolDestroy.
%endif
%if ver < 1.6:
%if _version_compare_less(ver, "1.6"):
5. Finally, free the event pool handle in the sending process:
%endif

.. parsed-literal::
${x}EventDestroy(hEvent);
%if ver >= 1.6:
%if _version_compare_gequal(ver, "1.6"):
${x}EventPoolPutIpcHandle(hContext, hIpcEventPool);
%endif
${x}EventPoolDestroy(hEventPool);
12 changes: 8 additions & 4 deletions scripts/core/SPIRV.rst
Expand Up @@ -7,6 +7,10 @@ from templates import helper as th
x=tags['$x']
X=x.upper()
%>
<%!
from parse_specs import _version_compare_gequal
%>

==========================
SPIR-V Programming Guide
==========================
Expand Down Expand Up @@ -412,7 +416,7 @@ The following restrictions apply to the
words, the write must begin at a 32-bit boundary. There is no
restriction on the y-component of the coordinate.

%if ver >= 1.1:
%if _version_compare_gequal(ver, "1.1"):
Floating-Point Atomics
----------------------

Expand Down Expand Up @@ -459,7 +463,7 @@ Additionally:

%endif

%if ver >= 1.2:
%if _version_compare_gequal(ver, "1.2"):
Extended Subgroups
------------------

Expand Down Expand Up @@ -651,7 +655,7 @@ optional *ClusterSize* operand.

%endif

%if ver >= 1.2:
%if _version_compare_gequal(ver, "1.2"):
Linkonce ODR
------------

Expand All @@ -664,7 +668,7 @@ include the **LinkOnceODR** linkage type.

%endif

%if ver >= 1.5:
%if _version_compare_gequal(ver, "1.5"):
Bfloat16 Conversions
--------------------
