How to Use the Intercept Layer for OpenCL Applications

This file is automatically generated using the script generate_controls_doc.py. Please do not edit it manually!

By default, the Intercept Layer for OpenCL Applications will not modify any OpenCL calls. You may notice some status messages being printed to stderr, but otherwise your application should run exactly as it does without the Intercept Layer for OpenCL Applications.

Controls

The Intercept Layer for OpenCL Applications is controlled using the Windows registry, Linux configuration files, or environment variables on all OSes.

Windows Registry

On Windows, the Intercept Layer for OpenCL Applications reads its registry keys from:

HKEY_CURRENT_USER\SOFTWARE\INTEL\IGFX\CLIntercept

This is the recommended registry location as it has several advantages over HKEY_LOCAL_MACHINE: modifying the registry keys does not require Administrator access, registry keys do not need to be set in multiple places, and each user can set their own registry keys without affecting other users.

For backwards compatibility, the Intercept Layer for OpenCL Applications will still read registry keys from:

// For 32-bit systems, or 64-bit applications on a 64-bit system:
HKEY_LOCAL_MACHINE\SOFTWARE\INTEL\IGFX\CLIntercept

// For 32-bit applications on a 64-bit system:
HKEY_LOCAL_MACHINE\SOFTWARE\WoW6432Node\INTEL\IGFX\CLIntercept

If a registry key is set in both HKCU and HKLM, the setting in HKCU will "win".

Linux, Android, and OSX Configuration Files

On Linux, Android, and OSX, the Intercept Layer for OpenCL Applications will read control values from a config file named clintercept.conf. To set a control in a config file, put the control name on its own line, followed by an equals sign, followed by the value to set the control to. Lines that begin with a semicolon (";"), a hash mark ("#"), or a C++-style comment ("//") are ignored. For example, to enable CallLogging, put a line in your clintercept.conf that looks like:

// Enable CallLogging:
CallLogging=1

The Intercept Layer for OpenCL Applications will search for config files in the following directories, in order:

  • The current user's HOME directory (~).
  • The sdcard directory (for Android only).
  • The system directory /etc/OpenCL.
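As a minimal sketch of the format and locations described above, the following snippet writes a config file that enables three controls and uses each of the three comment styles. CONF_DIR is a stand-in for your HOME directory; it is an assumption made here so that running the sketch does not overwrite a real config file.

```shell
# CONF_DIR stands in for your HOME directory (assumption for this sketch).
CONF_DIR="$(mktemp -d)"

cat > "$CONF_DIR/clintercept.conf" <<'EOF'
// Log every OpenCL call, and send the log to a file:
CallLogging=1
LogToFile=1
# Also log OpenCL errors:
ErrorLogging=1
; Disabled for now:
; BuildLogging=1
EOF

# Show only the lines that set a control; comment lines are filtered out.
grep -v -E '^(;|#|//)' "$CONF_DIR/clintercept.conf"
```

For real use, the same file contents would go into clintercept.conf in one of the directories listed above.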

Environment Variables

The Intercept Layer for OpenCL Applications may also be controlled using environment variables. The name of each environment variable is the prefix "CLI_" followed by the control name. The prefix distinguishes controls from other environment variables and makes it easy to list all of the environment variable controls. So, to enable CallLogging, you could type:

export CLI_CallLogging=1

To disable CallLogging, you could type:

unset CLI_CallLogging

To list all environment variable controls, you could type:

env | grep CLI_
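If you would rather not change your shell's environment, a common shell idiom is to set the variables for a single command only. The sketch below assumes a POSIX-style shell; APP is a placeholder for your OpenCL application's command line and defaults to the no-op "true" command so that the snippet runs as-is.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# Set two controls for this one invocation only.
CLI_CallLogging=1 CLI_LogToFile=1 $APP

# The controls were not exported, so the current shell is unchanged.
env | grep '^CLI_' || echo "no CLI_ controls are set in this shell"
```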

Case Sensitivity

Controls should generally be considered case sensitive. Some methods of setting controls on some operating systems may treat controls as case insensitive, but it is unsafe to rely on case insensitive behavior.

Control Verification

To verify that a control has been set correctly and is taking effect, please check the Intercept Layer for OpenCL Applications log file, which will record any controls that are set to non-default values.

Setup and Loading Controls

OpenCLFileName (string)

Used to control the DLL or Shared Library that the Intercept Layer for OpenCL Applications loads to make real OpenCL calls. This can be a relative file name or a full absolute file name, but an absolute file name is recommended. If present, only this file name is loaded. If omitted, the Intercept Layer for OpenCL Applications tries to load the real OpenCL from file names in this order:

For Windows:

  • real_OpenCL.dll (anywhere in the system path)
  • %WINDIR%\SysWOW64\OpenCL.dll (32-bit DLLs only)
  • %WINDIR%\System32\OpenCL.dll

For Linux:

  • ./real_libOpenCL.so
  • /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 (optional, for systems with a detected multi-arch specifier)
  • /usr/lib/x86_64-linux-gnu/libOpenCL.so
  • /usr/lib/libOpenCL.so.1
  • /usr/lib/libOpenCL.so
  • /usr/local/lib/libOpenCL.so.1
  • /usr/local/lib/libOpenCL.so
  • /opt/intel/opencl/lib64/libOpenCL.so.1
  • /opt/intel/opencl/lib64/libOpenCL.so
  • /glob/development-tools/oneapi/inteloneapi/compiler/latest/linux/lib/libOpenCL.so.1
  • /glob/development-tools/oneapi/inteloneapi/compiler/latest/linux/lib/libOpenCL.so

For Android:

  • /system/vendor/lib/real_libOpenCL.so
  • real_libOpenCL.so

This control is not used for OSX.

This control used to be called DllName. The old name may still be used for backwards compatibility, but switching to the new name is recommended.
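To get a rough idea of which file the Linux search order above would reach first on a given system, one could probe the same list from a shell. This is only a sketch: it checks that each file exists, whereas the Intercept Layer for OpenCL Applications actually tries to load it, so the two results may differ.

```shell
# Candidate file names, in the documented Linux search order.
first=""
for f in \
    ./real_libOpenCL.so \
    /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 \
    /usr/lib/x86_64-linux-gnu/libOpenCL.so \
    /usr/lib/libOpenCL.so.1 \
    /usr/lib/libOpenCL.so \
    /usr/local/lib/libOpenCL.so.1 \
    /usr/local/lib/libOpenCL.so \
    /opt/intel/opencl/lib64/libOpenCL.so.1 \
    /opt/intel/opencl/lib64/libOpenCL.so \
    /glob/development-tools/oneapi/inteloneapi/compiler/latest/linux/lib/libOpenCL.so.1 \
    /glob/development-tools/oneapi/inteloneapi/compiler/latest/linux/lib/libOpenCL.so
do
    # Stop at the first candidate that exists.
    if [ -e "$f" ]; then
        first="$f"
        break
    fi
done

if [ -n "$first" ]; then
    echo "first existing candidate: $first"
else
    echo "no candidate found; consider setting CLI_OpenCLFileName explicitly"
fi
```

If the result is not the library you expect, setting OpenCLFileName to an absolute file name avoids the search entirely.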

BreakOnLoad (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will break into the debugger when it is loaded.

Logging Controls

SuppressLogging (bool)

If set to a nonzero value, suppresses all logging output from the Intercept Layer for OpenCL Applications. This is particularly useful for tools that only want report data.

AppendFiles (bool)

By default, the Intercept Layer for OpenCL Applications log files will be created from scratch when the intercept DLL is loaded, and any Intercept Layer for OpenCL Applications report files will be created from scratch when the intercept DLL is unloaded. If AppendFiles is set to a nonzero value, the Intercept Layer for OpenCL Applications will append to an existing file instead of recreating it. This can be useful if an application loads and unloads the intercept DLL multiple times, or to simply preserve log or report data from run-to-run.

LogToFile (bool)

If set to a nonzero value, sends log information to the file "clintercept_log.txt" instead of to stderr.

LogToDebugger (bool)

If set to a nonzero value, sends log information to the debugger instead of to stderr. If both LogToFile and LogToDebugger are nonzero then log information will be sent both to a file and to the debugger.

LogIndent (int)

Indents each log entry by this many spaces.

BuildLogging (bool)

If set to a nonzero value, logs the program build log after each call to clBuildProgram(). This will likely only function correctly for synchronous builds. Note that the build log is logged regardless of whether the program built successfully, which allows compiler warnings to be logged for successful compiles.

PreferredWorkGroupSizeMultipleLogging (bool)

If set to a nonzero value, logs the preferred work group size multiple for each kernel after each call to clCreateKernel(). On some devices this is the equivalent of the SIMD size for this kernel.

KernelInfoLogging (bool)

If set to a nonzero value, logs information about the kernel after each call to clCreateKernel().

CallLogging (bool)

If set to a nonzero value, logs function entry and exit information for every OpenCL call. This can be used to easily determine which OpenCL call is causing an application to crash or fail or if a crash occurs outside of an OpenCL call. This setting is best used with LogToFile or LogToDebugger as it can generate a lot of log data.

CallLoggingEnqueueCounter (bool)

If set to a nonzero value, logs the enqueue counter in addition to function entry and exit information for every OpenCL call. This can be used to determine appropriate limits for DumpBuffersMinEnqueue, DumpBuffersMaxEnqueue, DumpImagesMinEnqueue, or DumpImagesMaxEnqueue. If CallLogging is disabled then this control will have no effect.

CallLoggingThreadId (bool)

If set to a nonzero value, logs the ID of the calling thread in addition to function entry and exit information for every OpenCL call. This can be helpful when debugging multi-threading issues.

CallLoggingThreadNumber (bool)

If set to a nonzero value, logs the symbolic number of the calling thread in addition to function entry and exit information for every OpenCL call. This can be helpful when debugging multi-threading issues.

CallLoggingElapsedTime (bool)

If set to a nonzero value, logs the elapsed time in microseconds in addition to function entry and exit information for every OpenCL call, starting from the time the intercept DLL is loaded.
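The call logging controls above are additive, so they can be combined. As a sketch, the following enables call logging with the enqueue counter, thread number, and elapsed time, and sends the log to a file. APP is a placeholder for your OpenCL application and defaults to the no-op "true" command.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# CallLogging must be enabled for the other CallLogging* controls to matter.
export CLI_CallLogging=1
export CLI_CallLoggingEnqueueCounter=1
export CLI_CallLoggingThreadNumber=1
export CLI_CallLoggingElapsedTime=1

# Call logging can generate a lot of data, so log to a file.
export CLI_LogToFile=1

$APP
```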

ITTCallLogging (bool)

If set to a nonzero value, logs function entry and exit information for every OpenCL call using the ITT APIs. This feature will only function if the Intercept Layer for OpenCL Applications is built with ITT support.

ChromeTraceBufferSize (cl_uint)

If set to a nonzero value, buffers JSON records for Chrome Tracing in memory before writing to a file. The buffer will be flushed when it fills, upon application termination, and optionally on blocking OpenCL calls.

ChromeTraceBufferingBlockingCallFlush (bool)

If set to a nonzero value, flushes buffered JSON records for Chrome Tracing after blocking OpenCL calls.

ChromeCallLogging (bool)

If set to a nonzero value, logs function entry and exit information for every OpenCL call to a JSON file that may be used for Chrome Tracing.

ChromeFlowEvents (bool)

If set to a nonzero value, adds flow events between OpenCL calls and OpenCL commands in a JSON file that may be used for Chrome Tracing. Requires both ChromeCallLogging and ChromePerformanceTiming.
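Because ChromeFlowEvents depends on two other controls, it is easy to enable it without its prerequisites. A minimal sketch that sets all three together follows; APP is a placeholder for your OpenCL application and defaults to the no-op "true" command.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# Both prerequisites for flow events:
export CLI_ChromeCallLogging=1
export CLI_ChromePerformanceTiming=1

# Flow events between OpenCL calls and OpenCL commands:
export CLI_ChromeFlowEvents=1

$APP
```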

ErrorLogging (bool)

If set to a nonzero value, logs all OpenCL errors and the function name that caused the error.

ErrorAssert (bool)

If set to a nonzero value, breaks into the debugger when an OpenCL error occurs.

ContextCallbackLogging (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will install a callback for every context and log any calls to the context callback. The application's context callback, if any, will be invoked after the Intercept Layer for OpenCL Applications' context callback.

ContextHintLevel (cl_uint)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will attempt to create contexts with the CL_CONTEXT_SHOW_DIAGNOSTICS_INTEL property set to the specified value. If this property is specified by the application, the Intercept Layer for OpenCL Applications will overwrite it with the specified value, otherwise the property and the specified value will be added to the list of context creation properties. This functionality is only available for OpenCL implementations that support the cl_intel_driver_diagnostics extension. If this functionality is not available in the underlying OpenCL implementation, the unmodified list of context properties will be used to create the context instead. More information about this feature, including valid values and their meaning, can be found in the cl_intel_driver_diagnostics extension specification.

EventCallbackLogging (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will install its own callback for every event callback and log the call to the event callback. The application's event callback will be invoked after the Intercept Layer for OpenCL Applications' event callback.

QueueInfoLogging (bool)

If set to a nonzero value, logs information about a queue when it is created.

EventChecking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will check and log any events in an event wait list that are invalid or in an error state. This can help to debug complex event dependency issues.

LeakChecking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will check for leaks of various OpenCL objects, such as memory objects and events.

USMChecking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will check for incorrect usage of Unified Shared Memory (USM) pointers.

CLInfoLogging (bool)

If set to a nonzero value, logs information about the platforms and devices in the system on the first call to clGetPlatformIDs().

FlushFiles (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will flush files after every write. This reduces performance but can help to avoid truncated files if the Intercept Layer for OpenCL Applications does not exit cleanly.

DumpDir (string)

If set, the Intercept Layer for OpenCL Applications will emit logs and dumps to this directory instead of the default directory. The default log and dump directory is "%SYSTEMDRIVE%\Intel\CLIntercept_Dump\<Process Name>" on Windows and "~/CLIntercept_Dump/<Process Name>" on other operating systems. The log and dump directory must be writeable, otherwise the Intercept Layer for OpenCL Applications will not be able to create or modify log or dump files.
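Since the directory must be writeable, it can help to check it before pointing DumpDir at it. The sketch below uses a hypothetical directory name under the system temporary directory; any writeable directory would do.

```shell
# DUMP_DIR is a hypothetical location chosen for this sketch.
DUMP_DIR="${TMPDIR:-/tmp}/cli_dump_example"
mkdir -p "$DUMP_DIR"

# Only set the control if the directory can actually be written to.
if [ -w "$DUMP_DIR" ]; then
    export CLI_DumpDir="$DUMP_DIR"
    echo "logs and dumps will go to: $CLI_DumpDir"
else
    echo "warning: $DUMP_DIR is not writeable" >&2
fi
```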

AppendPid (bool)

If set, the Intercept Layer for OpenCL Applications will append the process ID to the log directory name.

UniqueFiles (bool)

If set, the Intercept Layer for OpenCL Applications will find a unique file name for logs and reports by appending a number to the file names, if needed.

KernelNameHashTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will append the program and build option hashes to the kernel name in logs and reports.

LongKernelNameCutoff (cl_uint)

If an OpenCL application uses kernels with very long names, the Intercept Layer for OpenCL Applications can substitute a "short" kernel identifier for a "long" kernel name in logs and reports. This control defines how long a kernel name must be (in characters) before it is replaced by a "short" kernel identifier.

DemangleKernelNames (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will track kernel names that are demangled according to C++ ABI rules. This setting requires compiler support for demangling and may not be available in all configurations.

Reporting Controls

ReportToStderr (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will emit reports to stderr.

ReportToFile (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will write results to the file "clintercept_report.txt".

ReportInterval (cl_uint)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will generate a report at regular intervals (based on the enqueue counter). This can be useful to generate report data while a long-running application is executing, or if an application does not exit cleanly.

Performance Timing Controls

HostPerformanceTiming (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will track the minimum, maximum, and average host CPU time for each OpenCL entry point. When the process exits, this information will be included in the file "clintercept_report.txt".

ToolOverheadTiming (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will include some types of tool overhead in timing reports and some types of logging.

DevicePerformanceTiming (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will add event profiling to track the minimum, maximum, and average device time for each OpenCL command. This operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. When the process exits, this information will be included in the file "clintercept_report.txt".

DevicePerformanceTimeKernelInfoTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels using information such as the kernel's Preferred Work Group Size Multiple (AKA SIMD size).

DevicePerformanceTimeGWOTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels with different global work offsets for the purpose of device performance timing.

DevicePerformanceTimeGWSTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels with different global work sizes for the purpose of device performance timing.

DevicePerformanceTimeLWSTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels with different local work sizes for the purpose of device performance timing.

DevicePerformanceTimeSuggestedLWSTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will attempt to query and track the suggested local work size when the passed-in local work size is NULL.

DevicePerformanceTimeTransferTracking (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between transfer operations of different sizes for the purpose of device performance timing.

DevicePerformanceTimingSkipUnmap (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will skip device performance timing for unmap operations. This is a workaround for a bug in some OpenCL implementations, where querying events created from unmap operations results in driver crashes.

HostPerformanceTimingMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only collect host performance timing metrics when the enqueue counter is greater than or equal to this value.

HostPerformanceTimingMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only collect host performance timing metrics when the enqueue counter is less than or equal to this value.

DevicePerformanceTimingMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only collect device performance timing metrics when the enqueue counter is greater than or equal to this value.

DevicePerformanceTimingMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only collect device performance timing metrics when the enqueue counter is less than or equal to this value.
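Taken together, the four MinEnqueue and MaxEnqueue controls define an inclusive window of enqueue counter values. As a sketch, the following restricts both host and device timing to enqueues 100 through 200. The bounds are arbitrary example values, and APP is a placeholder for your OpenCL application that defaults to the no-op "true" command.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# Example window bounds; both ends are included.
MIN_ENQUEUE=100
MAX_ENQUEUE=200

export CLI_HostPerformanceTiming=1
export CLI_HostPerformanceTimingMinEnqueue="$MIN_ENQUEUE"
export CLI_HostPerformanceTimingMaxEnqueue="$MAX_ENQUEUE"

export CLI_DevicePerformanceTiming=1
export CLI_DevicePerformanceTimingMinEnqueue="$MIN_ENQUEUE"
export CLI_DevicePerformanceTimingMaxEnqueue="$MAX_ENQUEUE"

$APP
```

Using the same window for host and device timing is a choice made for this example, not a requirement.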

HostPerformanceTimeLogging (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will log the host elapsed time for each OpenCL entry point. This can be useful to identify OpenCL entry points that execute significantly slower or faster than average on the host.

DevicePerformanceTimeLogging (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will log the device execution time deltas for each OpenCL command. This can be useful to identify specific OpenCL commands that execute significantly slower or faster than average on the device. If DevicePerformanceTiming is disabled then this control will have no effect.

DevicePerformanceTimelineLogging (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will log the device execution times for each OpenCL command. This can be useful to visualize the execution timeline of OpenCL commands that execute on the device. If DevicePerformanceTiming is disabled then this control will have no effect.

DevicePerfCounterLibName (string)

Full path to MDAPI shared library. If not set, the default MDAPI library will be used.

DevicePerfCounterEventBasedSampling (bool)

If set to a nonzero value and DevicePerfCounterCustom is set, the Intercept Layer for OpenCL Applications will enable Intel GPU Performance Counters to track the minimum, maximum, and average performance counter deltas for each OpenCL command. This operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. This feature will only function if the Intercept Layer for OpenCL Applications is built with MDAPI support.

DevicePerfCounterTimeBasedSampling (bool)

If set to a nonzero value and DevicePerfCounterCustom is set, the Intercept Layer for OpenCL Applications will enable Intel GPU Performance Counters to track performance counter deltas at regular time intervals. This operation may be fairly intrusive and may have side effects. This feature will only function if the Intercept Layer for OpenCL Applications is built with MDAPI support.

DevicePerfCounterCustom (string)

If set, the Intercept Layer for OpenCL Applications will collect MDAPI metrics for the Metric Set corresponding to this value for each OpenCL command. Frequently used Metric Sets include: ComputeBasic, ComputeExtended, L3_1, Sampler. The output file has the potential to be very big depending on the work load. This operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. When the process exits, this information will be included in the file "clintercept_perfcounter_dump_<Set Name>.txt". This feature will only function if the Intercept Layer for OpenCL Applications is built with MDAPI support.

DevicePerfCounterFile (string)

Full path to a custom MDAPI file. This can be used to add custom Metric Sets.

DevicePerfCounterTiming (bool)

If set to a nonzero value and DevicePerfCounterEventBasedSampling is set, the Intercept Layer for OpenCL Applications will report the average Intel GPU Performance Counters for each OpenCL command. When the process exits, this information will be included in the file "clintercept_report.txt". This feature will only function if the Intercept Layer for OpenCL Applications is built with MDAPI support.
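The performance counter controls above build on one another: DevicePerfCounterTiming needs DevicePerfCounterEventBasedSampling, which in turn needs DevicePerfCounterCustom. A sketch that sets the whole chain, using the ComputeBasic Metric Set as an example, might look like the following. APP is a placeholder for your OpenCL application and defaults to the no-op "true" command.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# Choose the Metric Set to collect.
export CLI_DevicePerfCounterCustom=ComputeBasic

# Collect counter deltas for each OpenCL command.
export CLI_DevicePerfCounterEventBasedSampling=1

# Include average counter values in the report.
export CLI_DevicePerfCounterTiming=1

$APP
```

None of this has any effect unless the Intercept Layer for OpenCL Applications was built with MDAPI support.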

DevicePerfCounterReportMax (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will also write the maximum values for the target platform to the .csv file of MDAPI counters, as a column next to each metric.

DevicePerfCounterTimeBasedSamplingPeriod (uint32_t)

The sampling period for Intel GPU Performance Counter Time-based Sampling, in microseconds. A smaller sampling period increases overhead and the likelihood of dropped samples but can be more precise. Note that some devices do not support very small sampling periods.

DevicePerfCounterTimeBasedBufferSize (uint32_t)

The buffer size for Intel GPU Performance Counter Time-based Sampling, in bytes. When set to zero, automatically chooses the device maximum buffer size. A larger buffer size will decrease the likelihood of dropped samples.

ITTPerformanceTiming (bool)

[Note: This control makes ITT calls, but they appear to do nothing!] If set to a nonzero value, the Intercept Layer for OpenCL Applications will generate ITT-compatible performance timing data. Similar to DevicePerformanceTiming, this operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. ITTPerformanceTiming will also silently create OpenCL command queues that support advanced performance counters if this functionality is available. This feature will only function if the Intercept Layer for OpenCL Applications is built with ITT support.

ITTShowOnlyExecutingEvents (bool)

[Note: This control makes ITT calls, but they appear to do nothing!] By default, when ITTPerformanceTiming is enabled, the Intercept Layer for OpenCL Applications will generate ITT-compatible information for all states of an OpenCL event: when the command was queued, when it was submitted, when it started executing, and when it finished executing. If ITTShowOnlyExecutingEvents is set to a nonzero value, the Intercept Layer for OpenCL Applications will only generate ITT-compatible instrumentation when an event begins executing and when an event ends executing. Since no information will be displayed about when a command is queued or submitted, this can sometimes make it easier to identify times when the device is idle. This feature will only function if the Intercept Layer for OpenCL Applications is built with ITT support.

ChromePerformanceTiming (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will generate device performance timing information in a JSON file that may be used for Chrome Tracing.

ChromePerformanceTimingInStages (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will separate the performance information placed in the JSON file into Queued, Submitted, and Execution stages. It will also reorder the threads/queues by starting runtime. This flag is only functional when ChromePerformanceTiming is also set.

ChromePerformanceTimingPerKernel (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will organize the performance information placed in the JSON file on a per kernel name basis. It is only functional when ChromePerformanceTiming is also set. When ChromePerformanceTimingInStages is also set, information about event stages will be retained.

ChromePerformanceTimingEstimateQueuedTime (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will unconditionally estimate the queued time for Chrome Tracing rather than computing it using device and host timers and event profiling data. The estimated time is less accurate than the computed time, but may be more reliable if the device and host timers or event profiling data is incorrect or imprecise.

Controls for Dumping and Injecting Programs and Build Options

OmitProgramNumber (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will omit the program number from dumped file names and hash tracking. This can produce deterministic results even if programs are built in a non-deterministic order (say, by multiple threads).

SimpleDumpProgramSource (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump the last string(s) passed to clCreateProgramWithSource() to the file kernel.cl, and the last program options passed to clBuildProgram() to the file kernel.txt. These files will be dumped to the application's working directory. If an application fails to compile a program and exits immediately after detecting the compile failure, SimpleDumpProgramSource may be all that is needed to identify the program and program options that are failing to compile.

DumpProgramSourceScript (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every string passed to clCreateProgramWithSource() to its own file. The directory names and file names for the dumped files match the directory names and file names expected by a modified OpenCL conformance test script to capture kernels. This setting overrides SimpleDumpProgramSource, and if it is set to a nonzero value then the value of SimpleDumpProgramSource is ignored.

DumpProgramSource (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every string passed to clCreateProgramWithSource() to its own file. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_source.cl". Program options will be dumped to the same directory with the file name "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<API>_options.txt", where API is an empty string for clBuildProgram(), "compile" for clCompileProgram(), and "link" for clLinkProgram(). This setting can be used for information purposes to see all kernels that are used by an application or to dump programs for program injection. This setting overrides DumpProgramSourceScript and SimpleDumpProgramSource, and if it is set to a nonzero value then the values of DumpProgramSourceScript and SimpleDumpProgramSource will be ignored.
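One way to collect the dumped files is to point DumpDir at a known directory and then list files by the naming pattern above. The following is a sketch only: APP is a placeholder for your OpenCL application and defaults to the no-op "true" command, in which case nothing is dumped and the listing is empty.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# A scratch directory to receive the dumps.
DUMP_DIR="$(mktemp -d)"
export CLI_DumpDir="$DUMP_DIR"
export CLI_DumpProgramSource=1

$APP

# List any dumped program sources and program options.
find "$DUMP_DIR" \( -name 'CLI_*_source.cl' -o -name 'CLI_*_options.txt' \) -print
```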

DumpInputProgramBinaries (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every program binary that is passed to clCreateProgramWithBinary() to its own file. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Device Type>.bin". This is the input program binary provided by the application, and not a device binary queried from the OpenCL implementation. In particular, note that it may be a SPIR 1.2 binary.

DumpProgramBinaries (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every program binary that was successfully built with clBuildProgram() to its own file. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<Device Type>.bin". Program options will be dumped to the same directory with the file name "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<API>_options.txt", where API is an empty string for clBuildProgram(), "compile" for clCompileProgram(), and "link" for clLinkProgram(). This setting can be used to examine compiled program binaries or to dump program binaries for program binary injection. Note that this option dumps the output binary, which is a device binary, after calling clBuildProgram() or clLinkProgram().

DumpProgramSPIRV (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every program IL binary passed to clCreateProgramWithIL() to its own file. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_0000.spv" (for now, at least). Program options will be dumped to the same directory with the file name "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<API>_options.txt", where <API> is an empty string for clBuildProgram(), "compile" for clCompileProgram(), and "link" for clLinkProgram(). This setting can be used for information purposes to see all kernels that are used by an application or to dump SPIR-V programs for SPIR-V injection.

InjectProgramSource (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified kernel source to clCreateProgramWithSource() and/or potentially modified options to clCompileProgram() or clBuildProgram(). Note that program options currently cannot be injected for clLinkProgram().

InjectProgramBinaries (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified kernel binaries via clCreateProgramWithBinary() in place of program text for each call to clCreateProgramWithSource(). This is typically done to reduce program compilation time or to use known good program binaries.

RejectProgramBinaries (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will reject kernel binaries passed via clCreateProgramWithBinary() and return CL_INVALID_BINARY. This can be used to force an application to re-compile program binaries from source.

InjectProgramSPIRV (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified kernel SPIR-V binaries via clCreateProgramWithIL() in place of program text for each call to clCreateProgramWithSource().

PrependProgramSource (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to prepend kernel code from a file to the application provided kernel source passed to clCreateProgramWithSource(). The Intercept Layer for OpenCL Applications will look for kernel source to prepend in the dump and log directory. The files that are searched for are (in order) "CLI_<Program Number>_<Unique Program Hash Code>_prepend.cl", "CLI_<Unique Program Hash Code>_prepend.cl", and "CLI_prepend.cl".

AppendBuildOptions (string)

If set, the Intercept Layer for OpenCL Applications will add these build options to the end of any application provided or injected build options for each call to clCompileProgram() or clBuildProgram().
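As an illustration, appending the standard OpenCL build option -cl-opt-disable is one way to check whether a problem is related to compiler optimizations, without modifying the application. The choice of option is only an example; APP is a placeholder for your OpenCL application and defaults to the no-op "true" command.

```shell
# APP is a placeholder for your OpenCL application (assumption).
APP="${APP:-true}"

# Appended after any application provided or injected build options.
export CLI_AppendBuildOptions="-cl-opt-disable"

$APP
```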

AppendLinkOptions (string)

If set, the Intercept Layer for OpenCL Applications will add these link options to the end of any application provided or injected link options for each call to clLinkProgram().

DumpProgramBuildLogs (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump build logs for every device a program is built for to a separate file. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<Device Type>_build_log.txt".

DumpKernelISABinaries (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump kernel ISA binaries for every kernel, if supported. Currently, kernel ISA binaries are only supported for Intel GPU devices. Kernel ISA binaries can be decoded into ISA text with a disassembler. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<Device Type>_<Kernel Name>.isabin".

Controls for Emulating Features

Emulate_cl_khr_extended_versioning (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will emulate support for the cl_khr_extended_versioning extension.

Emulate_cl_khr_semaphore (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will emulate support for the cl_khr_semaphore extension.

Emulate_cl_intel_unified_shared_memory (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will emulate support for the cl_intel_unified_shared_memory extension USM APIs using SVM APIs. This can be useful to test USM applications on an implementation that supports SVM, but not USM.

Controls for Automatically Creating SPIR-V Modules

AutoCreateSPIRV (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will automatically create SPIR-V modules by invoking CLANG each time a program is built. The file name will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>.spv". Because invoking CLANG requires a file containing the OpenCL C source, setting this option implicitly sets DumpProgramSource as well. Additionally, this feature is not available for injected program source.

SPIRVClang (string)

The clang executable used to compile an OpenCL C program to a SPIR-V module. This can be an executable in the system path, a relative path, or a full absolute path.

SPIRVCLHeader (string)

The OpenCL header file used to compile an OpenCL C program to a SPIR-V module. This must be a relative path or a full absolute path.

SPIRVDis (string)

The spirv-dis executable used to optionally disassemble the compiled SPIR-V module to a SPIR-V text representation. This can be an executable in the system path, a relative path, or a full absolute path.

DefaultOptions (string)

This is the list of options that is implicitly passed to CLANG to build a non-OpenCL 2.0 SPIR-V module. Any application-provided build options will be appended to these build options.

OpenCL2Options (string)

This is the list of options that is implicitly passed to CLANG to build an OpenCL 2.0 SPIR-V module. Any application-provided build options will be appended to these build options.

Controls for Dumping and Injecting Buffers and Images

DumpBufferHashes (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump hashes of a buffer, SVM, or USM allocation rather than the full contents of the buffer. This can be useful to identify which kernel enqueues generate different results without requiring a large amount of disk space.

DumpImageHashes (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump hashes of an image rather than the full contents of the image. This can be useful to identify which kernel enqueues generate different results without requiring a large amount of disk space.

DumpArgumentsOnSet (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump the argument value on calls to clSetKernelArg(). Arguments are dumped as raw binary data. The file names will have the form "SetKernelArg_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>.bin".

DumpBuffersAfterCreate (bool)

If set, the Intercept Layer for OpenCL Applications will dump buffers to a file after creation. This control still honors the enqueue counter limits, even though no enqueues are involved during buffer creation. Currently only works for cl_mem buffers created from host pointers.

DumpBuffersAfterMap (bool)

If set, the Intercept Layer for OpenCL Applications will dump the contents of a buffer to a file after the buffer is mapped. Only valid if the buffer is NOT mapped with CL_MAP_WRITE_INVALIDATE_REGION. If the buffer was mapped non-blocking, this may insert a clFinish() into the command queue, which may have functional or performance implications.

DumpBuffersBeforeUnmap (bool)

If set, the Intercept Layer for OpenCL Applications will dump the contents of a buffer to a file immediately before the buffer is unmapped. This is done by inserting a blocking clEnqueueMapBuffer() (and matching clEnqueueUnmapMemObject()) into the command queue, which may have functional or performance implications.

DumpBuffersBeforeEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump buffer, SVM, and USM kernel arguments before calls to clEnqueueNDRangeKernel(). Only buffers that are kernel arguments for the kernel being enqueued are dumped. Buffers are dumped as raw binary data to a "memDumpPreEnqueue" subdirectory of the dump directory. The file names will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Buffer_<Unique Memory Object Number>.bin".

DumpBuffersAfterEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump buffer, SVM, and USM kernel arguments after calls to clEnqueueNDRangeKernel(). Only buffers that are kernel arguments for the kernel being enqueued are dumped. Buffers are dumped as raw binary data to a "memDumpPostEnqueue" subdirectory of the dump directory. The file names will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Buffer_<Unique Memory Object Number>.bin". Note that this is the same naming convention as with DumpBuffersBeforeEnqueue, so the changes resulting from an enqueue can be determined by diff'ing the preEnqueue folder with the postEnqueue folder.
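Because both directories use the same file names, the comparison can be scripted. The following is a minimal sketch, assuming both DumpBuffersBeforeEnqueue and DumpBuffersAfterEnqueue were enabled for the same run; the function name `changed_buffer_dumps` is hypothetical and the dump directory path is supplied by the caller.

```python
import filecmp
from pathlib import Path
from typing import List


def changed_buffer_dumps(dump_dir: str) -> List[str]:
    """List buffer dump files whose contents differ before and after an enqueue."""
    pre = Path(dump_dir) / "memDumpPreEnqueue"
    post = Path(dump_dir) / "memDumpPostEnqueue"
    changed = []
    for pre_file in sorted(pre.glob("Enqueue_*.bin")):
        post_file = post / pre_file.name
        # Only compare files that exist in both directories.
        if post_file.is_file() and not filecmp.cmp(pre_file, post_file, shallow=False):
            changed.append(pre_file.name)
    return changed
```

Each returned name identifies an enqueue, kernel, and argument whose buffer contents were modified by that enqueue.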

DumpBuffersForKernel (string)

If set, the Intercept Layer for OpenCL Applications will only dump buffer, SVM, and USM kernel arguments when the specified kernel is enqueued. This control is ignored unless DumpBuffersBeforeEnqueue or DumpBuffersAfterEnqueue is enabled.

DumpImagesBeforeEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump image kernel arguments before calls to clEnqueueNDRangeKernel(). Only images that are kernel arguments for the kernel being enqueued are dumped. Images are dumped as raw binary data to a "memDumpPreEnqueue" subdirectory of the dump directory. The file names will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Image_<Unique Memory Object Number>_<Width>x<Height>x<Depth>_<Element Size>bpp.raw".

DumpImagesAfterEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump image kernel arguments after calls to clEnqueueNDRangeKernel(). Only images that are kernel arguments for the kernel being enqueued are dumped. Images are dumped as raw binary data to a "memDumpPostEnqueue" subdirectory of the dump directory. The file names will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Image_<Unique Memory Object Number>_<Width>x<Height>x<Depth>_<Element Size>bpp.raw". Note that this is the same naming convention as with DumpImagesBeforeEnqueue, so the changes resulting from an enqueue can be determined by diff'ing the preEnqueue folder with the postEnqueue folder.
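Since the raw image files carry no header, the dimensions encoded in the file name are what a viewer or script needs to interpret them. The sketch below parses the documented file name form; the function name `parse_image_dump_name` is hypothetical, and it assumes every numeric field is written in decimal, which is not stated here.

```python
import re
from typing import Dict, Optional, Union

# Assumption: all numeric fields in the file name are decimal digits.
_IMAGE_DUMP = re.compile(
    r"^Enqueue_(?P<enqueue>\d+)_Kernel_(?P<kernel>.+)_Arg_(?P<arg>\d+)"
    r"_Image_(?P<mem>\d+)_(?P<width>\d+)x(?P<height>\d+)x(?P<depth>\d+)"
    r"_(?P<bpp>\d+)bpp\.raw$"
)


def parse_image_dump_name(name: str) -> Optional[Dict[str, Union[int, str]]]:
    """Split an image dump file name into its fields, or return None."""
    match = _IMAGE_DUMP.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    # Everything except the kernel name is numeric.
    return {k: (v if k == "kernel" else int(v)) for k, v in fields.items()}
```

Kernel names may themselves contain underscores, which is why the kernel field is matched up to the "_Arg_" separator rather than up to the next underscore.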

DumpImagesForKernel (string)

If set, the Intercept Layer for OpenCL Applications will only dump image kernel arguments when the specified kernel is enqueued. This control is ignored unless DumpImagesBeforeEnqueue or DumpImagesAfterEnqueue is enabled.

DumpBuffersMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only dump buffer, SVM, and USM kernel arguments when the enqueue counter is greater than or equal to this value.

DumpBuffersMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only dump buffer, SVM, and USM kernel arguments when the enqueue counter is less than or equal to this value.

DumpImagesMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only dump image kernel arguments when the enqueue counter is greater than or equal to this value.

DumpImagesMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only dump image kernel arguments when the enqueue counter is less than or equal to this value.

DumpArgumentsOnSetMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only dump argument values when the enqueue counter is greater than or equal to this value.

DumpArgumentsOnSetMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only dump argument values when the enqueue counter is less than or equal to this value.
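Each Min/Max pair of controls above describes a closed range of enqueue counter values. As a small sketch (the function name `in_enqueue_range` is hypothetical and not part of the Intercept Layer):

```python
def in_enqueue_range(counter: int, min_enqueue: int, max_enqueue: int) -> bool:
    """True when counter lies in the inclusive range [min_enqueue, max_enqueue]."""
    return min_enqueue <= counter <= max_enqueue
```

For example, with DumpBuffersMinEnqueue set to 10 and DumpBuffersMaxEnqueue set to 20, enqueues 10 through 20 are dumped, including both 10 and 20.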

InjectBuffers (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified buffer, SVM, and USM contents before calls to clEnqueueNDRangeKernel(). Only buffers that are kernel arguments for the kernel being enqueued may be injected. The file name to inject will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Buffer_<Unique Memory Object Number>.bin", which matches the file name for dumped buffers.

InjectImages (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified image contents before calls to clEnqueueNDRangeKernel(). Only images that are kernel arguments for the kernel being enqueued may be injected. The file name to inject will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Image_<Unique Memory Object Number>_<Width>x<Height>x<Depth>_<Element Size>bpp.raw", which matches the file name for dumped images.

Device Partitioning Controls

AutoPartitionAllDevices (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will automatically partition parent devices and return all parent devices and all sub-devices.

AutoPartitionAllSubDevices (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will automatically partition parent devices and return all sub-devices, but no parent devices.

AutoPartitionSingleSubDevice (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will automatically partition parent devices and return a single sub-device, but no parent devices or other sub-devices.

AutoPartitionByAffinityDomain (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will try to automatically partition parent devices by the next partitionable affinity domain.

AutoPartitionEqually (cl_uint)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will try to automatically partition parent devices into sub-devices with the specified number of compute units.

Capture and Replay Controls

CaptureReplay (bool)

This is the top-level control for kernel capture and replay.

CaptureReplayMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only enable kernel capture and replay when the enqueue counter is greater than or equal to this value.

CaptureReplayMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will stop kernel capture and replay when the enqueue counter is greater than this value, meaning that only enqueues less than or equal to this value will be captured.

CaptureReplayKernelName (string)

If set, the Intercept Layer for OpenCL Applications will only enable kernel capture and replay when the kernel name equals this name.

CaptureReplayUniqueKernels (bool)

If set, the Intercept Layer for OpenCL Applications will only enable kernel capture and replay if the kernel signature (i.e. hash + kernelname) has not been seen already.

CaptureReplayNumKernelEnqueuesSkip (cl_uint)

The Intercept Layer for OpenCL Applications will skip this many kernel enqueues before enabling kernel capture and replay.

CaptureReplayNumKernelEnqueuesCapture (cl_uint)

The Intercept Layer for OpenCL Applications will only capture this many kernel enqueues.

AubCapture Controls

AubCapture (bool)

This is the top-level control for aub capture. The Intercept Layer for OpenCL Applications doesn't implement aub capture itself, but can be used to selectively enable and disable aub capture via other methods.

AubCaptureKDC (bool)

If set, the Intercept Layer for OpenCL Applications will use the older kdc.exe method of aub capture. By default, the newer NEO method of aub capture will be used. This control is ignored for all non-Windows operating systems.

AubCaptureIndividualEnqueues (bool)

If set, the Intercept Layer for OpenCL Applications will start aub capture before a kernel enqueue, and will also stop aub capture immediately after the kernel enqueue. Each file will have the form "AubCapture_Enqueue_<Enqueue Number>_kernel_<Kernel Name>". Note that non-kernel enqueues such as calls to clEnqueueReadBuffer() and clEnqueueWriteBuffer() will NOT be aub captured when this control is set. The AubCaptureMinEnqueue and AubCaptureMaxEnqueue controls are still honored when AubCaptureIndividualEnqueues is set.

AubCaptureMinEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will only enable aub capture when the enqueue counter is greater than or equal to this value.

AubCaptureMaxEnqueue (cl_uint)

The Intercept Layer for OpenCL Applications will stop aub capture when the enqueue counter is greater than this value, meaning that only enqueues less than or equal to this value will be captured. If the enqueue counter never reaches this value, the Intercept Layer for OpenCL Applications will stop aub capture when it is unloaded.

AubCaptureKernelName (string)

If set, the Intercept Layer for OpenCL Applications will only enable aub capture when the kernel name equals this name.

AubCaptureKernelGWS (string)

If set, the Intercept Layer for OpenCL Applications will only enable aub capture when the NDRange global work size matches this string. The string should have the form "XxYxZ". The wildcard "*" matches all global work sizes.

AubCaptureKernelLWS (string)

If set, the Intercept Layer for OpenCL Applications will only enable aub capture when the NDRange local work size matches this string. The string should have the form "XxYxZ". The wildcard "*" matches all local work sizes, and the string "NULL" matches a NULL local work size.
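The matching rules for AubCaptureKernelGWS and AubCaptureKernelLWS can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the function name `work_size_matches` is hypothetical, the comparison is assumed to be an exact string match against the sizes joined with "x", and the wildcard is assumed to match a NULL local work size as well.

```python
from typing import Optional, Sequence


def work_size_matches(pattern: str, size: Optional[Sequence[int]]) -> bool:
    """Check an "XxYxZ" pattern against a work size; None stands for NULL."""
    # The wildcard matches every work size.
    if pattern == "*":
        return True
    # "NULL" only matches a NULL local work size.
    if size is None:
        return pattern == "NULL"
    # Assumption: other patterns must equal the sizes joined with "x".
    return pattern == "x".join(str(dim) for dim in size)
```

With this sketch, the pattern "8x8x1" matches a work size of (8, 8, 1) but not (16, 8, 1) and not a NULL local work size.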

AubCaptureUniqueKernels (bool)

If set, the Intercept Layer for OpenCL Applications will only enable aub capture if the kernel signature (i.e. hash + kernelname + gws + lws) has not been seen already. The behavior of this control is well-defined when AubCaptureIndividualEnqueues is not set, but it doesn't make much sense without AubCaptureIndividualEnqueues.

AubCaptureNumKernelEnqueuesSkip (cl_uint)

The Intercept Layer for OpenCL Applications will skip this many kernel enqueues before enabling aub capture. The behavior of this control is well-defined when AubCaptureIndividualEnqueues is not set, but it doesn't make much sense without AubCaptureIndividualEnqueues.

AubCaptureNumKernelEnqueuesCapture (cl_uint)

The Intercept Layer for OpenCL Applications will only capture this many kernel enqueues. The behavior of this control is well-defined when AubCaptureIndividualEnqueues is not set, but it doesn't make much sense without AubCaptureIndividualEnqueues.

AubCaptureStartWait (cl_uint)

The Intercept Layer for OpenCL Applications will wait for this many milliseconds before beginning aub capture.

AubCaptureEndWait (cl_uint)

The Intercept Layer for OpenCL Applications will wait for this many milliseconds before ending aub capture.

Execution Controls

NoErrors (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will cause all OpenCL APIs to return a successful error status.

ExitOnEnqueueCount (uint64_t)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will exit the application when the enqueue counter reaches the specified value. This can be useful to debug sporadic issues by exiting an application immediately, without needing to wait for the application to exit normally.

NullContextCallback (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will force the context callback to be NULL. With both context callback logging and NULL context callback set, the context callback will still be logged, but any application context callback will not be called.

FinishAfterEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFinish() after every enqueue. The command queue that the command was just enqueued to is passed to clFinish(). This can be used to debug possible timing or resource management issues and will likely impact performance.

FlushAfterEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFlush() after every enqueue. The command queue that the command was just enqueued to is passed to clFlush(). This can also be used to debug possible timing or resource management issues and is slightly less obtrusive than FinishAfterEnqueue but still will likely impact performance. If both FinishAfterEnqueue and FlushAfterEnqueue are nonzero then the Intercept Layer for OpenCL Applications will only insert a call to clFinish() after every enqueue, because clFinish() implies clFlush().

FlushAfterEnqueueBarrier (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFlush() after every barrier enqueue. The command queue that the command was just enqueued to is passed to clFlush(). This has been useful to debug out-of-order queue issues.

InOrderQueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created in-order. This can be used for performance analysis, but may lead to deadlocks in some cases.

NoProfilingQueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created without event profiling support. This can be used for performance analysis, but may lead to errors if the application requires event profiling.

DummyOutOfOrderQueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will create and destroy a dummy out-of-order queue. This may be useful for performance analysis.

NullEnqueue (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will silently ignore any enqueue. This can be used for performance analysis, but will likely cause errors if the application relies on any sort of information from OpenCL events and should be used carefully.

NullLocalWorkSize (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will force the local work size argument to clEnqueueNDRangeKernel() to be NULL, which causes the OpenCL implementation to pick the local work size. Note that this control takes effect before NullLocalWorkSizeX / NullLocalWorkSizeY / NullLocalWorkSizeZ (see below), so enabling both controls will have the effect of forcing a specific local work size.

NullLocalWorkSizeX (size_t)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect.

NullLocalWorkSizeY (size_t)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect.

NullLocalWorkSizeZ (size_t)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect.
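The rules shared by the three NullLocalWorkSize controls can be summarized in a short sketch. The function name `null_local_work_size` is hypothetical, and treating a zero value in a consulted dimension as "no override" is an assumption, since only nonzero values are described as taking effect.

```python
from typing import Optional, Sequence, Tuple


def null_local_work_size(
    global_size: Sequence[int], x: int = 0, y: int = 0, z: int = 0
) -> Optional[Tuple[int, ...]]:
    """Return the override local work size, or None if it does not take effect."""
    # Only the first len(global_size) controls are consulted:
    # X for 1D, X and Y for 2D, and X, Y, and Z for 3D dispatches.
    local = (x, y, z)[: len(global_size)]
    # Assumption: a zero (unset) control in a consulted dimension disables the override.
    if any(dim == 0 for dim in local):
        return None
    # The override is dropped unless it evenly divides the global work size.
    if any(g % l != 0 for g, l in zip(global_size, local)):
        return None
    return local
```

For a 1D dispatch of 1024 work-items with NullLocalWorkSizeX set to 64, the override applies; for a 1D dispatch of 1000 work-items it does not, because 64 does not evenly divide 1000.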

InitializeBuffers (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will initialize the contents of allocated buffers with zero. Only valid for non-COPY_HOST_PTR and non-USE_HOST_PTR allocations.

DefaultQueuePriorityHint (cl_uint)

If set to a nonzero value, and if no other priority hint is specified by the application, the Intercept Layer for OpenCL Applications will attempt to create a command queue with this priority hint value. Note: HIGH priority is 1, MED priority is 2, and LOW priority is 4.

DefaultQueueThrottleHint (cl_uint)

If set to a nonzero value, and if no other throttle hint is specified by the application, the Intercept Layer for OpenCL Applications will attempt to create a command queue with this throttle hint value. Note: HIGH throttle is 1, MED throttle is 2, and LOW throttle is 4.

RelaxAllocationLimits (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will attempt to relax allocation limits to enable allocations larger than CL_DEVICE_MAX_MEM_ALLOC_SIZE.

Platform and Device Query Overrides

PlatformName (string)

If set to a non-empty value, the clGetPlatformInfo() query for CL_PLATFORM_NAME will return this string instead of the true platform name.

PlatformVendor (string)

If set to a non-empty value, the clGetPlatformInfo() query for CL_PLATFORM_VENDOR will return this string instead of the true platform vendor.

PlatformProfile (string)

If set to a non-empty value, the clGetPlatformInfo() query for CL_PLATFORM_PROFILE will return this string instead of the true platform profile.

PlatformVersion (string)

If set to a non-empty string, the clGetPlatformInfo() query for CL_PLATFORM_VERSION will return this string instead of the true platform version.

DeviceTypeFilter (cl_uint)

Hides all device types that are not in the filter. Note: CL_DEVICE_TYPE_CPU = 2, CL_DEVICE_TYPE_GPU = 4, CL_DEVICE_TYPE_ACCELERATOR = 8, CL_DEVICE_TYPE_CUSTOM = 16.
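Because the device type values are bit flags, a filter that keeps several device types is the bitwise OR of their values. The sketch below uses the numeric values listed above; the function name `is_device_visible` is hypothetical, and the bitwise test is an assumption about how "in the filter" is evaluated.

```python
CL_DEVICE_TYPE_CPU = 2
CL_DEVICE_TYPE_GPU = 4
CL_DEVICE_TYPE_ACCELERATOR = 8
CL_DEVICE_TYPE_CUSTOM = 16


def is_device_visible(device_type: int, type_filter: int) -> bool:
    """A device stays visible when its type shares a bit with the filter."""
    return (device_type & type_filter) != 0


# Keep CPUs and GPUs, hide accelerators and custom devices.
cpu_and_gpu_filter = CL_DEVICE_TYPE_CPU | CL_DEVICE_TYPE_GPU
```

Here `cpu_and_gpu_filter` evaluates to 6, which is the value that would be set for DeviceTypeFilter to keep only CPU and GPU devices.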

DeviceType (cl_uint)

If set to a non-zero value, the clGetDeviceInfo() query for CL_DEVICE_TYPE will return this value instead of the true device type. In addition, calls to clGetDeviceIDs() for this device type will return all devices, not just devices of the requested type. This can be used to enumerate all devices (even CPUs) as GPUs, or vice versa.

DeviceName (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_NAME will return this value instead of the true device name.

DeviceVendor (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_VENDOR will return this value instead of the true device vendor.

DeviceProfile (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_PROFILE will return this value instead of the true device profile.

DeviceVersion (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_VERSION will return this value instead of the true device version.

DeviceCVersion (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_OPENCL_C_VERSION will return this value instead of the true device OpenCL C version.

DeviceExtensions (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_EXTENSIONS will return this value instead of the true device extensions string.

DeviceILVersion (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_IL_VERSION will return this value instead of the true device intermediate language versions.

DeviceVendorID (cl_uint)

If set to a non-zero value, the clGetDeviceInfo() query for CL_DEVICE_VENDOR_ID will return this value instead of the true device vendor ID.

DeviceMaxComputeUnits (cl_uint)

If set to a non-zero value, the clGetDeviceInfo() query for CL_DEVICE_MAX_COMPUTE_UNITS will return this value instead of the true device max compute units.

DevicePreferredVectorWidthChar (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR will return this value instead of the true device preferred vector width.

DevicePreferredVectorWidthShort (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT will return this value instead of the true device preferred vector width.

DevicePreferredVectorWidthInt (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT will return this value instead of the true device preferred vector width.

DevicePreferredVectorWidthLong (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG will return this value instead of the true device preferred vector width.

DevicePreferredVectorWidthHalf (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF will return this value instead of the true device preferred vector width.

DevicePreferredVectorWidthFloat (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT will return this value instead of the true device preferred vector width.

DevicePreferredVectorWidthDouble (cl_uint)

If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE will return this value instead of the true device preferred vector width.

DriverVersion (string)

If set to a non-empty string, the clGetDeviceInfo() query for CL_DRIVER_VERSION will return this value instead of the true driver version.

Precompiled Kernel and Builtin Kernel Override Controls

ForceByteBufferOverrides (bool)

If set to a nonzero value, each of the buffer functions that are overridden (via one or more of the keys below) will use a byte-wise operation to read/write/copy the buffer (default behavior is to try to copy multiple bytes at a time, if possible). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.

OverrideReadBuffer (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueReadBuffer() instead of the implementation's clEnqueueReadBuffer(). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.

OverrideWriteBuffer (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueWriteBuffer() instead of the implementation's clEnqueueWriteBuffer(). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.

OverrideCopyBuffer (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueCopyBuffer() instead of the implementation's clEnqueueCopyBuffer(). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.

OverrideReadImage (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueReadImage() instead of the implementation's clEnqueueReadImage(). Only 2D images are currently supported.

OverrideWriteImage (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueWriteImage() instead of the implementation's clEnqueueWriteImage(). Only 2D images are currently supported.

OverrideCopyImage (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueCopyImage() instead of the implementation's clEnqueueCopyImage(). Only 2D images are currently supported.

OverrideBuiltinKernels (bool)

If set to a nonzero value, the Intercept Layer for OpenCL Applications will use its own version of the built-in OpenCL kernels that may be accessed via clCreateProgramWithBuiltInKernels(). At present, only the VME block_motion_estimate_intel kernel is implemented.


* Other names and brands may be claimed as the property of others.

Copyright (c) 2018-2024, Intel(R) Corporation