Squashed commit of the following:
commit fe18b4a
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Thu Aug 15 18:01:28 2024 +0800

    fix reviews

commit 74d30dc
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Aug 14 14:38:57 2024 +0800

    fix comments

commit e264cc1
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Aug 14 14:37:46 2024 +0800

    address comments

commit 3e3bd51
Merge: 864da64 e02d78b
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Aug 13 00:46:39 2024 -0700

    Merge branch 'llvm' into review/yang/fix_dsan_destruction

commit e02d78b
Merge: e50a4dd c12957b
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 9 15:41:55 2024 +0100

    Merge pull request oneapi-src#1933 from nrspruit/fix_driver_version_check

    [L0] Fix Driver Version check to use extension and tuple check

commit e50a4dd
Merge: 3c12bbc 6b373e3
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 9 14:34:49 2024 +0100

    Merge pull request oneapi-src#1923 from sarnex/buildlog

    [L0] Return the build log on compilation failure

commit 3c12bbc
Merge: 83f7ad9 ac7eb17
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 9 10:51:05 2024 +0100

    Merge pull request oneapi-src#1910 from Bensuo/sync_point

    [CUDA][HIP] Improve command-buffer sync points

commit 83f7ad9
Merge: ab9baf5 8fb6824
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Thu Aug 8 11:11:13 2024 +0100

    Merge pull request oneapi-src#1860 from PietroGhg/pietro/fill

    [NATIVECPU] Fix pointer arithmetic in USMfill

commit ab9baf5
Merge: 1fef4e2 c571ec4
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Thu Aug 8 11:09:15 2024 +0100

    Merge pull request oneapi-src#1911 from ProGTX/peter/xpti-static

    [CUDA] Don't import XPTI symbols in the plugin library

commit 1fef4e2
Merge: 2d3524e ca68aca
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Aug 7 17:46:52 2024 +0200

    Merge pull request oneapi-src#1949 from pbalcer/ci-benches

    add info how to run benchmarks in CI

commit ca68aca
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Aug 7 17:44:52 2024 +0200

    add info how to run benchmarks in CI

commit 2d3524e
Merge: 6b2e678 d6e93fa
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Wed Aug 7 14:23:09 2024 +0100

    Merge pull request oneapi-src#1930 from oneapi-src/benie/no-import-in-pragma-region

    Make pragma region names joined by _

commit 6b2e678
Merge: d8058ed 6e295e1
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Aug 7 13:58:53 2024 +0200

    Merge pull request oneapi-src#1944 from ldorau/CI_Add_possibility_to_start_manually_the_Nightly_GHA_workflow

    [CI] Add possibility to start manually the Nightly GHA workflow

commit d8058ed
Merge: 1445b66 4e4b04c
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Wed Aug 7 12:30:47 2024 +0100

    Merge pull request oneapi-src#1843 from AllanZyne/review/yang/invalid_arguments

    [DeviceSanitizer] Support check invalid kernel argument

commit 1445b66
Merge: a89657c 355c4c3
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Wed Aug 7 12:25:52 2024 +0100

    Merge pull request oneapi-src#1850 from Bensuo/native_enqueue_cosmetic

    Cosmetic tweaks to native enqueue spec

commit a89657c
Merge: 2355a7d be7057c
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Wed Aug 7 12:01:33 2024 +0100

    Merge pull request oneapi-src#1699 from PietroGhg/pietro/usm_fixes

    [NATIVECPU] Implement urUSMGetMemAllocInfo and aligned alloc

commit 2355a7d
Merge: 450be81 b112525
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Aug 7 13:00:51 2024 +0200

    Merge pull request oneapi-src#1945 from pbalcer/suppress-failures

    Suppress e2e test failures in L0 and OpenCL

commit d6e93fa
Author: Kenneth Benzie (Benie) <kenneth.benzie@intel.com>
Date:   Mon Aug 5 08:20:53 2024 -0700

    Make pragma region names joined by _

    On Windows the region name `usm import release (experimental)` causes
    compile errors in certain situations, which look like this:

    ```
    error C7586: a 'import' directive must end with a ';' on the same line
    ```

    This patch replaces spaces with `_` in the region names to avoid this
    compile error.

commit b112525
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Aug 7 10:40:21 2024 +0200

    Suppress e2e test failures in L0 and OpenCL

commit 450be81
Merge: 7f65917 b33c0e7
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Aug 7 09:49:18 2024 +0200

    Merge pull request oneapi-src#1943 from kbenzie/benie/fix-coverity-issues

    Fix various Coverity defects

commit 6e295e1
Author: Lukasz Dorau <lukasz.dorau@intel.com>
Date:   Wed Aug 7 09:02:09 2024 +0200

    [CI] Add possibility to start manually the Nightly GHA workflow

    Add possibility to start manually the Nightly GHA workflow
    in order to check it on demand.

    Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>

commit 864da64
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Aug 6 21:48:22 2024 -0700

    fix test

commit b33c0e7
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 18:07:42 2024 +0100

    Coverity: Fix 14 instances of Resource leak

    Addresses the following defect CIDs; 1594026, 1594028, 1594029, 1594030,
    1594031, 1594032, 1594033, 1594034, 1594035, 1594036, 1594037, 1595372,
    1595373, and 1598546.

commit 9f2166c
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 17:49:38 2024 +0100

    Coverity: Fix 1598473 Resource leak

commit ac7eb17
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Wed Jul 31 12:54:45 2024 +0100

    [CUDA][HIP] Improve command-buffer sync points

    Several improvements to sync-point implementation
    in HIP and CUDA command-buffer adapters with
    additional CTS coverage to back it up.

    * In the CUDA/HIP adapters we assume that there is always
      a return sync-point passed by the user. However, this is not
      required by the UR API, so we should check that
      the return value is non-null before dereferencing.
    * The Fill helper function can implement a fill as several commands
      for certain pattern sizes, and we were creating a sync point for every
      internal command. This is not required: these commands form a linear
      dependency chain, so only the leaf command is required to be a sync
      point for future commands to depend on.
    * Remove `shared_ptr` from `CUgraphNode` objects stored for sync-points.
      `CUgraphNode` is a pointer type, and is managed by the CUDA driver
      runtime rather than us.
    * Simplify handling of return results. We don't always use the helper
      macro for returning the `ur_result_t` value when a function call fails,
      and also often unnecessarily use a variable to store the return code.
    * Use `hipMemcpyDefault` for USM memcopy
    * Remove error from prefetch & advise
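
The first two points above can be illustrated with a minimal sketch. It assumes a simplified command buffer; `CommandBuffer`, `SyncPoint` and `appendFill` are hypothetical stand-ins and not the adapter's actual types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; not the real UR or CUDA/HIP adapter types.
using SyncPoint = std::uint32_t;

struct CommandBuffer {
    std::vector<int> Nodes;       // one entry per recorded internal command
    SyncPoint NextSyncPoint = 0;  // next sync-point id to hand out

    // Records a fill that is split into `NumChunks` internal commands.
    // The chunks form a linear chain, so only the last (leaf) command gets
    // a sync point. `RetSyncPoint` is optional and may be nullptr.
    void appendFill(int NumChunks, SyncPoint *RetSyncPoint) {
        assert(NumChunks > 0);
        for (int I = 0; I < NumChunks; ++I) {
            Nodes.push_back(I);
        }
        SyncPoint Leaf = NextSyncPoint++;
        if (RetSyncPoint != nullptr) {  // check before dereferencing
            *RetSyncPoint = Leaf;
        }
    }
};
```

Calling `appendFill(3, nullptr)` records three internal commands but consumes a single sync-point id, and never writes through the null pointer.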

commit 7f65917
Merge: d2ffcce 8de9747
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Aug 6 17:57:35 2024 +0200

    Merge pull request oneapi-src#1941 from pbalcer/cuda-runner-timeout

    add 1 hour time limit for e2e tests

commit e150934
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 16:29:21 2024 +0100

    Coverity: Fix 1595225 Data race condition

commit d51935e
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 15:43:29 2024 +0100

    Coverity: Fix 1594597 Dereference after null check

commit 8de9747
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Aug 6 15:51:03 2024 +0200

    add 1 hour time limit for e2e tests

commit 132349c
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 15:36:16 2024 +0100

    Coverity: Fix 1595785 Use of auto that causes a copy

commit 7a370a4
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 15:15:19 2024 +0100

    Coverity: Fix 1595594 Copy instead of move

commit ee749e4
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 14:46:03 2024 +0100

    Coverity: Fix 1595568, 1595570 Use of auto that causes a copy

    Use `const auto &` instead of `auto` in the mock parameter struct
    accesses.
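
As an illustration of why this matters, the sketch below counts copies of a parameter struct. `Params`, `Holder` and the two helper functions are hypothetical and only mimic the shape of an accessor that returns a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

int CopyCount = 0;  // incremented by every copy construction of Params

// Hypothetical parameter struct, not taken from the mock headers.
struct Params {
    std::string Name;
    explicit Params(std::string N) : Name(std::move(N)) {}
    Params(const Params &Other) : Name(Other.Name) { ++CopyCount; }
};

struct Holder {
    Params P{std::string("queue")};
    const Params &get() const { return P; }
};

// `auto` deduces a value type, so this copies the whole struct.
std::size_t lengthByValue(const Holder &H) {
    auto P = H.get();
    return P.Name.size();
}

// `const auto &` binds to the existing object; no copy is made.
std::size_t lengthByRef(const Holder &H) {
    const auto &P = H.get();
    return P.Name.size();
}
```

Both functions return the same length, but only `lengthByValue` increments `CopyCount`.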

commit d08fc6a
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 13:39:01 2024 +0100

    Coverity: Fix 1594027 Uncaught exception

    The `UR_CHECK_ERROR()` utility macro in the CUDA adapter calls the
    `checkErrorUR()` utility function, this throws a `ur_result_t` which was
    not being caught.
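
A minimal sketch of the pattern, assuming a check helper that throws the result code. The `Result` enum, `checkError`, `driverCall` and `entryPoint` are hypothetical names used for illustration, not the adapter's code.

```cpp
#include <cassert>

// Hypothetical result enum standing in for `ur_result_t`.
enum Result {
    RESULT_SUCCESS = 0,
    RESULT_ERROR_OUT_OF_RESOURCES = 1,
    RESULT_ERROR_UNKNOWN = 2
};

// Check helper that throws the result code on failure.
inline void checkError(Result R) {
    if (R != RESULT_SUCCESS) {
        throw R;
    }
}

// Hypothetical driver call that can be made to fail on demand.
inline Result driverCall(bool Fail) {
    return Fail ? RESULT_ERROR_OUT_OF_RESOURCES : RESULT_SUCCESS;
}

// Entry point: no exception may escape, so the thrown code is caught and
// converted back into a return value.
Result entryPoint(bool Fail) noexcept {
    try {
        checkError(driverCall(Fail));
    } catch (Result Err) {
        return Err;
    } catch (...) {
        return RESULT_ERROR_UNKNOWN;
    }
    return RESULT_SUCCESS;
}
```

Without the `try`/`catch`, a failing `driverCall` would propagate a thrown enum value out of the entry point, which is the class of defect Coverity reports as an uncaught exception.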

commit 669797f
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Tue Aug 6 12:55:26 2024 +0100

    Coverity: Fix 1574354 Uninitialized scalar field

    Always zero initialize the `ArrayDesc` data member of `SurfaceMem` in
    the CUDA adapter. Simplify other construction logic.

commit d2ffcce
Merge: 9024918 c5d8106
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Aug 6 13:40:15 2024 +0200

    Merge pull request oneapi-src#1913 from igchor/separate_adapter

    [L0 v2] Make L0 v2 implementation a separate adapter

commit 9024918
Merge: 2233030 b93ecbb
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Aug 6 13:37:58 2024 +0200

    Merge pull request oneapi-src#1912 from igchor/latency_tracker_histogram_hdr

    [common] Histogram-based latency tracker

commit 2233030
Merge: 9deaabc b6454e4
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Aug 6 13:36:16 2024 +0200

    Merge pull request oneapi-src#1932 from igchor/raii_l0

    [L0 v2] Add raii wrapper for L0 handles

commit 250f759
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Aug 6 00:16:03 2024 -0700

    add mutex for adapter

commit fbecf2a
Merge: d67cfec c5d2175
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Aug 6 00:02:09 2024 -0700

    Merge branch 'llvm' into review/yang/fix_dsan_destruction

commit d67cfec
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Mon Aug 5 23:52:19 2024 -0700

    update test

commit 982667e
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Mon Aug 5 19:24:03 2024 -0700

    fix repeat hold adapter handle

commit c12957b
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Mon Aug 5 16:37:45 2024 -0700

    [L0] Fix Driver Version check to use extension and tuple check

    - Fixed isDriverVersionNewerOrSimilar to use the new Intel driver
      version string if it exists, and to use a tuple to compare the minimum
      and existing versions.
    - Moved version check within the platform handle.

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
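
A tuple-based version comparison can be sketched as follows. The `DriverVersion` fields and the `isAtLeast` helper are assumptions made for illustration; the commit does not show the actual signature of `isDriverVersionNewerOrSimilar`.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical version triple; the field names are illustrative.
struct DriverVersion {
    std::uint32_t Major;
    std::uint32_t Minor;
    std::uint32_t Build;
};

// True when `Current` is at least `Minimum`. std::tie builds tuples of
// references, and tuple comparison is lexicographic, so Major is compared
// first, then Minor, then Build.
bool isAtLeast(const DriverVersion &Current, const DriverVersion &Minimum) {
    return std::tie(Current.Major, Current.Minor, Current.Build) >=
           std::tie(Minimum.Major, Minimum.Minor, Minimum.Build);
}
```

Comparing tuples avoids the nested `if` chains that field-by-field comparisons tend to need, and it handles cases such as a higher minor version with a lower build number correctly.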

commit 9deaabc
Merge: 84f5e70 ca2916e
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Mon Aug 5 21:02:44 2024 +0100

    Merge pull request oneapi-src#1929 from oneapi-src/revert-1880-l0-native-enqueue

    Revert "[L0] L0 impl for enqueue native command"

commit b6454e4
Author: Igor Chorazewicz <igor.chorazewicz@intel.com>
Date:   Thu Jul 11 19:50:50 2024 +0000

    [L0 v2] Add raii wrapper for L0 handles

    that encapsulates lifetime management logic (including
    support for ownZeHandle).
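
The idea of an RAII wrapper with an ownership flag can be sketched like this. `UniqueHandle`, its `Owned` flag (loosely modelled on the ownZeHandle notion mentioned above) and the fake handle type are hypothetical and do not reproduce the adapter's wrapper.

```cpp
#include <cassert>
#include <utility>

// Generic RAII wrapper sketch. `Handle` is a pointer-like native handle and
// `Destroy` is the function that releases it; both are placeholders.
template <typename Handle, void (*Destroy)(Handle)>
class UniqueHandle {
public:
    UniqueHandle() = default;
    UniqueHandle(Handle Native, bool Owned) : H(Native), Own(Owned) {}
    UniqueHandle(const UniqueHandle &) = delete;
    UniqueHandle &operator=(const UniqueHandle &) = delete;
    UniqueHandle(UniqueHandle &&Other) noexcept
        : H(std::exchange(Other.H, Handle{})),
          Own(std::exchange(Other.Own, false)) {}
    UniqueHandle &operator=(UniqueHandle &&Other) noexcept {
        if (this != &Other) {
            reset();
            H = std::exchange(Other.H, Handle{});
            Own = std::exchange(Other.Own, false);
        }
        return *this;
    }
    ~UniqueHandle() { reset(); }

    Handle get() const { return H; }

    // Destroys the native handle only if this wrapper owns it.
    void reset() {
        if (H != Handle{} && Own) {
            Destroy(H);
        }
        H = Handle{};
        Own = false;
    }

private:
    Handle H{};
    bool Own = false;
};

// Fake handle type and destroy function used only for demonstration.
int DestroyCalls = 0;
void destroyFake(int *) { ++DestroyCalls; }
using FakeHandle = UniqueHandle<int *, destroyFake>;
```

A wrapper constructed with `Owned == false` never calls the destroy function, which models handles whose lifetime is managed by the caller.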

commit b93ecbb
Author: Igor Chorazewicz <igor.chorazewicz@intel.com>
Date:   Thu May 9 02:21:53 2024 +0000

    [common] add latency tracker based on hdr_histogram

    This tracker allows for tracking min, max, mean, stdev and arbitrary percentile values.

    Calling TRACK_SCOPE_LATENCY(name) registers a latency tracker for a given scope.
    All latency measurements are collected to a per-thread histogram instance.
    When the program exits, all per-thread histograms (for the same scope) are
    aggregated into a single histogram and all statistics are printed.
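
The scope-based measurement can be sketched as below. This is a deliberately simplified stand-in: it keeps running min, max and mean rather than an hdr_histogram, and it omits the per-thread instances and the aggregation at exit. `LatencyStats` and `ScopedLatency` are hypothetical names, not what `TRACK_SCOPE_LATENCY` expands to.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Running statistics for one named scope. A histogram would be needed to
// report percentiles; this sketch keeps only min, max and mean.
struct LatencyStats {
    std::uint64_t Count = 0;
    std::uint64_t Min = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t Max = 0;
    std::uint64_t Sum = 0;

    void record(std::uint64_t Nanos) {
        ++Count;
        Min = std::min(Min, Nanos);
        Max = std::max(Max, Nanos);
        Sum += Nanos;
    }

    double mean() const {
        return Count ? static_cast<double>(Sum) / Count : 0.0;
    }
};

// RAII timer: measures from construction to destruction and records the
// elapsed time, in nanoseconds, into the given stats object.
class ScopedLatency {
public:
    explicit ScopedLatency(LatencyStats &Target)
        : Stats(Target), Start(std::chrono::steady_clock::now()) {}
    ~ScopedLatency() {
        auto Elapsed = std::chrono::steady_clock::now() - Start;
        Stats.record(static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(Elapsed)
                .count()));
    }

private:
    LatencyStats &Stats;
    std::chrono::steady_clock::time_point Start;
};
```

Declaring a `ScopedLatency` at the top of a block records one sample when the block exits, including on early return.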

commit c5d8106
Author: Igor Chorazewicz <igor.chorazewicz@intel.com>
Date:   Wed Jul 31 23:41:26 2024 +0000

    [L0 v2] Make L0 v2 implementation a separate adapter

    Initially, L0 v2 adapter was supposed to reside in a separate
    namespace but be a part of legacy L0 adapter (with runtime option
    to switch between executing on legacy or v2). However, this
    turns out to require a lot of changes in the legacy code to
    allow for function dispatching to legacy/v2 implementations of
    queue, event, etc.

    This approach allows us to keep the implementations separate while
    still reusing files when appropriate (e.g. for adapter.cpp or
    platform.cpp).

commit 6b373e3
Author: Sarnie, Nick <nick.sarnie@intel.com>
Date:   Fri Aug 2 08:32:55 2024 -0700

    [L0] Return the build log on compilation failure

    Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>

commit ca2916e
Author: Omar Ahmed <omarpiratee2010@gmail.com>
Date:   Mon Aug 5 15:42:34 2024 +0100

    Revert "[L0] L0 impl for enqueue native command"

commit 84f5e70
Merge: b5cd44c 721d523
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Mon Aug 5 15:58:40 2024 +0200

    Merge pull request oneapi-src#1927 from pbalcer/fix-scorecard

    fix scorecard job

commit 721d523
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Mon Aug 5 15:54:46 2024 +0200

    fix scorecard job

    The scorecard action must run on the official GitHub-hosted
    ubuntu runners...

commit b5cd44c
Merge: a25fc21 a2e35c0
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Mon Aug 5 15:30:00 2024 +0200

    Merge pull request oneapi-src#1922 from lukaszstolarczuk/bump-umf

    Bump UMF version with latest fixes

commit a25fc21
Merge: 65b4922 ae594ba
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Mon Aug 5 15:29:52 2024 +0200

    Merge pull request oneapi-src#1926 from oneapi-src/benie/force-libstdc++

    Add option to force use of libstdc++ on Linux

commit c571ec4
Author: Peter Žužek <peter@codeplay.com>
Date:   Mon Aug 5 14:27:54 2024 +0100

    [CUDA] Don't import XPTI symbols in the plugin library

    The CUDA plugin builds an XPTI file directly.
    By default the symbol visibility in that XPTI file is presumed
    to import symbols, but there are no XPTI symbols being exported,
    since XPTI is not built as a separate library.

    This causes a compilation failure on Windows.
    The fix is to define `XPTI_STATIC_LIBRARY`,
    which changes the visibility of symbols -
    on Windows this means no longer using `dllimport`
    (and neither using `dllexport`).
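
The usual shape of such a visibility macro is sketched below. All `MYLIB_*` names are hypothetical; the real XPTI headers may be organised differently, and only the role of a "static library" define is taken from the commit message.

```cpp
#include <cassert>

// Hypothetical macros; analogous in spirit to defining XPTI_STATIC_LIBRARY.
#define MYLIB_STATIC_LIBRARY 1

#if defined(MYLIB_STATIC_LIBRARY)
// Sources are compiled directly into the consumer: no import or export.
#define MYLIB_API
#elif defined(_WIN32) && defined(MYLIB_BUILDING_DLL)
#define MYLIB_API __declspec(dllexport)
#elif defined(_WIN32)
// Default on Windows: assume the symbol lives in another DLL.
#define MYLIB_API __declspec(dllimport)
#else
#define MYLIB_API __attribute__((visibility("default")))
#endif

// With the static define set, this is an ordinary declaration and the
// definition in the same binary satisfies it.
MYLIB_API int mylibAnswer();
int mylibAnswer() { return 42; }
```

If the static define were missing on Windows, the declaration would be marked `dllimport` while the definition sits in the same binary, which is the kind of mismatch that leads to the failure described above.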

commit 65b4922
Merge: 9b93cb1 bcda0f8
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Mon Aug 5 15:27:16 2024 +0200

    Merge pull request oneapi-src#1921 from pbalcer/switch-runners

    switch ubuntu runners to a shared pool

commit a2e35c0
Author: Łukasz Stolarczuk <lukasz.stolarczuk@intel.com>
Date:   Fri Aug 2 16:57:04 2024 +0200

    Bump UMF version with latest fixes

commit ae594ba
Author: Kenneth Benzie (Benie) <kenneth.benzie@intel.com>
Date:   Mon Aug 5 05:07:46 2024 -0700

    Add option to force use of libstdc++ on Linux

    The UR_FORCE_LIBSTDCXX option, which defaults to OFF, can be used in
    situations where the build is configured to use libc++ but the libstdc++
    ABI is required for stability reasons.

commit bcda0f8
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Aug 2 12:40:51 2024 +0200

    switch ubuntu runners to a shared pool

commit 9b93cb1
Merge: 96ae6b3 d7ea11f
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 2 22:15:59 2024 +0100

    Merge pull request oneapi-src#1812 from nrspruit/fix_l0_program

    Fix L0 Program CTS failures

commit 96ae6b3
Merge: 27135eb 3972690
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 2 18:56:44 2024 +0100

    Merge pull request oneapi-src#1810 from nrspruit/fix_l0_kernel_cts

    [L0] Fix kernel error handling and enumeration checking

commit d7ea11f
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Wed Jul 10 13:28:21 2024 -0700

    Fix return value for multi device

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit 7436827
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Tue Jul 9 17:59:36 2024 -0700

    Fix Native Device Init

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit cd4b111
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Tue Jul 9 17:40:27 2024 -0700

    Fix multi device module/kernel access

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit fa3a6a9
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Tue Jul 2 12:41:00 2024 -0700

    [L0] Fix Get info Binaries And source and handle/pointer checks

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit 64ad451
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Tue Jul 2 11:01:22 2024 -0700

    [L0] Fix program get info

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit 3972690
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Tue Jul 2 09:40:40 2024 -0700

    [L0] Fix kernel error handling and enumeration checking

    - Fixed kernel create to free memory and close with nullptr
    - Fixed argument index checking for kernels and argument size checks
    - UR_KERNEL_INFO_NUM_REGS to be reported same as UR_KERNEL_INFO_NUM_ARGS

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit 27135eb
Merge: a69e1b5 bfc7536
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 2 15:08:27 2024 +0100

    Merge pull request oneapi-src#1896 from omarahmed1111/change-opencl-sampler-info-size

    Map ur_bool_t to cl_bool in sampler getinfo for opencl adapter

commit a69e1b5
Merge: 6539561 b816700
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Fri Aug 2 14:24:18 2024 +0100

    Merge pull request oneapi-src#1906 from nrspruit/flex_gpu_copy_engine

    [L0] Add check for Intel Flex/Arc for disabling use of copy engines.

commit 6539561
Merge: 90b381c d3faf1a
Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Date:   Fri Aug 2 13:28:43 2024 +0100

    Merge pull request oneapi-src#1917 from oneapi-src/benie/mock-init-callbacks-earlier

    Initialize mock callbacks earlier

commit 90b381c
Merge: 4ae5a92 9b16bfc
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Aug 2 14:18:33 2024 +0200

    Merge pull request oneapi-src#1797 from lukaszstolarczuk/update-badges

    Update badges (for active workflows) in README

commit 4ae5a92
Merge: 509035d 728fac6
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Aug 2 12:48:55 2024 +0200

    Merge pull request oneapi-src#1918 from pbalcer/fix-pvc-feature

    update L0 e2e workflow

commit 728fac6
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Jul 26 11:02:57 2024 +0200

    update L0 e2e workflow

    suppressing the latest failing tests

commit 5859e3c
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Fri Aug 2 01:40:59 2024 -0700

    fix crash

commit 509035d
Merge: c1d8162 cb5cb6e
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Aug 2 09:09:30 2024 +0200

    Merge pull request oneapi-src#1883 from aarongreig/aaron/asanObjectLifetimeIssues

    Don't retain device handle references in sanitizer layer.

commit 56ed0b8
Merge: 9e6923f 3e762e0
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Thu Aug 1 23:30:05 2024 -0700

    Merge branch 'llvm' into review/yang/fix_dsan_destruction

commit cb5cb6e
Author: Aaron Greig <aaron.greig@codeplay.com>
Date:   Mon Jul 29 16:09:59 2024 +0100

    Add comment denoting change as a temporary fix.

commit 55539ac
Author: Aaron Greig <aaron.greig@codeplay.com>
Date:   Fri Jul 19 14:29:24 2024 +0100

    Don't retain device handle references in sanitizer layer.

commit c1d8162
Merge: 4f2ce7f 7ce7387
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Aug 2 07:44:20 2024 +0200

    Merge pull request oneapi-src#1920 from zhaomaosu/devsan-add-missing-lib

    [DeviceSanitizer] Add missing required library

commit 7ce7387
Author: Maosu Zhao <maosu.zhao@intel.com>
Date:   Fri Aug 2 11:08:46 2024 +0800

    [DeviceSanitizer] Add missing required library

    Fix syclos post commit failure:
    https://github.com/intel/llvm/actions/runs/10196353773/job/28206962107

commit d3faf1a
Author: Kenneth Benzie (Benie) <kenneth.benzie@intel.com>
Date:   Thu Aug 1 04:35:19 2024 -0700

    Initialize mock callbacks earlier

    Avoid use after static destruction in sycl unittests by moving the
    initialization of `mock::callbacks` from static function scope to static
    global scope.

commit 4f2ce7f
Merge: 90180f4 ae03bf6
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Thu Aug 1 12:25:01 2024 +0200

    Merge pull request oneapi-src#1915 from bratpiorka/rrudnick_umf_rc3

    bump UMF tag to switch to rc3 release

commit ae03bf6
Author: Rafal Rudnicki <rafal.rudnicki@intel.com>
Date:   Thu Aug 1 10:25:15 2024 +0200

    bump UMF tag to switch to rc3 release

commit 90180f4
Merge: c5d2175 1ff321c
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Thu Aug 1 10:30:30 2024 +0200

    Merge pull request oneapi-src#1902 from pbalcer/benchmark-automation-2

    improve benchmarks automation

commit 4e4b04c
Merge: 7b04b92 bc1a28e
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Thu Aug 1 00:09:59 2024 -0700

    Merge branch 'llvm' into review/yang/invalid_arguments

commit 7b04b92
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Thu Aug 1 00:07:03 2024 -0700

    default enable

commit c5d2175
Merge: 99489ad c86beb6
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Wed Jul 31 14:52:26 2024 +0100

    Merge pull request oneapi-src#1882 from przemektmalon/przemek/interop-map-memory

    [Bindless][Exp] Add interop memory mapping to USM.

commit 8fb6824
Merge: a4510ac 99489ad
Author: uwedolinsky <uwe@codeplay.com>
Date:   Wed Jul 31 13:27:42 2024 +0100

    Merge branch 'main' into pietro/fill

commit 99489ad
Merge: 3e762e0 3f13f69
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Wed Jul 31 13:23:29 2024 +0100

    Merge pull request oneapi-src#1880 from hdelan/l0-native-enqueue

    [L0] L0 impl for enqueue native command

commit a4510ac
Merge: 385cd05 3e762e0
Author: Uwe Dolinsky <uwe@codeplay.com>
Date:   Wed Jul 31 12:46:38 2024 +0100

    Merge remote-tracking branch 'upstream/main' into pietro/fill

commit 3e762e0
Merge: c805a71 a2a053d
Author: Omar Ahmed <omar.ahmed@codeplay.com>
Date:   Wed Jul 31 12:26:34 2024 +0100

    Merge pull request oneapi-src#1884 from callumfare/callum/fix_printtrace

    Enable PrintTrace when SYCL UR tracing is enabled

commit 3f13f69
Merge: 716ee15 c805a71
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Wed Jul 31 11:10:25 2024 +0100

    Merge branch 'main' into l0-native-enqueue

commit c805a71
Merge: 24d3e68 f566e5b
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Wed Jul 31 11:48:18 2024 +0200

    Merge pull request oneapi-src#1142 from lukaszstolarczuk/dockers-adapters

    Update and extend dockers

commit c86beb6
Author: Duncan Brawley <duncan.brawley@codeplay.com>
Date:   Tue Jul 30 15:44:27 2024 +0100

    Remove LegacyMessage and small formatting fix

commit b816700
Author: Neil R. Spruit <neil.r.spruit@intel.com>
Date:   Fri Jul 26 10:32:24 2024 -0700

    [L0] Add check for Intel Flex/Arc for disabling use of copy engines.

    Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>

commit bfc7536
Author: omarahmed1111 <omar.ahmed@codeplay.com>
Date:   Thu Jul 25 11:58:18 2024 +0100

    Map ur_bool_t to cl_bool in opencl sampler getinfo

commit 6935b17
Author: Duncan Brawley <duncan.brawley@codeplay.com>
Date:   Tue Jul 30 13:20:36 2024 +0100

    Remove 'interop' keyword

commit b9bd031
Merge: c3baef7 47ab963
Author: Duncan Brawley <duncan.brawley@codeplay.com>
Date:   Tue Jul 30 12:59:42 2024 +0100

    merge 'origin/sycl' into przemek/interop-map-memory

commit a2a053d
Author: Callum Fare <callum@codeplay.com>
Date:   Tue Jul 23 16:30:13 2024 +0100

    Enable PrintTrace when SYCL UR tracing is enabled

commit 716ee15
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Jul 30 11:00:24 2024 +0200

    always execute the command list between ops in native enqueue

commit 1528f4c
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Tue Jul 30 10:42:58 2024 +0200

    fix ordering between operations in native enqueue

commit 1ff321c
Author: Piotr Balcer <piotr.balcer@intel.com>
Date:   Fri Jul 26 14:15:34 2024 +0200

    improve benchmarks automation

    This patch:
     - adds an option to run a benchmark a few times to pick a median value
     - adds a timeout for benchmarks, set at 10 minutes by default.
     - adds an option to filter out benchmarks by name
     - adds an option to pick a specific compiler commit to test with
     - adds more compute benchmarks
     - fixes cudaSift
     - uses upstream Velocity Bench
     - adds a simple summary table with results

commit 352015f
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Mon Jul 29 14:36:11 2024 +0100

    Update comment

    Clarify wording in comment.

commit 071223f
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Mon Jul 29 12:17:59 2024 +0100

    Add extra synchronization

    Enqueue things to L0 before calling queueFinish.

commit 38d10ec
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Thu Jul 25 20:10:59 2024 -0700

    argument index start from 1

commit 5e1195e
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Thu Jul 25 15:05:14 2024 +0100

    Update source/adapters/level_zero/enqueue_native.cpp

    Co-authored-by: Piotr Balcer <piotr.balcer@intel.com>

commit 632ba6b
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Thu Jul 25 13:57:48 2024 +0100

    Update matchfile

commit 5b12e29
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Jul 24 20:05:23 2024 -0700

    change log message

commit ef0e07f
Merge: 1391baa e161516
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Jul 24 19:59:51 2024 -0700

    Merge branch 'llvm' into review/yang/invalid_arguments

commit 6111fb2
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Wed Jul 24 12:46:37 2024 +0100

    For out of order queues call queue finish

    We can't use normal synchronization for out of order queues, so use
    brute force queueFinish.

commit 382325d
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Wed Jul 24 12:43:42 2024 +0100

    Remove comment

commit 245afb3
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Wed Jul 24 12:33:29 2024 +0100

    Update source/adapters/level_zero/enqueue_native.cpp

    Co-authored-by: Piotr Balcer <piotr.balcer@intel.com>

commit 7fbc58b
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Wed Jul 24 11:35:26 2024 +0100

    Remove lock

commit d76742e
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Wed Jul 24 11:33:19 2024 +0100

    Use ScopedCommandList to get thread local CL

    Same as the CUDA implementation. This means that any CommandList
    obtained through urQueueGetNativeHandle will be the same CommandList
    that is synchronized with before the interop func call.

commit 8020612
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Tue Jul 23 11:02:37 2024 +0100

    Add match files

    Add empty match files for level_zero.

commit 7d14d84
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Mon Jul 22 16:46:49 2024 +0100

    Update entry point

    Thanks pbalcer for suggestion.

commit f2afed2
Author: Hugh Delaney <hugh.delaney@codeplay.com>
Date:   Mon Jul 22 14:21:58 2024 +0100

    Try L0 impl for enqueue native command

    Draft impl for discussion.

commit f566e5b
Author: Łukasz Stolarczuk <lukasz.stolarczuk@intel.com>
Date:   Wed Jul 24 11:07:23 2024 +0200

    [CI] Add more docker recipes

    and update the existing ones.

commit 1391baa
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Jul 23 20:27:03 2024 -0700

    default disable

commit 237a4af
Merge: 88f2156 f11caf9
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Jul 23 20:24:40 2024 -0700

    Merge branch 'llvm' into review/yang/invalid_arguments

commit 9e6923f
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Jul 23 19:54:50 2024 -0700

    wip

commit c3baef7
Author: Przemek Malon <przemek.malon@codeplay.com>
Date:   Fri May 31 16:42:51 2024 +0100

    [Bindless][Exp] Add interop memory mapping to USM.

    This patch introduces `urBindlessImagesMapExternalLinearMemoryExp` to
    allow mapping interop memory to USM regions.

commit ae7dea6
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Mon Jul 22 01:40:27 2024 -0700

    using unordered_set

commit 6449148
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Sun Jul 21 22:42:41 2024 -0700

    Add UR_CALL

commit df5fd8b
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Sun Jul 21 22:36:23 2024 -0700

    fix destruction

commit 88f2156
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Fri Jul 19 00:07:31 2024 -0700

    fix crash

commit 0a916a1
Merge: cc40e85 38a575b
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Thu Jul 18 22:27:47 2024 -0700

    Merge branch 'main' into review/yang/invalid_arguments

commit be7057c
Author: PietroGhg <pietro.ghiglio@codeplay.com>
Date:   Mon Jun 3 16:30:29 2024 +0100

    Use pointer metadata

commit be3ed4c
Author: PietroGhg <pietro.ghiglio@codeplay.com>
Date:   Wed May 29 08:28:39 2024 +0100

    Implement urUSMGetMemAllocInfo and aligned alloc

commit cc40e85
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Jul 17 03:30:33 2024 -0700

    fix lit

commit 4949b1a
Merge: 70dc457 6c2329e
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Jul 16 20:03:15 2024 -0700

    Merge branch 'main' into review/yang/invalid_arguments

commit 70dc457
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Tue Jul 16 05:02:29 2024 -0700

    fix build

commit d2e4949
Merge: 5ba3170 7e38af7
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Mon Jul 15 22:58:49 2024 -0700

    Merge branch 'main' into review/yang/invalid_arguments

commit 385cd05
Author: PietroGhg <pietro.ghiglio@codeplay.com>
Date:   Mon Jul 8 13:24:38 2024 +0100

    Fix pointer arithmetic in USMfill

commit 355c4c3
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Wed Jul 10 16:03:47 2024 +0100

    Cosmetic tweaks to native enqueue spec

    Pedantic things I noticed while reading spec.

commit 5ba3170
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Jul 10 01:08:35 2024 -0700

    fix build

commit ee2a5f1
Author: Zhao, Yang2 <yang2.zhao@intel.com>
Date:   Wed Jul 10 01:07:02 2024 -0700

    check invalid arg in kernel

commit 9b16bfc
Author: Łukasz Stolarczuk <lukasz.stolarczuk@intel.com>
Date:   Thu Jun 27 16:44:41 2024 +0200

    Update badges (for active workflows) in README

    E2E workflows now run as part of the "Build and test" workflow.
    Add the other missing workflows, to track whether they are green or not.
AllanZyne committed Aug 26, 2024
1 parent d52c68d commit 9cc252c
Showing 81 changed files with 3,133 additions and 266 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/cmake.yml
@@ -20,6 +20,7 @@ jobs:
compiler: [{c: gcc, cxx: g++}]
libbacktrace: ['-DVAL_USE_LIBBACKTRACE_BACKTRACE=OFF']
pool_tracking: ['-DUMF_ENABLE_POOL_TRACKING=ON', '-DUMF_ENABLE_POOL_TRACKING=OFF']
latency_tracking: ['-DUMF_ENABLE_LATENCY_TRACKING=OFF']
include:
- os: 'ubuntu-22.04'
build_type: Release
@@ -92,6 +93,7 @@ jobs:
-DUR_DPCXX=${{github.workspace}}/dpcpp_compiler/bin/clang++
${{matrix.libbacktrace}}
${{matrix.pool_tracking}}
${{matrix.latency_tracking}}
- name: Configure CMake
if: matrix.os == 'ubuntu-20.04'
@@ -106,6 +108,7 @@ jobs:
-DUR_FORMAT_CPP_STYLE=ON
${{matrix.libbacktrace}}
${{matrix.pool_tracking}}
${{matrix.latency_tracking}}
- name: Generate source from spec, check for uncommitted diff
if: matrix.os == 'ubuntu-22.04'
1 change: 1 addition & 0 deletions .github/workflows/e2e_cuda.yml
@@ -20,3 +20,4 @@ jobs:
prefix: "ext_oneapi_"
config: "--cuda"
unit: "gpu"
extra_lit_flags: "-sv --max-time=3600"
2 changes: 2 additions & 0 deletions .github/workflows/e2e_opencl.yml
@@ -20,3 +20,5 @@ jobs:
prefix: ""
config: ""
unit: "cpu"
xfail: "AOT/double.cpp;AOT/half.cpp;AOT/reqd-sg-size.cpp;Basic/built-ins/marray_geometric.cpp;KernelCompiler/kernel_compiler_spirv.cpp;KernelCompiler/opencl_queries.cpp"
extra_lit_flags: "-sv --max-time=3600"
1 change: 1 addition & 0 deletions .github/workflows/nightly.yml
@@ -1,6 +1,7 @@
name: Nightly

on:
workflow_dispatch:
schedule:
# Run every day at 23:00 UTC
- cron: '0 23 * * *'
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -51,6 +51,7 @@ option(UR_BUILD_ADAPTER_CUDA "Build the CUDA adapter" OFF)
option(UR_BUILD_ADAPTER_HIP "Build the HIP adapter" OFF)
option(UR_BUILD_ADAPTER_NATIVE_CPU "Build the Native-CPU adapter" OFF)
option(UR_BUILD_ADAPTER_ALL "Build all currently supported adapters" OFF)
option(UR_BUILD_ADAPTER_L0_V2 "Build the (experimental) Level-Zero v2 adapter" OFF)
option(UR_BUILD_EXAMPLE_CODEGEN "Build the codegen example." OFF)
option(VAL_USE_LIBBACKTRACE_BACKTRACE "enable libbacktrace validation backtrace for linux" OFF)
option(UR_ENABLE_ASSERTIONS "Enable assertions for all build types" OFF)
16 changes: 16 additions & 0 deletions scripts/benchmarks/README.md
@@ -15,6 +15,22 @@ This will download and build everything in `~/benchmarks_workdir/` using the com

The scripts will try to reuse the files stored in `~/benchmarks_workdir/`, but the benchmarks will be rebuilt every time. To avoid that, use `-no-rebuild` option.

## Running in CI

The benchmark scripts are used in a GitHub Actions workflow, and can be automatically executed on a preconfigured system against any Pull Request.

![compute benchmarks](workflow.png "Compute Benchmarks CI job")

To execute the benchmarks in CI, navigate to the `Actions` tab and then go to the `Compute Benchmarks` action. Here, you will find a list of previous runs and a "Run workflow" button. Upon clicking the button, you will be prompted to fill in a form to customize your benchmark run. The only mandatory field is the `PR number`, which is the identifier for the Pull Request against which you want the benchmarks to run.

You can also include additional benchmark parameters, such as environment variables or filters. For a complete list of options, refer to `$ ./main.py --help`.

Once all the required information is entered, click the "Run workflow" button to initiate a new workflow run. This will execute the benchmarks and then post the results as a comment on the specified Pull Request.

By default, all benchmark runs are compared against `baseline`, which is a well-established set of the latest data.

You must be a member of the `oneapi-src` organization to access these features.

## Requirements

### Python
Binary file added scripts/benchmarks/workflow.png
29 changes: 15 additions & 14 deletions scripts/core/EXP-NATIVE-ENQUEUE.rst
@@ -32,13 +32,13 @@ within the native API through the function passed to
${x}EnqueueNativeCommandExp, the function argument must only use the native
queue accessed through ${x}QueueGetNativeHandle. Use of a native queue that is
not the native queue returned by ${x}QueueGetNativeHandle results in undefined
behaviour.
behavior.

Any args that are needed by the func must be passed through a void* and unpacked
within the func. If ${x}_mem_handle_t arguments are to be used within
pfnNativeEnqueue, they must be accessed using ${x}MemGetNativeHandle.
${x}_mem_handle_t arguments must be packed in the void* argument that will be
used in pfnNativeEnqueue, as well as ${x}EnqueueNativeCommandExp's phMemList
Any args that are needed by the func must be passed through a ``void*`` and unpacked
within the func. If ``${x}_mem_handle_t`` arguments are to be used within
``pfnNativeEnqueue``, they must be accessed using ${x}MemGetNativeHandle.
``${x}_mem_handle_t`` arguments must be packed in the void* argument that will be
used in ``pfnNativeEnqueue``, as well as ${x}EnqueueNativeCommandExp's ``phMemList``
argument.

API
@@ -65,24 +65,25 @@ Functions
Changelog
--------------------------------------------------------------------------------

+-----------+-------------------------+
| Revision | Changes |
+===========+=========================+
| 1.0 | Initial Draft |
+-----------+-------------------------+
| 1.1 | Make `phEvent` optional |
+-----------+-------------------------+
+-----------+---------------------------+
| Revision | Changes |
+===========+===========================+
| 1.0 | Initial Draft |
+-----------+---------------------------+
| 1.1 | Make ``phEvent`` optional |
+-----------+---------------------------+


Support
--------------------------------------------------------------------------------

Adapters which support this experimental feature *must* return true for the new
`${X}_DEVICE_INFO_ENQUEUE_NATIVE_COMMAND_SUPPORT_EXP` device info query.
``${X}_DEVICE_INFO_ENQUEUE_NATIVE_COMMAND_SUPPORT_EXP`` device info query.


Contributors
--------------------------------------------------------------------------------

* Hugh Delaney `hugh.delaney@codeplay.com <hugh.delaney@codeplay.com>`_
* Kenneth Benzie (Benie) `k.benzie@codeplay.com <k.benzie@codeplay.com>`_
* Ewan Crawford `ewan@codeplay.com <ewan@codeplay.com>`_
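
The argument-packing rule in the EXP-NATIVE-ENQUEUE text above (everything the function needs travels through one `void*` and is unpacked inside the function, with memory handles also listed in `phMemList`) can be sketched as follows. This is only an illustration under stated assumptions: every `fake_*` type and function is a hypothetical stand-in, not part of the Unified Runtime API, and the callback shape (queue plus opaque pointer) is assumed for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the UR queue and memory handles.
struct fake_queue {};
using fake_queue_handle_t = fake_queue *;
struct fake_mem { std::vector<int> storage; };
using fake_mem_handle_t = fake_mem *;

// Assumed callback shape for this sketch: the queue plus one opaque pointer.
using fake_native_enqueue_fn = void (*)(fake_queue_handle_t, void *);

// Stand-in for ${x}MemGetNativeHandle: returns the "native" pointer.
int *fake_mem_get_native_handle(fake_mem_handle_t mem) {
    return mem->storage.data();
}

// All arguments the callback needs are packed into a single struct.
struct scale_args {
    fake_mem_handle_t src;
    fake_mem_handle_t dst;
    std::size_t count;
    int factor;
};

// The callback unpacks the void* and reaches memory only through the
// native-handle accessor, as the spec text requires.
void scale_callback(fake_queue_handle_t, void *data) {
    auto *args = static_cast<scale_args *>(data);
    assert(args != nullptr);
    const int *src = fake_mem_get_native_handle(args->src);
    int *dst = fake_mem_get_native_handle(args->dst);
    for (std::size_t i = 0; i < args->count; ++i) {
        dst[i] = src[i] * args->factor;
    }
}

// Stand-in for the enqueue entry point. It runs the callback immediately;
// a real adapter would order it with other work on the queue.
void fake_enqueue_native_command(fake_queue_handle_t queue,
                                 fake_native_enqueue_fn fn, void *data,
                                 std::uint32_t num_mems,
                                 const fake_mem_handle_t *mem_list) {
    (void)num_mems;  // a real implementation would track these handles
    (void)mem_list;
    fn(queue, data);
}

std::vector<int> run_scale_example() {
    fake_queue queue;
    fake_mem src{{1, 2, 3}};
    fake_mem dst{{0, 0, 0}};
    scale_args args{&src, &dst, 3, 10};
    // The same handles appear both inside the packed struct and in the list.
    fake_mem_handle_t mems[] = {&src, &dst};
    fake_enqueue_native_command(&queue, scale_callback, &args, 2, mems);
    return dst.storage;
}
```

Packing the handles twice looks redundant, but the two copies serve different readers: the struct is for the callback, while the list is what the runtime itself can see.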
2 changes: 1 addition & 1 deletion scripts/generate_code.py
@@ -465,7 +465,7 @@ def generate_level_zero_queue_api(path, section, namespace, tags, version, specs

name = "queue_api"
filename = "queue_api.cpp"
layer_dstpath = os.path.join(path, "adapters/level_zero")
layer_dstpath = os.path.join(path, "adapters", "level_zero", "v2")
os.makedirs(layer_dstpath, exist_ok=True)
fout = os.path.join(layer_dstpath, filename)

85 changes: 78 additions & 7 deletions source/adapters/level_zero/CMakeLists.txt
@@ -113,10 +113,6 @@ add_ur_adapter(${TARGET_NAME}
${CMAKE_CURRENT_SOURCE_DIR}/queue_api.hpp
${CMAKE_CURRENT_SOURCE_DIR}/queue.hpp
${CMAKE_CURRENT_SOURCE_DIR}/sampler.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_immediate_in_order.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_factory.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/context.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/command_list_cache.hpp
${CMAKE_CURRENT_SOURCE_DIR}/ur_level_zero.cpp
${CMAKE_CURRENT_SOURCE_DIR}/common.cpp
${CMAKE_CURRENT_SOURCE_DIR}/context.cpp
@@ -136,9 +132,6 @@ add_ur_adapter(${TARGET_NAME}
${CMAKE_CURRENT_SOURCE_DIR}/sampler.cpp
${CMAKE_CURRENT_SOURCE_DIR}/image.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../ur/ur.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_immediate_in_order.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/context.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/command_list_cache.cpp
)

if(NOT WIN32)
@@ -175,3 +168,81 @@ target_include_directories(${TARGET_NAME} PRIVATE
"${CMAKE_CURRENT_SOURCE_DIR}/../../"
LevelZeroLoader-Headers
)

if(UR_BUILD_ADAPTER_L0_V2)
add_ur_adapter(ur_adapter_level_zero_v2
SHARED
# sources shared with legacy adapter
${CMAKE_CURRENT_SOURCE_DIR}/adapter.hpp
${CMAKE_CURRENT_SOURCE_DIR}/common.hpp
${CMAKE_CURRENT_SOURCE_DIR}/device.hpp
${CMAKE_CURRENT_SOURCE_DIR}/platform.hpp
${CMAKE_CURRENT_SOURCE_DIR}/adapter.cpp
${CMAKE_CURRENT_SOURCE_DIR}/common.cpp
${CMAKE_CURRENT_SOURCE_DIR}/device.cpp
${CMAKE_CURRENT_SOURCE_DIR}/ur_interface_loader.cpp
${CMAKE_CURRENT_SOURCE_DIR}/platform.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../ur/ur.cpp
# v2-only sources
${CMAKE_CURRENT_SOURCE_DIR}/v2/command_list_cache.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/context.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_pool_cache.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_pool.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_provider_counter.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_provider_normal.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_provider.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_api.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_immediate_in_order.hpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/api.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/command_list_cache.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/context.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_pool_cache.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_pool.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_provider_counter.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event_provider_normal.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/event.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_api.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_create.cpp
${CMAKE_CURRENT_SOURCE_DIR}/v2/queue_immediate_in_order.cpp
)

# api.cpp contains NOT_SUPPORTED functions-only
set_source_files_properties(${CMAKE_CURRENT_SOURCE_DIR}/v2/api.cpp
PROPERTIES APPEND_STRING PROPERTY COMPILE_FLAGS "-Wno-unused-parameter")

if(NOT WIN32)
target_sources(ur_adapter_level_zero_v2
PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/adapter_lib_init_linux.cpp
)
endif()

# TODO: fix level_zero adapter conversion warnings
target_compile_options(ur_adapter_level_zero_v2 PRIVATE
$<$<CXX_COMPILER_ID:MSVC>:/wd4805 /wd4244>
)

set_target_properties(ur_adapter_level_zero_v2 PROPERTIES
VERSION "${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR}.${PROJECT_VERSION_PATCH}"
SOVERSION "${PROJECT_VERSION_MAJOR}"
)

if (WIN32)
# 0x800: Search for the DLL only in the System32 folder
target_link_options(ur_adapter_level_zero_v2 PUBLIC /DEPENDENTLOADFLAG:0x800)
endif()

target_link_libraries(ur_adapter_level_zero_v2 PRIVATE
${PROJECT_NAME}::headers
${PROJECT_NAME}::common
${PROJECT_NAME}::umf
LevelZeroLoader
LevelZeroLoader-Headers
)

target_include_directories(ur_adapter_level_zero_v2 PRIVATE
"${CMAKE_CURRENT_SOURCE_DIR}/../.."
LevelZeroLoader-Headers
)
endif()
4 changes: 1 addition & 3 deletions source/adapters/level_zero/context.cpp
@@ -18,8 +18,6 @@
#include "queue.hpp"
#include "ur_level_zero.hpp"

#include "v2/context.hpp"

UR_APIEXPORT ur_result_t UR_APICALL urContextCreate(
uint32_t DeviceCount, ///< [in] the number of devices given in phDevices
const ur_device_handle_t
@@ -38,7 +36,7 @@ UR_APIEXPORT ur_result_t UR_APICALL urContextCreate(
ZE2UR_CALL(zeContextCreate, (Platform->ZeDriver, &ContextDesc, &ZeContext));
try {
ur_context_handle_t_ *Context =
new v2::ur_context_handle_t_(ZeContext, DeviceCount, Devices, true);
new ur_context_handle_t_(ZeContext, DeviceCount, Devices, true);

Context->initialize();
*RetContext = reinterpret_cast<ur_context_handle_t>(Context);
8 changes: 0 additions & 8 deletions source/adapters/level_zero/queue.cpp
@@ -24,8 +24,6 @@
#include "ur_util.hpp"
#include "ze_api.h"

#include "v2/queue_factory.hpp"

// Hard limit for the event completion batches.
static const uint64_t CompletionBatchesMax = [] {
// Default value chosen empirically to maximize the number of asynchronous
@@ -501,12 +499,6 @@ UR_APIEXPORT ur_result_t UR_APICALL urQueueCreate(

UR_ASSERT(Context->isValidDevice(Device), UR_RESULT_ERROR_INVALID_DEVICE);

// optimized path for immediate, in-order command lists
if (v2::shouldUseQueueV2(Device, Flags)) {
*Queue = v2::createQueue(Context, Device, Props);
return UR_RESULT_SUCCESS;
}

// Create placeholder queues in the compute queue group.
// Actual L0 queues will be created at first use.
std::vector<ze_command_queue_handle_t> ZeComputeCommandQueues(
9 changes: 5 additions & 4 deletions source/adapters/level_zero/v2/README.md
@@ -2,13 +2,14 @@

This is the home directory for L0 v2 adapter sources. This is a redesigned version of the L0 adapter that focuses on maximizing the performance of each queue mode individually (immediate/batched, in-order/out-of-order).

L0 v2 adapter can be enabled by setting `UR_L0_USE_QUEUE_V2=1` env variable. If the variable is not set, legacy path will be used.
L0 v2 adapter can be enabled by passing the `UR_BUILD_ADAPTER_L0_V2=1` option to cmake. When enabled, `libur_adapter_level_zero_v2.[so|dll]` will be created.

# Code structure

v2 adapter only rewrites certain functions (mostly urEnqueue* functions) while reusing the rest. `ur_queue_handle_t` has become an abstract class and each enqueue function a virtual function.
The v2 adapter is a standalone adapter but reuses some logic from the legacy L0 adapter implementation, most notably adapter.cpp, platform.cpp, and device.cpp.

Legacy enqueue path is implemented in `ur_queue_handle_legacy_t` which inherits from `ur_queue_handle_t`. For new, optimized path, each queue mode will be implemented as a separate queue class (e.g. `v2::ur_queue_immediate_in_order_t`) inheriting from `ur_queue_handle_t`.
Each queue mode will be implemented as a separate queue class (e.g. `v2::ur_queue_immediate_in_order_t`) inheriting from `ur_queue_handle_t` which is an abstract class
in v2 adapter.

`ur_queue_handle_t` is auto-generated by `make generate-code` - for every API function that accepts `ur_queue_handle_t` as a first parameter, new pure virtual method is created. The API function is then
auto-implemented (see ../queue_api.cpp) by dispatching to that virtual method. Developer is only responsible for implementing that virtual function for every queue base class.
auto-implemented (see ./queue_api.cpp) by dispatching to that virtual method. Developer is only responsible for implementing that virtual function for every queue base class.
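
The dispatch scheme described in this README (an abstract queue handle, one pure virtual method per queue API function, and generated entry points that forward to it) might look roughly like the sketch below. All `demo_*` names are hypothetical and heavily simplified; the real generated code lives in `queue_api.hpp` / `queue_api.cpp` and uses the actual UR signatures and result codes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated abstract queue type: one pure
// virtual method for every API function that takes a queue first.
struct demo_queue_handle_t_ {
    virtual ~demo_queue_handle_t_() = default;
    virtual int queueFinish() = 0;
    virtual int enqueueBarrier() = 0;
};
using demo_queue_handle_t = demo_queue_handle_t_ *;

// Stand-ins for the auto-implemented API entry points: each one only
// forwards to the matching virtual method on the handle.
int demoQueueFinish(demo_queue_handle_t hQueue) {
    assert(hQueue != nullptr);
    return hQueue->queueFinish();
}

int demoEnqueueBarrier(demo_queue_handle_t hQueue) {
    assert(hQueue != nullptr);
    return hQueue->enqueueBarrier();
}

// One concrete class per queue mode; this is the part a developer writes.
// It records calls so the dispatch can be observed.
struct demo_queue_immediate_in_order_t : demo_queue_handle_t_ {
    std::vector<std::string> log;

    int queueFinish() override {
        log.push_back("finish");
        return 0;  // 0 stands in for a success result code
    }

    int enqueueBarrier() override {
        log.push_back("barrier");
        return 0;
    }
};

std::vector<std::string> run_dispatch_example() {
    demo_queue_immediate_in_order_t queue;
    demo_queue_handle_t handle = &queue;  // callers only see the base handle
    demoEnqueueBarrier(handle);
    demoQueueFinish(handle);
    return queue.log;
}
```

Because the entry points are generated, adding a queue mode under this scheme only means adding another subclass; no API function has to be touched by hand.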