[SYCL][Graph] 3D kernel update regression test #373

Closed
55 commits
5b52793
[SYCL] Fix XPTI FW lib path for test-e2e on Win (#13986)
KseniyaTikhomirova Jun 4, 2024
18c4fb2
[SYCL] [NATIVECPU] Report correct memory order capabilities for Nativ…
PietroGhg Jun 4, 2024
dce651b
[SYCL][COMPAT] Migrate bug fixes & refactor of get_*version APIs (#14…
joeatodd Jun 5, 2024
20991b1
[UR] Bump CUDA tag to 0e38fda0 (#14030)
kbenzie Jun 5, 2024
2838f40
[SYCL][Graph][L0] Correctly report when device supports update (#13987)
EwanC Jun 5, 2024
493e78b
[SYCL][Graph] Fix PI Kernel leak in graph update (#14029)
EwanC Jun 5, 2024
0cec128
[UR] Bump HIP tag to 399430da (#14037)
kbenzie Jun 5, 2024
f665968
[CI][CUDA] Uplift docker to use cuda 12.5 image. (#14049)
JackAKirk Jun 5, 2024
643d21f
[ESIMD] Allow full autodeduction for block_load/block_store and slm_b…
fineg74 Jun 5, 2024
7ff1a29
[SYCL] Allow specifying -foffload-lto with the new offload driver and…
sarnex Jun 5, 2024
fbeb7d4
[GHA] Uplift Linux IGC Dev RT version to igc-dev-8b999ec (#13963)
bb-sycl Jun 5, 2024
aa92b24
[SYCL][E2E] Disable flaky profiling_queue.cpp test on CUDA (#14054)
sarnex Jun 5, 2024
25f8a7c
[test-e2e][cuda] Fully qualify `sycl::sub_group` to avoid namespace a…
JackAKirk Jun 5, 2024
2dc80c3
[ESIMD] Allow full autodeduction for slm_atomic_update and atomic_upd…
fineg74 Jun 5, 2024
c7d627f
[ESIMD] Allow full autodeduction for prefetch APIs accepting simd_vie…
fineg74 Jun 5, 2024
037c67f
[SYCL] Disable flaky test EnqueueNoMemObjTwoHostTasks on Windows (#14…
sarnex Jun 5, 2024
89225ce
[SYCL][Joint Matrix] Test combinations are queried Part 3 (#13991)
artemrad Jun 6, 2024
ad84669
[SYCL][Joint Matrix] Test combinations are queried Part 4 (#14019)
artemrad Jun 6, 2024
09c9384
[SYCL][Graph] Clarify graph enable_profiling property in finalize() (…
guoyejun Jun 6, 2024
97ed50d
[SYCL] Force-emit more member functions into device code (#13985)
AlexeySachkov Jun 6, 2024
71a5e37
[SYCL][COMPAT] Added filter_device and list_devices (#14016)
Alcpz Jun 6, 2024
0224335
[SYCL][COMPAT] Add wait_and_free plus rename async_free in syclcompat…
AidanBeltonS Jun 6, 2024
2de1435
[SYCL][TEST-E2E] Disallow `dep_events.cpp` test built for CUDA backen…
mmoadeli Jun 6, 2024
f4829ab
[CI] pre-commit/aws pointed back to old image. (#14074)
JackAKirk Jun 6, 2024
33ea75e
[SYCL][TEST-E2E] Refactor the test to address Windows not printing th…
mmoadeli Jun 6, 2024
a12de3b
[SYCL][TEST-E2E] Extend `sycl-ls-gpu-default.cpp` test to cover Inte…
mmoadeli Jun 6, 2024
463f00c
[SYCL][E2E] Refactor/fix bfloat16 test (#14062)
aelovikov-intel Jun 6, 2024
0f796bc
[SYCL][E2E] Disable NonUniformGroups/ballot_group_algorithms.cpp on C…
aelovikov-intel Jun 6, 2024
0cbc9a0
[SYCL][E2E] Remove warnings in Basic e2e tests (#13994)
ayylol Jun 6, 2024
b8693eb
[Doc] Add Mar'24 Release Notes (#13879)
uditagarwal97 Jun 6, 2024
d66106c
[ESIMD] Allow full autodeduction of template parameters for atomic_up…
fineg74 Jun 6, 2024
353cc51
[ESIMD] Allow full autodeduction of template parameters for atomic_up…
fineg74 Jun 6, 2024
f8552d4
[SYCL][E2E] Disable memory_management_test3.cpp on Gen12 linux (#14087)
sarnex Jun 6, 2024
d48c371
[SYCL][Matrix] Amend CODEOWNERS for check_device_code matrix tests (#…
ianayl Jun 6, 2024
e51a90a
[CI] Don't run E2E tests on self-hosted CUDA in Nightly (#14041)
aelovikov-intel Jun 6, 2024
b925bd8
[SYCL] Add `vec<bfloat16>` support to math builtins (#14002)
uditagarwal97 Jun 7, 2024
3d9ded6
[SYCL] Change check_device_code CUDA tests to use SYCL_EXTERNAL (#13943)
ianayl Jun 7, 2024
4c2cbc5
[SYCL][Bindless] Enable non-Vulkan tests on Windows (#14045)
ProGTX Jun 7, 2024
27bd7ae
[SYCL][HIP] Remove unsupported from O0 tests on AMD (#13967)
npmiller Jun 7, 2024
cad941f
[NFCI][SYCL] Move SYCL Module Splitting to library. Part 2 (#13282)
maksimsab Jun 7, 2024
572aa5c
[SYCL][COMPAT] Add math `extend_v*2` to SYCLCompat (#13953)
OuadiElfarouki Jun 7, 2024
990b1d1
[ESIMD]Replace use of vc intrinsic with spirv extension for rdtsc API…
fineg74 Jun 7, 2024
141d723
[SYCL][Docs] Move sycl_ext_oneapi_enqueue_functions to experimental (…
steffenlarsen Jun 7, 2024
cadd800
[SYCL][ESIMD][E2E] Fix bit shift vector test to not use c++20 (#14081)
sarnex Jun 7, 2024
6d591f1
[SYCL][ESIMD] Instruction count performance test (#14033)
jasonlizhengjian Jun 7, 2024
4e36825
[SYCL] Record aspect names when computing device requirements (#13974)
jzc Jun 7, 2024
25bfb0b
[SYCL][Doc] Extension spec for "work_group_memory" (#13725)
gmlueck Jun 7, 2024
4fcc744
[SYCL][E2E] Remove uses of OpenCL primitives in Basic/image e2e tests…
ayylol Jun 7, 2024
0e587ed
[SYCL] Fix post-commit issue with library dependencies (#14094)
AlexeySachkov Jun 7, 2024
4222b4c
[SYCL] Restrict `sycl::vec` and swizzle operations to types mentioned…
uditagarwal97 Jun 7, 2024
2e1f14a
[SYCL] Fix UB and alignment issues in the SYCL default sorter (#13975)
againull Jun 8, 2024
51a061d
[SYCL][E2E] Remove use of deprecated exceptions in USM e2e tests (#14…
ayylol Jun 8, 2024
d6a18e1
[SYCL][COMPAT] Fix memory_management_test3 (#14080)
AidanBeltonS Jun 10, 2024
a35f862
[SYCL][Graph] fix the address pointer in graph print (#13595)
guoyejun Jun 10, 2024
4e7f2dc
[SYCL][Graph] 3D kernel update regression test
EwanC May 30, 2024
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -126,6 +126,7 @@ sycl/test-e2e/KernelFusion @intel/dpcpp-kernel-fusion-reviewers
sycl/include/sycl/ext/oneapi/matrix/ @intel/sycl-matrix-reviewers
sycl/test-e2e/Matrix @intel/sycl-matrix-reviewers
sycl/test/matrix @intel/sycl-matrix-reviewers
+sycl/test/check_device_code/matrix @intel/sycl-matrix-reviewers

# Native CPU
llvm/**/*SYCLNativeCPU* @intel/dpcpp-nativecpu-pi-reviewers
2 changes: 1 addition & 1 deletion .github/workflows/sycl-linux-precommit-aws.yml
@@ -64,7 +64,7 @@ jobs:
with:
name: CUDA E2E
runner: '["aws_cuda-${{ github.event.workflow_run.id }}-${{ github.event.workflow_run.run_attempt }}"]'
-image: ghcr.io/intel/llvm/ubuntu2204_build:latest
+image: ghcr.io/intel/llvm/ubuntu2204_build:latest-0300ac924620a51f76c4929794637b82790f12ab
image_options: -u 1001 --gpus all --cap-add SYS_ADMIN --env NVIDIA_DISABLE_REQUIRE=1
target_devices: ext_oneapi_cuda:gpu
# No idea why but that seems to work and be in sync with the main
3 changes: 2 additions & 1 deletion .github/workflows/sycl-linux-precommit.yml
@@ -46,6 +46,7 @@ jobs:
build_artifact_suffix: "default"
build_cache_suffix: "default"
changes: ${{ needs.detect_changes.outputs.filters }}
+build_image: "ghcr.io/intel/llvm/ubuntu2204_build:latest-0300ac924620a51f76c4929794637b82790f12ab"

determine_arc_tests:
name: Decide which Arc tests to run
@@ -77,7 +78,7 @@ jobs:
include:
- name: AMD/HIP
runner: '["Linux", "amdgpu"]'
-image: ghcr.io/intel/llvm/ubuntu2204_build:latest
+image: ghcr.io/intel/llvm/ubuntu2204_build:latest-0300ac924620a51f76c4929794637b82790f12ab
image_options: -u 1001 --device=/dev/dri --device=/dev/kfd
target_devices: ext_oneapi_hip:gpu
- name: Intel
7 changes: 0 additions & 7 deletions .github/workflows/sycl-nightly.yml
@@ -74,13 +74,6 @@ jobs:
target_devices: opencl:cpu
tests_selector: e2e

-- name: Self-hosted CUDA
-runner: '["Linux", "cuda"]'
-image: ghcr.io/intel/llvm/ubuntu2204_build:latest
-image_options: -u 1001 --gpus all --cap-add SYS_ADMIN
-target_devices: ext_oneapi_cuda:gpu
-tests_selector: e2e
-
- name: SYCL-CTS on OCL CPU
runner: '["Linux", "gen12"]'
image: ghcr.io/intel/llvm/ubuntu2204_intel_drivers:latest
8 changes: 4 additions & 4 deletions clang/lib/AST/ASTContext.cpp
@@ -12035,10 +12035,10 @@ bool ASTContext::DeclMustBeEmitted(const Decl *D) {
// or `indirectly_callable' attribute must be emitted regardless of number
// of actual uses
if (LangOpts.SYCLIsDevice && isa<CXXMethodDecl>(D)) {
-if (auto *A = D->getAttr<SYCLDeviceIndirectlyCallableAttr>())
-return !A->isImplicit();
-if (auto *A = D->getAttr<SYCLDeviceAttr>())
-return !A->isImplicit();
+if (D->hasAttr<SYCLDeviceIndirectlyCallableAttr>())
+return true;
+if (D->hasAttr<SYCLDeviceAttr>())
+return true;
}

GVALinkage Linkage = GetGVALinkageForFunction(FD);
12 changes: 9 additions & 3 deletions clang/lib/Driver/ToolChains/Clang.cpp
@@ -5843,10 +5843,15 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
CmdArgs.push_back("-emit-llvm-uselists");

if (IsUsingLTO) {
+bool IsUsingOffloadNewDriver =
+Args.hasFlag(options::OPT_offload_new_driver,
+options::OPT_no_offload_new_driver, false);
+bool IsSYCLLTOSupported = JA.isDeviceOffloading(Action::OFK_SYCL) &&
+Triple.isSPIROrSPIRV() &&
+IsUsingOffloadNewDriver;
if (IsDeviceOffloadAction && !JA.isDeviceOffloading(Action::OFK_OpenMP) &&
-!Args.hasFlag(options::OPT_offload_new_driver,
-options::OPT_no_offload_new_driver, false) &&
-!Triple.isAMDGPU()) {
+!IsUsingOffloadNewDriver && !Triple.isAMDGPU() &&
+!IsSYCLLTOSupported) {
D.Diag(diag::err_drv_unsupported_opt_for_target)
<< Args.getLastArg(options::OPT_foffload_lto,
options::OPT_foffload_lto_EQ)
@@ -10423,6 +10428,7 @@ static void getOtherSPIRVTransOpts(Compilation &C,
",+SPV_INTEL_fpga_invocation_pipelining_attributes"
",+SPV_INTEL_fpga_latency_control"
",+SPV_INTEL_task_sequence"
+",+SPV_KHR_shader_clock"
",+SPV_INTEL_bindless_images";
ExtArg = ExtArg + DefaultExtArg + INTELExtArg;
if (C.getDriver().IsFPGAHWMode())
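For readers following the `-foffload-lto` hunk in `Clang::ConstructJob`, here is a minimal standalone sketch of when the option is rejected after this change. It is not the driver code: `OffloadJob` and `rejectsOffloadLTO` are hypothetical names introduced for illustration, and each field merely stands in for the driver query named in its comment.

```cpp
// Hypothetical stand-in for the driver state consulted by the condition.
struct OffloadJob {
  bool IsDeviceOffloadAction; // the job is a device offload action
  bool IsOpenMP;              // JA.isDeviceOffloading(Action::OFK_OpenMP)
  bool IsSYCL;                // JA.isDeviceOffloading(Action::OFK_SYCL)
  bool IsSPIROrSPIRV;         // Triple.isSPIROrSPIRV()
  bool IsAMDGPU;              // Triple.isAMDGPU()
  bool UsesNewDriver;         // --offload-new-driver is in effect
};

// Returns true when -foffload-lto would hit the
// err_drv_unsupported_opt_for_target diagnostic in this model.
bool rejectsOffloadLTO(const OffloadJob &J) {
  // SYCL LTO is only treated as supported for SPIR/SPIR-V targets
  // compiled through the new offload driver.
  bool IsSYCLLTOSupported = J.IsSYCL && J.IsSPIROrSPIRV && J.UsesNewDriver;
  return J.IsDeviceOffloadAction && !J.IsOpenMP && !J.UsesNewDriver &&
         !J.IsAMDGPU && !IsSYCLLTOSupported;
}
```

Under this model a SYCL SPIR-V device job is rejected with the old offload driver and accepted with `--offload-new-driver`, which is the behaviour the two RUN lines in clang/test/Driver/sycl-lto.cpp check.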
47 changes: 47 additions & 0 deletions clang/test/CodeGenSYCL/force-emit-device-virtual-funcs.cpp
@@ -0,0 +1,47 @@
// RUN: %clang_cc1 -internal-isystem %S/Inputs -triple spir64-unknown-unknown -fsycl-is-device \
// RUN: -fsycl-allow-virtual-functions -emit-llvm %s -o %t.ll
// RUN: FileCheck %s --input-file=%t.ll --implicit-check-not _ZN7Derived3baz \
// RUN: --implicit-check-not _ZN4Base4baz --implicit-check-not _ZN4Base3foo
//
// Some SYCL properties may be turned into 'sycl_device' attribute implicitly
// and we would like to ensure that functions like this (at the moment those
// would be virtual member functions only) are forcefully emitted into device
// code.

class Base {
virtual void foo() {}

virtual void baz();

[[__sycl_detail__::add_ir_attributes_function("indirectly-callable", "a")]]
virtual void bar();
};

void Base::bar() {}

void Base::baz() {}

class Derived : public Base {
public:
[[__sycl_detail__::add_ir_attributes_function("indirectly-callable", "b")]]
void foo() override;

[[__sycl_detail__::add_ir_attributes_function("indirectly-callable", "c")]]
void bar() override final;

[[__sycl_detail__::add_ir_attributes_function("not-indirectly-callable", "c")]]
void baz() override final;
};

void Derived::foo() {}

void Derived::bar() {}

void Derived::baz() {}

// CHECK: define {{.*}}spir_func void @_ZN4Base3bar{{.*}} #[[#AttrA:]]
// CHECK: define {{.*}}spir_func void @_ZN7Derived3foo{{.*}} #[[#AttrB:]]
// CHECK: define {{.*}}spir_func void @_ZN7Derived3bar{{.*}} #[[#AttrC:]]
// CHECK: attributes #[[#AttrA]] = {{.*}} "indirectly-callable"="a"
// CHECK: attributes #[[#AttrB]] = {{.*}} "indirectly-callable"="b"
// CHECK: attributes #[[#AttrC]] = {{.*}} "indirectly-callable"="c"
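The comment at the top of this test states the intent: member functions that receive the `sycl_device` attribute implicitly should still be emitted. The early return in `DeclMustBeEmitted` can be sketched before and after the change as follows, assuming a toy attribute model rather than Clang's AST classes. `DeviceAttr`, `decideBefore`, and `decideAfter` are hypothetical names; an empty result means the function falls through to the later linkage checks.

```cpp
#include <optional>

// Hypothetical model of a SYCL device attribute on a member function.
struct DeviceAttr {
  bool Implicit; // true when the attribute was added implicitly
};

// Old behaviour: an implicit attribute made the early return yield false.
std::optional<bool> decideBefore(const std::optional<DeviceAttr> &A) {
  if (A)
    return !A->Implicit;
  return std::nullopt; // no attribute: defer to the generic checks
}

// New behaviour: any such attribute, implicit or not, forces emission.
std::optional<bool> decideAfter(const std::optional<DeviceAttr> &A) {
  if (A)
    return true;
  return std::nullopt; // no attribute: defer to the generic checks
}
```

The only input on which the two versions differ is an implicit attribute, which is the case this test targets.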
9 changes: 9 additions & 0 deletions clang/test/Driver/sycl-lto.cpp
@@ -0,0 +1,9 @@
// Verify the usage of -foffload-lto with SYCL.

// Verify we error when using the old offload driver.
// RUN: not %clangxx -fsycl -foffload-lto=thin %s -### 2>&1 | FileCheck -check-prefix=CHECK_ERROR %s
// CHECK_ERROR: unsupported option '-foffload-lto=thin' for target 'spir64-unknown-unknown'

// Verify there's no error and we see the expected cc1 flags with the new offload driver.
// RUN: %clangxx -fsycl --offload-new-driver -foffload-lto=thin %s -### 2>&1 | FileCheck -check-prefix=CHECK_SUPPORTED %s
// CHECK_SUPPORTED: clang{{.*}} "-cc1" "-triple" "spir64-unknown-unknown" {{.*}} "-flto=thin" "-flto-unit"
2 changes: 1 addition & 1 deletion clang/test/Driver/sycl-spirv-ext.c
@@ -48,6 +48,7 @@
// CHECK-DEFAULT-SAME:,+SPV_INTEL_fpga_invocation_pipelining_attributes
// CHECK-DEFAULT-SAME:,+SPV_INTEL_fpga_latency_control
// CHECK-DEFAULT-SAME:,+SPV_INTEL_task_sequence
+// CHECK-DEFAULT-SAME:,+SPV_KHR_shader_clock
// CHECK-DEFAULT-SAME:,+SPV_INTEL_bindless_images
// CHECK-DEFAULT-SAME:,+SPV_INTEL_token_type
// CHECK-DEFAULT-SAME:,+SPV_INTEL_bfloat16_conversion
@@ -125,4 +126,3 @@
// CHECK-CPU-SAME:,+SPV_KHR_non_semantic_info
// CHECK-CPU-SAME:,+SPV_KHR_cooperative_matrix
// CHECK-CPU-SAME:,+SPV_INTEL_fp_max_error"

2 changes: 1 addition & 1 deletion devops/containers/ubuntu2204_build.Dockerfile
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
+FROM nvidia/cuda:12.5.0-devel-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive

8 changes: 4 additions & 4 deletions devops/dependencies-igc-dev.json
@@ -1,10 +1,10 @@
{
"linux": {
"igc_dev": {
-"github_tag": "igc-dev-4627f1f",
-"version": "4627f1f",
-"updated_at": "2024-05-26T23:48:05Z",
-"url": "https://api.github.com/repos/intel/intel-graphics-compiler/actions/artifacts/1539236241/zip",
+"github_tag": "igc-dev-8b999ec",
+"version": "8b999ec",
+"updated_at": "2024-05-30T02:09:07Z",
+"url": "https://api.github.com/repos/intel/intel-graphics-compiler/actions/artifacts/1550749489/zip",
"root": "{DEPS_ROOT}/opencl/runtime/linux/oclgpu"
}
}
20 changes: 10 additions & 10 deletions libdevice/cmake/modules/SYCLLibdevice.cmake
@@ -91,7 +91,7 @@ function(add_devicelib_obj obj_filename)

set(devicelib-obj-file-new-offload ${obj_new_offload_binary_dir}/${obj_filename}.${new-offload-lib-suffix})
add_custom_command(OUTPUT ${devicelib-obj-file-new-offload}
-COMMAND ${clang} -fsycl -c --offload-new-driver
+COMMAND ${clang} -fsycl -c --offload-new-driver -foffload-lto=thin
${compile_opts} ${sycl_targets_opt} ${OBJ_EXTRA_ARGS}
${CMAKE_CURRENT_SOURCE_DIR}/${OBJ_SRC}
-o ${devicelib-obj-file-new-offload}
@@ -270,7 +270,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/libsycl-fallback-imf.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/libsycl-fallback-imf.${new-offload-lib-suffix}
-COMMAND ${clang} -fsycl -c --offload-new-driver
+COMMAND ${clang} -fsycl -c --offload-new-driver -foffload-lto=thin
${compile_opts} ${sycl_targets_opt}
${imf_fp32_fallback_src} -I ${CMAKE_CURRENT_SOURCE_DIR}/imf
-o ${obj_binary_dir}/libsycl-fallback-imf.${new-offload-lib-suffix}
@@ -286,7 +286,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/fallback-imf-fp32-host.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/fallback-imf-fp32-host.${new-offload-lib-suffix}
-COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver
+COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver -foffload-lto=thin
-I ${CMAKE_CURRENT_SOURCE_DIR}/imf
${imf_fp32_fallback_src}
-o ${obj_binary_dir}/fallback-imf-fp32-host.${new-offload-lib-suffix}
@@ -321,7 +321,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/libsycl-fallback-imf-fp64.${lib-suff

add_custom_command(OUTPUT ${obj_binary_dir}/libsycl-fallback-imf-fp64.${new-offload-lib-suffix}
COMMAND ${clang} -fsycl -c -I ${CMAKE_CURRENT_SOURCE_DIR}/imf
---offload-new-driver
+--offload-new-driver -foffload-lto=thin
${compile_opts} ${sycl_targets_opt}
${imf_fp64_fallback_src}
-o ${obj_binary_dir}/libsycl-fallback-imf-fp64.${new-offload-lib-suffix}
@@ -337,7 +337,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/fallback-imf-fp64-host.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/fallback-imf-fp64-host.${new-offload-lib-suffix}
-COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver
+COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver -foffload-lto=thin
-I ${CMAKE_CURRENT_SOURCE_DIR}/imf
${imf_fp64_fallback_src}
-o ${obj_binary_dir}/fallback-imf-fp64-host.${new-offload-lib-suffix}
@@ -372,7 +372,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/libsycl-fallback-imf-bf16.${lib-suff

add_custom_command(OUTPUT ${obj_binary_dir}/libsycl-fallback-imf-bf16.${new-offload-lib-suffix}
COMMAND ${clang} -fsycl -c -I ${CMAKE_CURRENT_SOURCE_DIR}/imf
---offload-new-driver
+--offload-new-driver -foffload-lto=thin
${compile_opts} ${sycl_targets_opt}
${imf_bf16_fallback_src}
-o ${obj_binary_dir}/libsycl-fallback-imf-bf16.${new-offload-lib-suffix}
@@ -388,7 +388,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/fallback-imf-bf16-host.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/fallback-imf-bf16-host.${new-offload-lib-suffix}
-COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver
+COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver -foffload-lto=thin
-I ${CMAKE_CURRENT_SOURCE_DIR}/imf
${imf_bf16_fallback_src}
-o ${obj_binary_dir}/fallback-imf-bf16-host.${new-offload-lib-suffix}
@@ -437,7 +437,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/imf-fp32-host.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/imf-fp32-host.${new-offload-lib-suffix}
-COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver
+COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver -foffload-lto=thin
${CMAKE_CURRENT_SOURCE_DIR}/imf_wrapper.cpp
-o ${obj_binary_dir}/imf-fp32-host.${new-offload-lib-suffix}
MAIN_DEPENDENCY ${CMAKE_CURRENT_SOURCE_DIR}/imf_wrapper.cpp
@@ -453,7 +453,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/imf-fp64-host.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/imf-fp64-host.${new-offload-lib-suffix}
-COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver
+COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver -foffload-lto=thin
${CMAKE_CURRENT_SOURCE_DIR}/imf_wrapper_fp64.cpp
-o ${obj_binary_dir}/imf-fp64-host.${new-offload-lib-suffix}
MAIN_DEPENDENCY ${CMAKE_CURRENT_SOURCE_DIR}/imf_wrapper_fp64.cpp
@@ -469,7 +469,7 @@ add_custom_command(OUTPUT ${obj_binary_dir}/imf-bf16-host.${lib-suffix}
VERBATIM)

add_custom_command(OUTPUT ${obj_binary_dir}/imf-bf16-host.${new-offload-lib-suffix}
-COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver
+COMMAND ${clang} ${imf_host_cxx_flags} --offload-new-driver -foffload-lto=thin
${CMAKE_CURRENT_SOURCE_DIR}/imf_wrapper_bf16.cpp
-o ${obj_binary_dir}/imf-bf16-host.${new-offload-lib-suffix}
MAIN_DEPENDENCY ${CMAKE_CURRENT_SOURCE_DIR}/imf_wrapper_bf16.cpp
30 changes: 30 additions & 0 deletions llvm/include/llvm/SYCLLowerIR/ModuleSplitter.h
@@ -19,6 +19,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/Error.h"
+#include "llvm/Support/PropertySetIO.h"

#include <memory>
#include <string>
@@ -196,6 +197,8 @@ class ModuleDesc {

ModuleDesc clone() const;

+std::string makeSymbolTable() const;

const SYCLDeviceRequirements &getOrComputeDeviceRequirements() const {
if (!Reqs.has_value())
Reqs = computeDeviceRequirements(*this);
@@ -270,6 +273,33 @@ void dumpEntryPoints(const Module &M, bool OnlyKernelsAreEntryPoints = false,
const char *msg = "", int Tab = 0);
#endif // NDEBUG

+struct SplitModule {
+std::string ModuleFilePath;
+util::PropertySetRegistry Properties;
+std::string Symbols;
+
+SplitModule() = default;
+SplitModule(const SplitModule &) = default;
+SplitModule &operator=(const SplitModule &) = default;
+SplitModule(SplitModule &&) = default;
+SplitModule &operator=(SplitModule &&) = default;
+
+SplitModule(std::string_view File, util::PropertySetRegistry Properties,
+std::string Symbols)
+: ModuleFilePath(File), Properties(std::move(Properties)),
+Symbols(std::move(Symbols)) {}
+};
+
+struct ModuleSplitterSettings {
+IRSplitMode Mode;
+bool OutputAssembly = false; // Bitcode or LLVM IR.
+StringRef OutputPrefix;
+};
+
+/// Splits the given module \p M according to the given \p Settings.
+Expected<std::vector<SplitModule>>
+splitSYCLModule(std::unique_ptr<Module> M, ModuleSplitterSettings Settings);
+
} // namespace module_split

} // namespace llvm
14 changes: 13 additions & 1 deletion llvm/include/llvm/SYCLLowerIR/SYCLDeviceRequirements.h
@@ -30,7 +30,19 @@ class PropertyValue;
}

struct SYCLDeviceRequirements {
-std::set<uint32_t> Aspects;
+struct AspectNameValuePair {
+llvm::SmallString<64> Name;
+uint32_t Value;
+AspectNameValuePair(StringRef Name, uint32_t Value)
+: Name(Name), Value(Value) {}
+bool operator<(const AspectNameValuePair &rhs) const {
+return Value < rhs.Value;
+}
+bool operator==(const AspectNameValuePair &rhs) const {
+return Value == rhs.Value;
+}
+};
+std::set<AspectNameValuePair> Aspects;
std::set<uint32_t> FixedTarget;
std::optional<llvm::SmallVector<uint64_t, 3>> ReqdWorkGroupSize;
std::optional<uint32_t> WorkGroupNumDim;
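Because `AspectNameValuePair` orders and compares on `Value` alone, the `std::set` keeps one entry per aspect value no matter which name accompanies it. A minimal standalone sketch of that behaviour follows, assuming `std::string` in place of `llvm::SmallString<64>` so it builds without LLVM. The aspect names and numeric values are made up for illustration, and `makeAspects` is a hypothetical helper.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// Mirror of the struct in the hunk above; std::string is an assumption
// standing in for llvm::SmallString<64>.
struct AspectNameValuePair {
  std::string Name;
  uint32_t Value;
  AspectNameValuePair(std::string Name, uint32_t Value)
      : Name(std::move(Name)), Value(Value) {}
  // Ordering and equality look at Value only, as in the hunk above.
  bool operator<(const AspectNameValuePair &rhs) const {
    return Value < rhs.Value;
  }
  bool operator==(const AspectNameValuePair &rhs) const {
    return Value == rhs.Value;
  }
};

// Hypothetical helper: builds a set from made-up aspects.
std::set<AspectNameValuePair> makeAspects() {
  std::set<AspectNameValuePair> Aspects;
  Aspects.emplace("aspect_a", 1);
  Aspects.emplace("aspect_a_alias", 1); // same Value: not inserted
  Aspects.emplace("aspect_b", 2);
  return Aspects;
}
```

In this sketch the second `emplace` is a no-op, so the set holds two entries and the name recorded for value 1 is whichever was inserted first.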
4 changes: 3 additions & 1 deletion llvm/lib/SYCLLowerIR/CMakeLists.txt
@@ -88,10 +88,12 @@ add_llvm_component_library(LLVMSYCLLowerIR
LLVMDemangle
LLVMTargetParser
LLVMTransformUtils

LINK_COMPONENTS
Analysis
BitWriter
Core
IRPrinter
Support
ipo
)