[SYCL][Graph][Doc] Add SYCL-Graph usage guide and example doc #379

Closed
wants to merge 35 commits into from
35 commits
21f7f4e
Revert "[UR][DeviceSantizer] Enable Symoblizer for UR santizer layer …
sarnex Aug 1, 2024
06a3902
[SYCL][ESIMD][E2E] Fix test warning about divide by zero (#14881)
sarnex Aug 1, 2024
ebb7dd9
[SYCL] Fix abi-neutrality test for older libstdc++ versions (#14900)
againull Aug 1, 2024
7b74721
[ABI-Break][SYCL] Fix ext_oneapi_cl_profile to be ABI-neutral (#14883)
bso-intel Aug 1, 2024
c52a633
[SYCL] Catch exceptions thrown in destructors (#14808)
ianayl Aug 2, 2024
06609fc
Re-enable test-e2e/Graph/RecordReplay/kernel_bundle.cpp (#14867)
martygrant Aug 2, 2024
f7b4a88
[SYCL][NFC] change negative test verification for ONEAPI_DEVICE_SELEC…
KseniyaTikhomirova Aug 2, 2024
2321b3a
[SYCL][Bindless] Add interop memory mapping to USM. (#14701)
przemektmalon Aug 2, 2024
13ef711
[SYCL][NFC] Remove execute permissions from text files (#14913)
hdelan Aug 2, 2024
3c0532d
Connect support for dynamic linking to user options (#14575)
LU-JOHN Aug 2, 2024
ee703c8
[SYCL][Bindless] Fix dx12 interop samples for new external type names…
nrspruit Aug 2, 2024
895f116
Enable dx12 interop testing on windows level zero (#14861)
nrspruit Aug 2, 2024
4f86ab7
[SYCL] Change NativePrograms.insert to `[]` access (#14873)
RossBrunton Aug 2, 2024
a9fe9ec
[NFC][SYCL] Remove unused `detail::Boolean` (#14904)
aelovikov-intel Aug 2, 2024
1d49a15
[SYCL] Revert NativePrograms addition for linkedPrograms to insert (#…
omarahmed1111 Aug 2, 2024
5c9450b
[SYCL] Move MSVC flags setting before FetchUR (#14876)
jsji Aug 2, 2024
e0ef0d7
[SYCL][NFC] Fix doxygen annotation for graph method description (#14889)
bader Aug 3, 2024
433b70c
[SYCL] Include sycl path in Bindless image hpp (#14903)
jsji Aug 5, 2024
e48c122
[CI][CTS] Turn on test_queue & spec_constants (#14880)
KornevNikita Aug 5, 2024
429b01d
Remove unused RSBench tests (#14935)
martygrant Aug 5, 2024
6127715
[SYCL] Implement device image properties for virtual functions (#14875)
AlexeySachkov Aug 5, 2024
232c95c
[SYCL][NewOffloadingModel] Add sycl-dump-device-code command line opt…
maksimsab Aug 5, 2024
c455b6f
[SYCL][E2E][Bindless] Fix -Werror issues (#14912)
ProGTX Aug 5, 2024
727e085
[SYCL][E2E] Mark free_function tests unsupported on HIP (#14917)
konradkusiak97 Aug 5, 2024
3d73d9b
[NVPTX][AMDGPU] Move annotation creation out of clang (#14634)
frasercrmck Aug 5, 2024
1edc943
[NFC][Driver] Remove unneeded headers (#14882)
mdtoguchi Aug 5, 2024
7d3ac99
[SYCL][InvokeSimd] Fix two XFAIL tests (#14906)
sarnex Aug 5, 2024
23fa10d
[SYCL][ESIMD][E2E] Re-enable ctor_load_usm_fp_extra.cpp (#14929)
sarnex Aug 5, 2024
ffd443c
[SYCL][Graph] Fix E2E disabled RUN lines (#14902)
EwanC Aug 6, 2024
1688e41
[SYCL] Rename sycl-fusion to sycl-jit (#14762)
jchlanda Aug 6, 2024
ce23765
[UR] Bump UR to 9b93cb1c (#14944)
omarahmed1111 Aug 6, 2024
81c87e5
[SYCL] Switch NativePrograms back to using multimap (#14951)
RossBrunton Aug 6, 2024
656aa7a
[Doc] Update sycl_ext_oneapi_atomic16.asciidoc (#14952)
KornevNikita Aug 6, 2024
b575283
[SYCL] Add missing test dependency on FileCheck (#14905)
ldrumm Aug 6, 2024
abb785a
[SYCL][Graph][Doc] Add SYCL-Graph usage guide and example doc
Bensuo Jul 29, 2024
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -103,7 +103,7 @@ buildbot/ @intel/dpcpp-devops-reviewers
devops/ @intel/dpcpp-devops-reviewers

# Kernel fusion JIT compiler
sycl-fusion/ @intel/dpcpp-kernel-fusion-reviewers
sycl-jit/ @intel/dpcpp-kernel-fusion-reviewers
sycl/doc/design/KernelFusionJIT.md @intel/dpcpp-kernel-fusion-reviewers
sycl/doc/extensions/experimental/sycl_ext_codeplay_kernel_fusion.asciidoc @intel/dpcpp-kernel-fusion-reviewers
sycl/include/sycl/ext/codeplay/experimental/fusion_properties.hpp @intel/dpcpp-kernel-fusion-reviewers
8 changes: 4 additions & 4 deletions .github/workflows/sycl-detect-changes.yml
@@ -30,9 +30,9 @@ jobs:
clang: &clang
- *llvm
- 'clang/**'
sycl_fusion: &sycl-fusion
sycl_jit: &sycl-jit
- *llvm
- 'sycl-fusion/**'
- 'sycl-jit/**'
xptifw: &xptifw
- 'xptifw/**'
libclc: &libclc
@@ -41,7 +41,7 @@ jobs:
- 'libclc/**'
sycl: &sycl
- *clang
- *sycl-fusion
- *sycl-jit
- *llvm_spirv
- *xptifw
- *libclc
@@ -84,7 +84,7 @@ jobs:
return '${{ steps.changes.outputs.changes }}';
}
// Treat everything as changed for huge PRs.
return ["llvm", "llvm_spirv", "clang", "sycl_fusion", "xptifw", "libclc", "sycl", "ci", "esimd"];
return ["llvm", "llvm_spirv", "clang", "sycl_jit", "xptifw", "libclc", "sycl", "ci", "esimd"];

- run: echo '${{ steps.result.outputs.result }}'

18 changes: 9 additions & 9 deletions buildbot/configure.py
@@ -31,10 +31,10 @@ def do_configure(args):
libclc_amd_target_names = ";amdgcn--amdhsa"
libclc_nvidia_target_names = ";nvptx64--nvidiacl"

sycl_enable_fusion = "OFF"
if not args.disable_fusion:
llvm_external_projects += ";sycl-fusion"
sycl_enable_fusion = "ON"
sycl_enable_jit = "OFF"
if not args.disable_jit:
llvm_external_projects += ";sycl-jit"
sycl_enable_jit = "ON"

if args.llvm_external_projects:
llvm_external_projects += ";" + args.llvm_external_projects.replace(",", ";")
@@ -45,7 +45,7 @@ def do_configure(args):
xpti_dir = os.path.join(abs_src_dir, "xpti")
xptifw_dir = os.path.join(abs_src_dir, "xptifw")
libdevice_dir = os.path.join(abs_src_dir, "libdevice")
fusion_dir = os.path.join(abs_src_dir, "sycl-fusion")
jit_dir = os.path.join(abs_src_dir, "sycl-jit")
llvm_targets_to_build = args.host_target
llvm_enable_projects = "clang;" + llvm_external_projects
libclc_build_native = "OFF"
@@ -174,7 +174,7 @@ def do_configure(args):
"-DXPTI_SOURCE_DIR={}".format(xpti_dir),
"-DLLVM_EXTERNAL_XPTIFW_SOURCE_DIR={}".format(xptifw_dir),
"-DLLVM_EXTERNAL_LIBDEVICE_SOURCE_DIR={}".format(libdevice_dir),
"-DLLVM_EXTERNAL_SYCL_FUSION_SOURCE_DIR={}".format(fusion_dir),
"-DLLVM_EXTERNAL_SYCL_JIT_SOURCE_DIR={}".format(jit_dir),
"-DLLVM_ENABLE_PROJECTS={}".format(llvm_enable_projects),
"-DSYCL_BUILD_PI_HIP_PLATFORM={}".format(sycl_build_pi_hip_platform),
"-DLLVM_BUILD_TOOLS=ON",
@@ -189,7 +189,7 @@ def do_configure(args):
"-DXPTI_ENABLE_WERROR={}".format(xpti_enable_werror),
"-DSYCL_CLANG_EXTRA_FLAGS={}".format(sycl_clang_extra_flags),
"-DSYCL_ENABLE_PLUGINS={}".format(";".join(set(sycl_enabled_plugins))),
"-DSYCL_ENABLE_KERNEL_FUSION={}".format(sycl_enable_fusion),
"-DSYCL_ENABLE_EXTENSION_JIT={}".format(sycl_enable_jit),
"-DSYCL_ENABLE_MAJOR_RELEASE_PREVIEW_LIB={}".format(sycl_preview_lib),
"-DBUG_REPORT_URL=https://github.com/intel/llvm/issues",
]
@@ -379,9 +379,9 @@ def main():
help="Disable building of the SYCL runtime major release preview library",
)
parser.add_argument(
"--disable-fusion",
"--disable-jit",
action="store_true",
help="Disable the kernel fusion JIT compiler",
help="Disable the kernel JIT compiler for AMD and Nvidia",
)
parser.add_argument(
"--add_security_flags",
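The rename in buildbot/configure.py can be read as a small mapping from the `--disable-jit` switch to two CMake values. The following is a minimal sketch of that mapping only, not the real script: the function name `jit_cmake_options` and the starting value `"sycl"` for `llvm_external_projects` are placeholders assumed for illustration, while the flag name, the `sycl-jit` project name, and the two `-D` options are taken from the diff above.

```python
import argparse


def jit_cmake_options(argv):
    """Return the two CMake options affected by --disable-jit (sketch)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--disable-jit",
        action="store_true",
        help="Disable the kernel JIT compiler for AMD and Nvidia",
    )
    args = parser.parse_args(argv)

    # Placeholder starting value; the real script builds this list elsewhere.
    llvm_external_projects = "sycl"

    # Same logic as the diff: the JIT is on unless explicitly disabled.
    sycl_enable_jit = "OFF"
    if not args.disable_jit:
        llvm_external_projects += ";sycl-jit"
        sycl_enable_jit = "ON"

    llvm_enable_projects = "clang;" + llvm_external_projects
    return [
        "-DLLVM_ENABLE_PROJECTS={}".format(llvm_enable_projects),
        "-DSYCL_ENABLE_EXTENSION_JIT={}".format(sycl_enable_jit),
    ]
```

Calling `jit_cmake_options(["--disable-jit"])` leaves `sycl-jit` out of the project list and sets the extension JIT option to `OFF`; with no arguments both are enabled.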
4 changes: 4 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -4189,6 +4189,10 @@ def fsycl_remove_unused_external_funcs : Flag<["-"], "fsycl-remove-unused-extern
Group<sycl_Group>, HelpText<"Allow removal of unused `SYCL_EXTERNAL` functions (default)">;
def fno_sycl_remove_unused_external_funcs : Flag<["-"], "fno-sycl-remove-unused-external-funcs">,
Group<sycl_Group>, HelpText<"Prevent removal of unused `SYCL_EXTERNAL` functions">;
def fsycl_allow_device_dependencies : Flag<["-"], "fsycl-allow-device-dependencies">,
Group<sycl_Group>, HelpText<"Allow dependencies between device code images">;
def fno_sycl_allow_device_dependencies : Flag<["-"], "fno-sycl-allow-device-dependencies">,
Group<sycl_Group>, HelpText<"Do not allow dependencies between device code images (default)">;

def fsave_optimization_record : Flag<["-"], "fsave-optimization-record">,
Visibility<[ClangOption, FlangOption]>,
4 changes: 4 additions & 0 deletions clang/lib/CodeGen/BackendUtil.cpp
@@ -56,6 +56,7 @@
#include "llvm/SYCLLowerIR/RecordSYCLAspectNames.h"
#include "llvm/SYCLLowerIR/SYCLAddOptLevelAttribute.h"
#include "llvm/SYCLLowerIR/SYCLConditionalCallOnDevice.h"
#include "llvm/SYCLLowerIR/SYCLCreateNVVMAnnotations.h"
#include "llvm/SYCLLowerIR/SYCLPropagateAspectsUsage.h"
#include "llvm/SYCLLowerIR/SYCLPropagateJointMatrixUsage.h"
#include "llvm/SYCLLowerIR/SYCLVirtualFunctionsAnalysis.h"
@@ -1151,6 +1152,9 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
// and before cleaning up metadata)
MPM.addPass(RecordSYCLAspectNamesPass());

if (TargetTriple.isNVPTX())
MPM.addPass(SYCLCreateNVVMAnnotationsPass());

// Remove SYCL metadata added by the frontend, like sycl_aspects
// Note, this pass should be at the end of the pipeline
MPM.addPass(CleanupSYCLMetadataPass());
74 changes: 0 additions & 74 deletions clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -291,80 +291,6 @@ void NVPTXTargetCodeGenInfo::setTargetAttributes(
addNVVMMetadata(F, "grid_constant", GridConstantParamIdxs);
}
}
bool HasMaxWorkGroupSize = false;
bool HasMinWorkGroupPerCU = false;
if (const auto *MWGS = FD->getAttr<SYCLIntelMaxWorkGroupSizeAttr>()) {
HasMaxWorkGroupSize = true;
// We must index-flip between SYCL's notation, X,Y,Z (aka dim0,dim1,dim2)
// with the fastest-moving dimension rightmost, to CUDA's, where X is the
// fastest-moving dimension.
addNVVMMetadata(F, "maxntidx", MWGS->getZDimVal());
addNVVMMetadata(F, "maxntidy", MWGS->getYDimVal());
addNVVMMetadata(F, "maxntidz", MWGS->getXDimVal());
}

if (const auto *RWGS = FD->getAttr<SYCLReqdWorkGroupSizeAttr>()) {
llvm::SmallVector<std::optional<int64_t>, 3> Ops;
// Index-flip and pad out any missing elements. Note the misleading
// nomenclature of the methods: getXDimVal doesn't return the X dimension;
// it returns the left-most dimension (dim0). This could correspond to
// CUDA's X, Y, or Z, depending on the number of operands provided.
if (auto Dim0 = RWGS->getXDimVal())
Ops.push_back(Dim0->getExtValue());
if (auto Dim1 = RWGS->getYDimVal())
Ops.push_back(Dim1->getExtValue());
if (auto Dim2 = RWGS->getZDimVal())
Ops.push_back(Dim2->getExtValue());
std::reverse(Ops.begin(), Ops.end());
Ops.append(3 - Ops.size(), std::nullopt);

// Work-group sizes (in NVVM annotations) must be positive and less than
// INT32_MAX, whereas SYCL can allow for larger work-group sizes (see
// -fno-sycl-id-queries-fit-in-int). If any dimension is too large for
// NVPTX, don't emit any annotation at all.
if (llvm::all_of(Ops, [](std::optional<int64_t> V) {
return !V || llvm::isUInt<31>(*V);
})) {
if (auto X = Ops[0])
addNVVMMetadata(F, "reqntidx", *X);
if (auto Y = Ops[1])
addNVVMMetadata(F, "reqntidy", *Y);
if (auto Z = Ops[2])
addNVVMMetadata(F, "reqntidz", *Z);
}
}

auto attrValue = [&](Expr *E) {
const auto *CE = cast<ConstantExpr>(E);
std::optional<llvm::APInt> Val = CE->getResultAsAPSInt();
return Val->getZExtValue();
};

if (const auto *MWGPCU =
FD->getAttr<SYCLIntelMinWorkGroupsPerComputeUnitAttr>()) {
if (!HasMaxWorkGroupSize && FD->hasAttr<OpenCLKernelAttr>()) {
M.getDiags().Report(D->getLocation(),
diag::warn_launch_bounds_missing_attr)
<< MWGPCU << 0;
} else {
// The value is guaranteed to be > 0, pass it to the metadata.
addNVVMMetadata(F, "minctasm", attrValue(MWGPCU->getValue()));
HasMinWorkGroupPerCU = true;
}
}

if (const auto *MWGPMP =
FD->getAttr<SYCLIntelMaxWorkGroupsPerMultiprocessorAttr>()) {
if ((!HasMaxWorkGroupSize || !HasMinWorkGroupPerCU) &&
FD->hasAttr<OpenCLKernelAttr>()) {
M.getDiags().Report(D->getLocation(),
diag::warn_launch_bounds_missing_attr)
<< MWGPMP << 1;
} else {
// The value is guaranteed to be > 0, pass it to the metadata.
addNVVMMetadata(F, "maxclusterrank", attrValue(MWGPMP->getValue()));
}
}
}

// Perform special handling in CUDA mode.
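The `SYCLReqdWorkGroupSizeAttr` handling removed from NVPTX.cpp above does three things: it reverses the SYCL dimensions into CUDA order, pads missing dimensions, and drops the annotations entirely if any value does not fit in 31 unsigned bits. A rough Python model of that logic, for readers who want to check the mapping by hand, might look like the following. The function name `reqntid_annotations` is hypothetical; the `reqntidx`/`reqntidy`/`reqntidz` keys come from the removed code.

```python
def reqntid_annotations(sycl_dims):
    """Model of the removed reqd_work_group_size -> NVVM annotation mapping.

    sycl_dims lists the 1 to 3 sizes in SYCL order (dim0 first, the
    fastest-moving dimension last).
    """
    if not 1 <= len(sycl_dims) <= 3:
        raise ValueError("expected between one and three dimensions")

    # Index-flip: CUDA's X is the fastest-moving dimension.
    ops = list(reversed(sycl_dims))
    # Pad out missing dimensions (std::nullopt in the C++ code).
    ops += [None] * (3 - len(ops))

    # Mirrors llvm::isUInt<31>: if any present value is out of range,
    # emit no annotation at all.
    if not all(v is None or 0 <= v < 2**31 for v in ops):
        return {}

    names = ("reqntidx", "reqntidy", "reqntidz")
    return {name: v for name, v in zip(names, ops) if v is not None}
```

With two dimensions, `reqntid_annotations([4, 8])` yields `reqntidx` 8 and `reqntidy` 4 and no `reqntidz`, which is the case the original comment warns about: dim0 does not always correspond to the same CUDA axis.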
2 changes: 0 additions & 2 deletions clang/lib/Driver/Driver.cpp
@@ -101,8 +101,6 @@
#include <map>
#include <memory>
#include <optional>
#include <regex>
#include <sstream>
#include <set>
#include <utility>
#if LLVM_ON_UNIX
21 changes: 21 additions & 0 deletions clang/lib/Driver/ToolChains/Clang.cpp
@@ -10703,6 +10703,14 @@ static void addArgs(ArgStringList &DstArgs, const llvm::opt::ArgList &Alloc,
}
}

static bool supportDynamicLinking(const llvm::opt::ArgList &TCArgs) {
if (TCArgs.hasFlag(options::OPT_fsycl_allow_device_dependencies,
options::OPT_fno_sycl_allow_device_dependencies,
false))
return true;
return false;
}

static void getNonTripleBasedSYCLPostLinkOpts(const ToolChain &TC,
const JobAction &JA,
const llvm::opt::ArgList &TCArgs,
@@ -10729,6 +10737,9 @@ static void getNonTripleBasedSYCLPostLinkOpts(const ToolChain &TC,
if (TCArgs.hasFlag(options::OPT_fno_sycl_esimd_force_stateless_mem,
options::OPT_fsycl_esimd_force_stateless_mem, false))
addArgs(PostLinkArgs, TCArgs, {"-lower-esimd-force-stateless-mem=false"});

if (supportDynamicLinking(TCArgs))
addArgs(PostLinkArgs, TCArgs, {"-support-dynamic-linking"});
}

// Add any sycl-post-link options that rely on a specific Triple in addition
@@ -10776,6 +10787,8 @@ static void getTripleBasedSYCLPostLinkOpts(const ToolChain &TC,
options::OPT_fsycl_remove_unused_external_funcs,
false) &&
!isSYCLNativeCPU(TC)) &&
// When supporting dynamic linking, non-kernels in a device image can be called
!supportDynamicLinking(TCArgs) &&
!Triple.isNVPTX() && !Triple.isAMDGPU())
addArgs(PostLinkArgs, TCArgs, {"-emit-only-kernels-as-entry-points"});

@@ -11199,6 +11212,14 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA,
CmdArgs.push_back(Args.MakeArgString(
Twine("-sycl-device-library-location=") + DeviceLibDir));

if (C.getDriver().isDumpDeviceCodeEnabled()) {
SmallString<128> DumpDir;
Arg *A = C.getArgs().getLastArg(options::OPT_fsycl_dump_device_code_EQ);
DumpDir = A ? A->getValue() : "";
CmdArgs.push_back(
Args.MakeArgString(Twine("-sycl-dump-device-code=") + DumpDir));
}

auto appendOption = [](SmallString<128> &OptString, StringRef AddOpt) {
if (!OptString.empty())
OptString += " ";
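`supportDynamicLinking` in the Clang.cpp diff above relies on `ArgList::hasFlag`, where the last occurrence of the positive or negative flag decides and the default (here `false`) applies when neither is given. The sketch below models just that behaviour and the resulting `-support-dynamic-linking` post-link argument. The helper names `has_flag` and `dynamic_linking_post_link_args` are made up for illustration, and the other conditions around `-emit-only-kernels-as-entry-points` are deliberately left out.

```python
POS = "-fsycl-allow-device-dependencies"
NEG = "-fno-sycl-allow-device-dependencies"


def has_flag(args, pos, neg, default=False):
    """Model of hasFlag: the last of pos/neg on the command line wins."""
    for arg in reversed(args):
        if arg == pos:
            return True
        if arg == neg:
            return False
    return default


def dynamic_linking_post_link_args(args):
    """Post-link arguments contributed by the new flag pair (sketch)."""
    extra = []
    if has_flag(args, POS, NEG, default=False):
        extra.append("-support-dynamic-linking")
    return extra
```

For example, passing the positive flag followed by the negative one gives an empty list, matching the "(default)" wording in the help text of the `-fno-` variant.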
22 changes: 17 additions & 5 deletions clang/lib/Sema/SemaDeclAttr.cpp
@@ -4089,11 +4089,17 @@ bool static check32BitInt(const Expr *E, Sema &S, llvm::APSInt &I,

void Sema::AddSYCLIntelMinWorkGroupsPerComputeUnitAttr(
Decl *D, const AttributeCommonInfo &CI, Expr *E) {
if (Context.getLangOpts().SYCLIsDevice &&
!Context.getTargetInfo().getTriple().isNVPTX()) {
Diag(E->getBeginLoc(), diag::warn_launch_bounds_is_cuda_specific)
<< CI << E->getSourceRange();
return;
if (Context.getLangOpts().SYCLIsDevice) {
if (!Context.getTargetInfo().getTriple().isNVPTX()) {
Diag(E->getBeginLoc(), diag::warn_launch_bounds_is_cuda_specific)
<< CI << E->getSourceRange();
return;
}

if (!D->hasAttr<SYCLIntelMaxWorkGroupSizeAttr>()) {
Diag(CI.getLoc(), diag::warn_launch_bounds_missing_attr) << CI << 0;
return;
}
}
if (!E->isValueDependent()) {
// Validate that we have an integer constant expression and then store the
@@ -4154,6 +4160,12 @@ void Sema::AddSYCLIntelMaxWorkGroupsPerMultiprocessorAttr(
<< CudaArchToString(SM) << CI << E->getSourceRange();
return;
}

if (!D->hasAttr<SYCLIntelMaxWorkGroupSizeAttr>() ||
!D->hasAttr<SYCLIntelMinWorkGroupsPerComputeUnitAttr>()) {
Diag(CI.getLoc(), diag::warn_launch_bounds_missing_attr) << CI << 1;
return;
}
}
if (!E->isValueDependent()) {
// Validate that we have an integer constant expression and then store the
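The SemaDeclAttr.cpp hunks above move the launch-bounds dependency checks into Sema: each attribute is dropped with a warning unless the attributes it depends on are already attached to the declaration. A simplified model of which diagnostic fires is sketched below. It assumes both functions share the same device-only guard and non-NVPTX check (only the first is fully visible in the diff) and it omits the SM architecture check whose condition is elided; the attribute strings and the function name `launch_bounds_diagnostic` are illustrative, while the diagnostic names are the ones used in the diff.

```python
MAX_WG_SIZE = "max_work_group_size"
MIN_WG_PER_CU = "min_work_groups_per_cu"
MAX_WG_PER_MP = "max_work_groups_per_mp"

# Attributes that must already be present before each one is accepted.
REQUIRED = {
    MIN_WG_PER_CU: {MAX_WG_SIZE},
    MAX_WG_PER_MP: {MAX_WG_SIZE, MIN_WG_PER_CU},
}


def launch_bounds_diagnostic(attr, present, is_nvptx, sycl_is_device=True):
    """Return the warning that would drop attr, or None if it is kept."""
    if not sycl_is_device:
        return None  # the checks only run for SYCL device compilation
    if not is_nvptx:
        return "warn_launch_bounds_is_cuda_specific"
    if not REQUIRED[attr] <= set(present):
        return "warn_launch_bounds_missing_attr"
    return None
```

Under this model, `max_work_groups_per_mp` on an NVPTX target is accepted only when both other attributes are present, which corresponds to the `<< CI << 1` diagnostic path in the second hunk.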
31 changes: 0 additions & 31 deletions clang/test/CodeGenSYCL/launch_bounds_nvptx.cpp
@@ -48,37 +48,6 @@ int main() {
// CHECK: define dso_local void @{{.*}}kernel_name2() #0 {{.*}} !min_work_groups_per_cu ![[MWGPC:[0-9]+]] !max_work_groups_per_mp ![[MWGPM:[0-9]+]] !max_work_group_size ![[MWGS:[0-9]+]]
// CHECK: define dso_local void @{{.*}}kernel_name3() #0 {{.*}} !min_work_groups_per_cu ![[MWGPC_MWGPM:[0-9]+]] !max_work_groups_per_mp ![[MWGPC_MWGPM]] !max_work_group_size ![[MWGS_2:[0-9]+]]

// CHECK: {{.*}}@{{.*}}kernel_name1, !"maxntidx", i32 8}
// CHECK: {{.*}}@{{.*}}kernel_name1, !"maxntidy", i32 4}
// CHECK: {{.*}}@{{.*}}kernel_name1, !"maxntidz", i32 2}
// CHECK: {{.*}}@{{.*}}kernel_name1, !"minctasm", i32 2}
// CHECK: {{.*}}@{{.*}}kernel_name1, !"maxclusterrank", i32 4}
// CHECK: {{.*}}@{{.*}}Foo{{.*}}, !"maxntidx", i32 8}
// CHECK: {{.*}}@{{.*}}Foo{{.*}}, !"maxntidy", i32 4}
// CHECK: {{.*}}@{{.*}}Foo{{.*}}, !"maxntidz", i32 2}
// CHECK: {{.*}}@{{.*}}Foo{{.*}}, !"minctasm", i32 2}
// CHECK: {{.*}}@{{.*}}Foo{{.*}}, !"maxclusterrank", i32 4}
// CHECK: {{.*}}@{{.*}}kernel_name2, !"maxntidx", i32 8}
// CHECK: {{.*}}@{{.*}}kernel_name2, !"maxntidy", i32 4}
// CHECK: {{.*}}@{{.*}}kernel_name2, !"maxntidz", i32 2}
// CHECK: {{.*}}@{{.*}}kernel_name2, !"minctasm", i32 2}
// CHECK: {{.*}}@{{.*}}kernel_name2, !"maxclusterrank", i32 4}
// CHECK: {{.*}}@{{.*}}main{{.*}}, !"maxntidx", i32 8}
// CHECK: {{.*}}@{{.*}}main{{.*}}, !"maxntidy", i32 4}
// CHECK: {{.*}}@{{.*}}main{{.*}}, !"maxntidz", i32 2}
// CHECK: {{.*}}@{{.*}}main{{.*}}, !"minctasm", i32 2}
// CHECK: {{.*}}@{{.*}}main{{.*}}, !"maxclusterrank", i32 4}
// CHECK: {{.*}}@{{.*}}kernel_name3, !"maxntidx", i32 8}
// CHECK: {{.*}}@{{.*}}kernel_name3, !"maxntidy", i32 4}
// CHECK: {{.*}}@{{.*}}kernel_name3, !"maxntidz", i32 6}
// CHECK: {{.*}}@{{.*}}kernel_name3, !"minctasm", i32 6}
// CHECK: {{.*}}@{{.*}}kernel_name3, !"maxclusterrank", i32 6}
// CHECK: {{.*}}@{{.*}}Functor{{.*}}, !"maxntidx", i32 8}
// CHECK: {{.*}}@{{.*}}Functor{{.*}}, !"maxntidy", i32 4}
// CHECK: {{.*}}@{{.*}}Functor{{.*}}, !"maxntidz", i32 6}
// CHECK: {{.*}}@{{.*}}Functor{{.*}}, !"minctasm", i32 6}
// CHECK: {{.*}}@{{.*}}Functor{{.*}}, !"maxclusterrank", i32 6}

// CHECK: ![[MWGPC]] = !{i32 2}
// CHECK: ![[MWGPM]] = !{i32 4}
// CHECK: ![[MWGS]] = !{i32 8, i32 4, i32 2}