Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Enable FPGA on postcommit for Linux #12673

Merged
merged 14 commits into from
Feb 21, 2024
2 changes: 1 addition & 1 deletion .github/workflows/sycl-linux-precommit.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ jobs:
runner: '["Linux", "gen12"]'
image: ghcr.io/intel/llvm/ubuntu2204_intel_drivers:latest
image_options: -u 1001 --device=/dev/dri --privileged --cap-add SYS_ADMIN
target_devices: ext_oneapi_level_zero:gpu;opencl:gpu;opencl:cpu
target_devices: ext_oneapi_level_zero:gpu;opencl:gpu;opencl:cpu;opencl:fpga
reset_gpu: true
install_drivers: ${{ contains(needs.detect_changes.outputs.filters, 'drivers') }}
extra_lit_opts: --param gpu-intel-gen12=True
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/sycl-linux-run-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ on:
options:
- 'opencl:cpu'
- 'opencl:gpu'
- 'opencl:acc'
- 'opencl:fpga'
- 'ext_oneapi_level_zero:gpu'
- 'ext_oneapi_hip:gpu'
tests_selector:
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/Assert/assert_in_kernels.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@

// https://github.com/intel/llvm/issues/7634
// UNSUPPORTED: hip
// TODO: Remove unsupported after fixing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but it looks like we shouldn't turn off Assert/* tests on acc as it's some environment issue, see #12683 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you try to change
// RUN: %{run} %t.out &> %t.txt ; FileCheck %s --input-file %t.txt %if acc %{ --check-prefix=CHECK-ACC %}
to
// RUN: %{run} %t.out &> %t.txt ; FileCheck %s --input-file %t.txt %if accelerator %{ --check-prefix=CHECK-ACC %}
instead of adding UNSUPPORTED to Assert/* tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, '%if accelerator' didn't work. The assert tests are still failing because if wrong check prefix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be %fpga

// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
//
// FIXME: Remove XFAIL one intel/llvm#11364 is resolved
// XFAIL: (opencl && gpu)
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/Assert/assert_in_multiple_tus.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@
//
// https://github.com/intel/llvm/issues/8832
// UNSUPPORTED: cuda
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
//
// FIXME: Remove XFAIL one intel/llvm#11364 is resolved
// XFAIL: (opencl && gpu)
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/Assert/assert_in_multiple_tus_one_ndebug.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@
//
// https://github.com/intel/llvm/issues/8832
// UNSUPPORTED: cuda
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
//
// FIXME: Remove XFAIL one intel/llvm#11364 is resolved
// XFAIL: (opencl && gpu)
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/Assert/assert_in_one_kernel.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@

// https://github.com/intel/llvm/issues/7634
// UNSUPPORTED: hip
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
//
// FIXME: Remove XFAIL one intel/llvm#11364 is resolved
// XFAIL: (opencl && gpu)
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/Assert/assert_in_simultaneous_kernels.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@
// FIXME: Flaky on HIP and cuda
// UNSUPPORTED: hip || cuda
// RUN: %{build} -DSYCL_FALLBACK_ASSERT=1 -o %t.out %threads_lib
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
//
// FIXME: Remove XFAIL one intel/llvm#11364 is resolved
// XFAIL: (opencl && gpu)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
// FIXME flaky fail on CUDA and HIP
// UNSUPPORTED: cuda || hip
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
//
// FIXME: Remove XFAIL one intel/llvm#11364 is resolved
// XFAIL: (opencl && gpu)
Expand Down
5 changes: 4 additions & 1 deletion sycl/test-e2e/DeviceLib/string_test.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,10 @@
// UNSUPPORTED: hip
// RUN: %{build} -fno-builtin -o %t.out
// RUN: %{run} %t.out

// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accelerator or fpga?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's accelerator only because that's what the name of the feature is (https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/lit.cfg.py#L678) when fpga device is reported by sycl-ls.

//
// RUN: %{build} -fno-builtin -fsycl-device-lib-jit-link -o %t.out
// RUN: %if !gpu %{ %{run} %t.out %}

Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,7 @@
// REQUIRES: gpu-intel-gen12
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean we can remove "// UNSUPPORTED: accelerator" here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this test failed on opencl:fpga with the following error. That's why I added "UNSUPPORTED: accelerator". I'm not sure why you are suggesting removing this?

2024-02-09T00:53:18.1728039Z env ONEAPI_DEVICE_SELECTOR=opencl:fpga  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_opt_kernel_feature_unsupported_hw.cpp.tmp.out
2024-02-09T00:53:18.1730184Z # executed command: env ONEAPI_DEVICE_SELECTOR=opencl:fpga /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_opt_kernel_feature_unsupported_hw.cpp.tmp.out
2024-02-09T00:53:18.1731519Z # .---command stderr------------
2024-02-09T00:53:18.1734608Z # | joint_matrix_opt_kernel_feature_unsupported_hw.cpp.tmp.out: /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_opt_kernel_feature_unsupported_hw.cpp:27: int main(): Assertion `(e.code() == sycl::errc::kernel_not_supported) && (std::string(e.what()) == std::string("no matrix hardware on the target device, joint_matrix " "is not supported"))' failed.
2024-02-09T00:53:18.1737560Z # `-----------------------------

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought your PR fixes #12683
which is why we added unsupported for FPGA.
Adding @dm-vodopyanov since he is the one who added unsupported in the first place for this test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is supposed to be run on HW which doesn't support Joint Matrix and verifies that correct exception was generated. So, I would expect that UNSUPPORTED should not be required for accelerator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So, should we add XFAIL for FPGA instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need @dm-vodopyanov to look at why it is failing as the author of the feature/test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @uditagarwal97, I don't fully understand, why the test even launched on opencl:fpga if we have // REQUIRES: gpu-intel-gen12?
The rationale of this test is to be executed only on Gen12 and to make sure that the expected exception is thrown. No test updates (like the update here) should be made as it is expected that test does not support other devices, especially FPGA which is not a target for the matrices.

@aelovikov-intel having // REQUIRES: gpu-vendor-abc in test source means that

  • the test will be executed only if GPU on the testing machine is vendor-abc AND the test will be executed only on GPU device,
  • or the test will be executed only if GPU on the testing machine is vendor-abc AND will be executed on all available devices?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's passed like llvm-lit --param gpu-vendor-abc=True and that sets the feature unconditionally for all the devices (not handled at the format.py level). If one wants to limit to gpu only, an additional REQUIRES: gpu is needed.

Copy link
Contributor

@dm-vodopyanov dm-vodopyanov Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @aelovikov-intel.
@uditagarwal97: can you please remove these lines 2-4 added in this patch and modify the line 1 by adding , gpu to it? After that this test will be skipped for the accelerator device.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure


// RUN: %{build} -o %t.out -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4
// RUN: %{run} %t.out
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/syclcompat/atomic/atomic_arith.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,9 @@
// ===----------------------------------------------------------------------===//

// UNSUPPORTED: hip
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator
Alcpz marked this conversation as resolved.
Show resolved Hide resolved

// RUN: %clangxx -std=c++20 -fsycl -fsycl-targets=%{sycl_triple} %s -o %t.out
Alcpz marked this conversation as resolved.
Show resolved Hide resolved
// RUN: %{run} %t.out
Expand Down
3 changes: 3 additions & 0 deletions sycl/test-e2e/syclcompat/atomic/atomic_comp_exchange.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,9 @@
// ===----------------------------------------------------------------------===//

// UNSUPPORTED: hip
// TODO: Remove unsupported after fixing
// https://github.com/intel/llvm/issues/12683
// UNSUPPORTED: accelerator

// RUN: %clangxx -std=c++20 -fsycl -fsycl-targets=%{sycl_triple} %s -o %t.out
Alcpz marked this conversation as resolved.
Show resolved Hide resolved
// RUN: %{run} %t.out
Expand Down
Loading