[SYCL] Fix post-commit failure #13657

Closed
wants to merge 15 commits

Conversation

konradkusiak97
Contributor

No description provided.

@aelovikov-intel
Contributor

@konradkusiak97, why can't this be tested locally? Our CI has limited resources; we wouldn't want the entire development to be done in CI. Logs for post-commit CI tasks contain the options passed to buildbot/configure.py, which should give you enough information to reproduce this locally.

@konradkusiak97
Contributor Author

@aelovikov-intel, one of the previous failures only appeared in the post-commit CI and wasn't reproducible locally, which is why I was trying to get some insight into the problem here. But I understand your concern about the CI's limited resources. I'll close this PR and try to work on the fix locally.

@aelovikov-intel
Contributor

I don't mind using CI this way if the failure is specific to the HW we use in CI. Otherwise, I'd be happy to help you reproduce it locally or share a way to trigger post-commit only (without running pre-commit in addition to post-commit).

@konradkusiak97
Contributor Author

That sounds great, thanks. I'll get back to you if my efforts to reproduce this locally turn out to be fully unsuccessful and I need further help.

@konradkusiak97
Contributor Author

konradkusiak97 commented May 10, 2024

I didn't manage to reproduce this failure on the other Intel GPUs that I have access to. It seems to be specific to Iris(R) Xe Graphics, unless something else is going on.

For the latest post-commit run, I modified test-e2e/out_of_order_queue_status.cpp to use Q.memset() instead of Q.fill(), and the test fails again with the same error message as after merging #12702, which suggests there might be a bug in the current memset implementation.
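
As a rough illustration of that modification (a minimal sketch, not the actual test-e2e file; it assumes the test submits the operation to an out-of-order queue and then checks the queue's status via the DPC++ ext_oneapi_empty() extension):

#include <sycl/sycl.hpp>

int main() {
  sycl::queue Q; // out-of-order by default (no in_order property)

  constexpr size_t N = 1024;
  char *Data = sycl::malloc_device<char>(N, Q);

  // The original test used Q.fill(); the change described above swaps in Q.memset().
  sycl::event E = Q.memset(Data, 0, N);

  E.wait();
  // Assumption: the status check goes through the sycl_ext_oneapi_queue_empty extension.
  bool Empty = Q.ext_oneapi_empty();

  sycl::free(Data, Q);
  return Empty ? 0 : 1;
}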

I've noticed that, unlike the pre-commit run where the same test passes, DPC++ is built with clang/clang++ in post-commit, which might explain the difference. I tried to reproduce it on:

[level_zero:gpu] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 1.3 [1.3.29138]

with this configuration:

export CXX=clang++
export CC=clang

python ../llvm/buildbot/configure.py -t Release --ci-defaults --shared-libs --no-assertions \
 --cmake-opt="-DSYCL_ENABLE_STACK_PRINTING=ON" \
 --cmake-opt="-DSYCL_LIB_WITH_DEBUG_SYMBOL=ON" \
 --cmake-opt="-DLLVM_INSTALL_UTILS=ON" \
 --cmake-opt="-DNATIVECPU_USE_OCK=Off" \
 --cmake-opt="-DSYCL_PI_TESTS=OFF" \
-o ./

and running the test with:

clang++ -fsycl -fsycl-targets=spir64 out_of_order_queue_status_memset.cpp -o res
env ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./res

@aelovikov-intel, do you think it might be a bug specific to Iris(R) Xe Graphics? If so, I'll file a bug report.

@aelovikov-intel
Contributor

Not sure I understand this. Who/what component do you want to file that bug against?

@konradkusiak97
Contributor Author

The test that I added in this PR, out_of_order_queue_status_memset.cpp, uses Q.memset instead of Q.fill and produces the same failure as after merging #12702.

So the faulty component would be the current implementation of memset. The failure is specific to the Level Zero backend on Iris(R) Xe Graphics with DPC++ built with clang.
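
For context on why the memset path is the suspect: for a byte pattern, queue::memset and queue::fill should produce identical results, so a failure that shows up only with memset points at its implementation rather than at the test logic. A minimal sketch of that equivalence (illustrative only; buffer sizes and names are made up):

#include <sycl/sycl.hpp>
#include <cassert>

int main() {
  sycl::queue Q;
  constexpr size_t N = 256;
  char *A = sycl::malloc_shared<char>(N, Q);
  char *B = sycl::malloc_shared<char>(N, Q);

  Q.memset(A, 0, N).wait();                  // byte-wise zeroing
  Q.fill(B, static_cast<char>(0), N).wait(); // element-wise fill with the same byte pattern

  for (size_t I = 0; I < N; ++I)
    assert(A[I] == B[I]); // both paths should yield identical contents

  sycl::free(A, Q);
  sycl::free(B, Q);
  return 0;
}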
