[WIP] CI: debug Clang thread sanitizer errors #5492

Open
wants to merge 5 commits into
base: development
Conversation

EZoni
Member

@EZoni EZoni commented Dec 3, 2024

Debug data race conditions raised by the Clang thread sanitizer CI job that was disabled in #5474.

@EZoni EZoni added the "component: tests" (Tests and CI) label Dec 3, 2024
@EZoni
Member Author

EZoni commented Dec 4, 2024

Just copying one piece of information from the Clang documentation:

ThreadSanitizer is in beta stage. It is known to work on large C++ programs using pthreads, but we do not promise anything (yet). C++11 threading is supported with llvm libc++. The test suite is integrated into CMake build and can be run with make check-tsan command.

@EZoni
Member Author

EZoni commented Dec 5, 2024

This is the summary of the data race reported by the sanitizer, which seems to point to the AMReX side (line 105 of Src/Base/AMReX_Random.cpp):

SUMMARY: ThreadSanitizer: data race /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp:105:27 in amrex::InitRandom(unsigned long, int, unsigned long) (.omp_outlined_debug__)

Full log preceding that summary message:

WARNING: ThreadSanitizer: data race (pid=9014)
  Read of size 8 at 0x7ffe53cae3e8 by thread T3:
    #0 amrex::InitRandom(unsigned long, int, unsigned long) (.omp_outlined_debug__) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp:105:27 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x82877a) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #1 amrex::InitRandom(unsigned long, int, unsigned long) (.omp_outlined) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp:101:1 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x82877a)
    #2 __kmp_invoke_microtask <null> (libomp.so.5+0xdddc2) (BuildId: 5bc060b5a52d935eaa6f063c7bb0d9bd35e11f99)

  Previous write of size 8 at 0x7ffe53cae3e8 by main thread:
    #0 amrex::InitRandom(unsigned long, int, unsigned long) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x828641) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #1 amrex::Initialize(int&, char**&, bool, ompi_communicator_t*, std::function<void ()> const&, std::ostream&, std::ostream&, void (*)(char const*)) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX.cpp:647:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x7eebe9) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #2 warpx::initialization::amrex_init(int&, char**&, bool) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXAMReXInit.cpp:116:16 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x67096e) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #3 warpx::initialization::initialize_external_libraries(int, char**) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXInit.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x2d696f) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #4 main /home/runner/work/WarpX/WarpX/Source/main.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x1396da) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #0 amrex::InitRandom(unsigned long, int, unsigned long) (.omp_outlined_debug__) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp:105:27 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x82877a) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #1 amrex::InitRandom(unsigned long, int, unsigned long) (.omp_outlined) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp:101:1 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x82877a)
    #2 __kmp_invoke_microtask <null> (libomp.so.5+0xdddc2) (BuildId: 5bc060b5a52d935eaa6f063c7bb0d9bd35e11f99)

  Previous write of size 8 at 0x7ffdff075bd8 by main thread:
    #0 amrex::InitRandom(unsigned long, int, unsigned long) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX_Random.cpp (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x828641) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #1 amrex::Initialize(int&, char**&, bool, ompi_communicator_t*, std::function<void ()> const&, std::ostream&, std::ostream&, void (*)(char const*)) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX.cpp:647:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x7eebe9) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #2 warpx::initialization::amrex_init(int&, char**&, bool) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXAMReXInit.cpp:116:16 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x67096e) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #3 warpx::initialization::initialize_external_libraries(int, char**) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXInit.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x2d696f) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #4 main /home/runner/work/WarpX/WarpX/Source/main.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x1396da) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)


  Location is stack of main thread.

  Location is global '??' at 0x7ffe53c90000 ([stack]+0x1e3e8)

  Thread T3 (tid=9024, running) created by main thread at:
  Location is stack of main thread.

  Location is global '??' at 0x7ffdff057000 ([stack]+0x1ebd8)

  Thread T3 (tid=9023, running) created by main thread at:
    #0 pthread_create <null> (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0xaf52f) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #0 pthread_create <null> (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0xaf52f) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #1 <null> <null> (libomp.so.5+0xb587a) (BuildId: 5bc060b5a52d935eaa6f063c7bb0d9bd35e11f99)
    #1 <null> <null> (libomp.so.5+0xb587a) (BuildId: 5bc060b5a52d935eaa6f063c7bb0d9bd35e11f99)
    #2 amrex::Initialize(int&, char**&, bool, ompi_communicator_t*, std::function<void ()> const&, std::ostream&, std::ostream&, void (*)(char const*)) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX.cpp:642:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x7eeba2) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #2 amrex::Initialize(int&, char**&, bool, ompi_communicator_t*, std::function<void ()> const&, std::ostream&, std::ostream&, void (*)(char const*)) /home/runner/work/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/AMReX.cpp:642:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x7eeba2) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #3 warpx::initialization::amrex_init(int&, char**&, bool) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXAMReXInit.cpp:116:16 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x67096e) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #3 warpx::initialization::amrex_init(int&, char**&, bool) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXAMReXInit.cpp:116:16 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x67096e) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #4 warpx::initialization::initialize_external_libraries(int, char**) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXInit.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x2d696f) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #4 warpx::initialization::initialize_external_libraries(int, char**) /home/runner/work/WarpX/WarpX/Source/Initialization/WarpXInit.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x2d696f) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
    #5 main /home/runner/work/WarpX/WarpX/Source/main.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x1396da) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)

    #5 main /home/runner/work/WarpX/WarpX/Source/main.cpp:20:5 (warpx.1d.MPI.OMP.DP.PDP.OPMD.FFT.QED.GENQEDTABLES+0x1396da) (BuildId: 559e42540025a3c6ccba75684192ae77b83a3c08)
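
Reading the log above: the flagged read happens inside the OpenMP outlined region of `InitRandom`, the earlier write is by the main thread in the same function, and the address is on the main thread's stack. A minimal sketch of a pattern that could produce this kind of report is below. It assumes, without having checked the AMReX source at that line, that a function argument is read inside a `#pragma omp parallel` block; `init_generators` and its parameters are hypothetical names, not AMReX's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for a per-thread RNG setup; NOT AMReX's actual code.
std::vector<std::mt19937_64> init_generators(std::uint64_t cpu_seed, int nprocs)
{
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();
#endif
    std::vector<std::mt19937_64> generators(static_cast<std::size_t>(nthreads));

    // cpu_seed and nprocs live on the calling thread's stack. They are
    // written before the parallel region starts and only read inside it.
#ifdef _OPENMP
#pragma omp parallel
#endif
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        // Each thread reads the shared seed and writes only its own slot.
        const std::uint64_t init_seed =
            cpu_seed + static_cast<std::uint64_t>(tid) * static_cast<std::uint64_t>(nprocs);
        generators[static_cast<std::size_t>(tid)].seed(init_seed);
    }
    return generators;
}
```

Under the OpenMP memory model, writes made before a parallel region is forked are visible to the threads of that region, so this pattern by itself should not be a race. If the real code looks like this, one possibility worth checking is that the sanitizer cannot see the synchronization performed inside an OpenMP runtime that was not built with ThreadSanitizer support. That is a hypothesis, not something established in this thread.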

@EZoni EZoni force-pushed the ci_clang_thread_sanitizer branch from c46c5fd to 93c6d8e Compare December 6, 2024 21:59
@EZoni
Member Author

EZoni commented Dec 6, 2024

@atmyers @WeiqunZhang

I tried running with clang-18 and clang-19 (the one used since the latest commit a49c934, after installing directly from LLVM), but I keep seeing the data race condition in both cases.
