ch4/shm: fix performance degradation on Sapphire Rapids with Intel Compiler #7150

Open
wants to merge 2 commits into base: main

Commits on Sep 24, 2024

  1. ch4/posix: making topology aware SHM default to enabled

    Fix the performance degradation on Intel Sapphire Rapids that
    appeared after topo-aware SHM was introduced. The problem only
    occurs when building with the Intel compiler. The cause was that
    topo-aware SHM defaulted to disabled, so regular memcpy was used
    for inter-NUMA messages, unlike v4.2.2, which uses non-temporal copy.
    
    It was disabled by default because non-temporal copy results in
    higher latency for small messages. After more testing on different
    CPUs (broadwell, skylake, cascade, icelake, milan), it seems only
    skylake, cascade and icelake have this issue for small messages.
    It is probably OK to make topo-aware SHM default to enabled.
    yfguo committed Sep 24, 2024
    66480d3
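To illustrate the distinction the first commit message draws between regular memcpy and non-temporal copy, here is a minimal sketch in C. It is not the MPICH or MPL implementation: the function name `nt_copy_sketch` is hypothetical, the choice of 128-bit SSE2 streaming stores is an assumption made for brevity, and the code falls back to plain memcpy when SSE2 is not enabled at compile time. Non-temporal stores write around the cache, which is one plausible reason they can help large inter-NUMA copies while costing latency on small messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical helper, not MPICH code: copy n bytes from src to dst,
 * using non-temporal (streaming) stores for the bulk of the data. */
static void nt_copy_sketch(void *dst, const void *src, size_t n)
{
    char *d = (char *) dst;
    const char *s = (const char *) src;

    assert(n == 0 || (dst != NULL && src != NULL));

#if defined(__SSE2__)
    /* _mm_stream_si128 requires a 16-byte aligned destination, so copy
     * the unaligned head with regular memcpy first. */
    size_t head = (16 - ((uintptr_t) d & 15)) & 15;
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Body: unaligned loads from the source, streaming stores to the
     * destination, 16 bytes at a time. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *) s);
        _mm_stream_si128((__m128i *) d, v);
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Streaming stores are weakly ordered; fence before any later
     * store that signals "data ready" to a reader. */
    _mm_sfence();
#endif

    /* Tail (or the whole buffer when SSE2 is unavailable). */
    memcpy(d, s, n);
}
```

A shared-memory transport would typically pick between a routine like this and memcpy based on message size and on whether sender and receiver sit on different NUMA nodes; the exact policy used by topo-aware SHM is not shown in the commit message.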
  2. configure: re-enable SSE2 and AVX optimization options for MPICH

    PR#7074 consolidated the SSE2 and AVX related optimization
    options into MPL's configure because only MPL explicitly uses them.
    That change showed no performance degradation with the GNU compiler,
    but with Intel compilers it does result in some performance
    degradation. Therefore, we should add them back to the main
    configure. Currently, the main configure checks for availability
    of SSE2, AVX and AVX512F, and adds them to CFLAGS. The MPL configure
    will further check for the specific instructions that are used in MPL.
    yfguo committed Sep 24, 2024
    e0710c7
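The second commit message describes adding SSE2, AVX and AVX512F flags to CFLAGS for the whole build, not just for MPL. A small sketch of how that becomes visible to C code: GCC, Clang and the Intel compilers predefine `__SSE2__`, `__AVX__` and `__AVX512F__` when the corresponding instruction sets are enabled for a translation unit. The helper name `isa_enabled_at_compile_time` is hypothetical and is not part of MPICH; it only reports the widest of the three extensions the compiler was allowed to use.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not MPICH code: report the widest of the three
 * extensions named in the commit message that the compiler flags
 * enabled for this translation unit. */
static const char *isa_enabled_at_compile_time(void)
{
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "none";
#endif
}
```

Compiling the same file with and without a flag such as `-mavx` changes the returned string, which is a quick way to confirm that an option added by configure actually reached a given source file. It also hints at why the placement of the flags matters: a compiler may auto-vectorize ordinary code outside MPL only when these options are present in that code's CFLAGS.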