[Feature]: Better preprocessor macros to detect RDNA/CDNA family at compile time #59

benrichard-amd · 2024-04-08T20:50:09Z

Suggestion Description

As new instructions/features are added with each new arch, it is useful to know the target architecture at compile time to employ separate code paths. For example: FP64 MFMA was added in CDNA2, so CDNA2 and later can use one code path while CDNA1 uses a different code path.

It gets tedious because all the archs need to be enumerated, and code needs to be updated as new archs become available:

#if __gfx940__ || __gfx941__ || __gfx942__
// Code path for CDNA3
#elif __gfx90a__
// Code path for CDNA2
#elif __gfx908__
// Code path for CDNA1
#endif

It would be nice if we had something like:

#if CDNA_VERSION >= 3
// Code path for CDNA3 and later
#elif CDNA_VERSION >= 2
// Code path for CDNA2
#else
// Code path for CDNA1
#endif

This would mirror the way it is done in CUDA:

__device__ void func()
{
#if __CUDA_ARCH__ >= 800
  // Code path for compute capability 8.x and later
#elif __CUDA_ARCH__ >= 700
  // Code path for compute capability 7.x
#else
  // Code path for compute capability < 7.0
#endif
}
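Until a compiler-provided macro exists, the enumeration shown earlier can at least be confined to one project-local header, so that the rest of the code compares a single integer. The following is a minimal sketch of that idea, not an existing ROCm or Clang facility: `PROJECT_CDNA_VERSION` and `selected_code_path` are hypothetical names, and the mapping only covers the architectures listed in this issue.

```cpp
// Hypothetical project-local shim. PROJECT_CDNA_VERSION is an illustrative
// name, not a macro the compiler defines. The gfx-to-CDNA mapping follows the
// architectures listed in this issue.
#if defined(__gfx940__) || defined(__gfx941__) || defined(__gfx942__)
#define PROJECT_CDNA_VERSION 3
#elif defined(__gfx90a__)
#define PROJECT_CDNA_VERSION 2
#elif defined(__gfx908__)
#define PROJECT_CDNA_VERSION 1
#else
#define PROJECT_CDNA_VERSION 0  // host pass, or a target not listed above
#endif

// Returns which code path was compiled in; 0 means the generic fallback.
constexpr int selected_code_path() {
#if PROJECT_CDNA_VERSION >= 3
  return 3;  // code path for CDNA3 and later (as far as the shim knows)
#elif PROJECT_CDNA_VERSION >= 2
  return 2;  // code path for CDNA2
#elif PROJECT_CDNA_VERSION >= 1
  return 1;  // code path for CDNA1
#else
  return 0;  // generic fallback
#endif
}
```

This only centralizes the maintenance burden. The header still has to be updated by hand whenever a new architecture appears, which is the problem this request asks the compiler to solve.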

Operating System

No response

GPU

No response

ROCm Component

No response

yxsamliu · 2024-04-10T17:41:47Z

There are some concerns about introducing a macro for the CDNA version.

Using `#if __has_builtin(...)` may be a better way to determine whether a feature is available (https://clang.llvm.org/docs/LanguageExtensions.html#feature-checking-macros). It works for all GPUs, including future generations.
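As an illustration of the `__has_builtin` approach, the FP64 MFMA case from the original request could be probed by feature rather than by architecture. This is a rough sketch under assumptions: the builtin name `__builtin_amdgcn_mfma_f64_16x16x4f64` is used here as the FP64 MFMA probe and should be checked against the Clang version in use, whether `__has_builtin` reflects the selected offload target may also depend on the compiler version, and `Fp64Path` and `select_fp64_path` are hypothetical names.

```cpp
// Compilers without __has_builtin: treat every builtin as unavailable.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical labels for the two code paths.
enum class Fp64Path { Mfma, Generic };

// Picks a code path by probing for the builtin instead of enumerating
// architectures. The builtin name is an assumption; verify it for your compiler.
constexpr Fp64Path select_fp64_path() {
#if __has_builtin(__builtin_amdgcn_mfma_f64_16x16x4f64)
  return Fp64Path::Mfma;     // the FP64 MFMA builtin is reported as available
#else
  return Fp64Path::Generic;  // host pass, or a target without the builtin
#endif
}
```

The check needs no update when a new architecture ships, as long as that architecture keeps the same builtin. The trade-off is that each feature needs its own probe, whereas a version macro covers many features with one comparison.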

mangupta transferred this issue from ROCm/HIP Apr 9, 2024

yxsamliu added a commit to yxsamliu/llvm-project that referenced this issue Apr 10, 2024

[AMDGPU] add macro __AMDGCN_CDNA_VERSION__

9016641

If a processor belongs to CDNA generation, pre-define macro `__AMDGCN_CDNA_VERSION__` as an integer. Fixes: ROCm#59

yxsamliu mentioned this issue Apr 10, 2024

[AMDGPU] add macro __AMDGCN_CDNA_VERSION__ llvm/llvm-project#88293

Closed

yxsamliu self-assigned this Apr 10, 2024
