[Feature]: Better preprocessor macros to detect RDNA/CDNA family at compile time #59

Open
benrichard-amd opened this issue Apr 8, 2024 · 1 comment
benrichard-amd commented Apr 8, 2024

Suggestion Description

As new instructions/features are added with each new arch, it is useful to know the target architecture at compile time to employ separate code paths. For example: FP64 MFMA was added in CDNA2, so CDNA2 and later can use one code path while CDNA1 uses a different code path.

It gets tedious because all the archs need to be enumerated, and code needs to be updated as new archs become available:

#if __gfx940__ || __gfx941__ || __gfx942__
// Code path for CDNA3
#elif __gfx90a__
// Code path for CDNA2
#elif __gfx908__
// Code path for CDNA1
#endif

It would be nice if we had something like:

#if CDNA_VERSION >= 3
// Code path for CDNA3 and later
#elif CDNA_VERSION >= 2
// Code path for CDNA2
#else
// Code path for CDNA1
#endif

This would mirror the way it is done in CUDA:

__device__ void func()
{
#if __CUDA_ARCH__ >= 800
    // Code path for compute capability 8.x and later
#elif __CUDA_ARCH__ >= 700
    // Code path for compute capability 7.x
#else
    // Code path for compute capability < 7.0
#endif
}
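
Until the compiler provides such a macro, the per-arch enumeration can at least be confined to a single project header. The following is a minimal sketch of that workaround, not an existing ROCm facility: `MYPROJ_CDNA_VERSION` and `cdna_code_path()` are hypothetical names, and the arch-to-generation mapping is the one from the first snippet (gfx908 = CDNA1, gfx90a = CDNA2, gfx940/941/942 = CDNA3). The header still has to be updated by hand when a new arch appears, but only in one place.

```cpp
// Hypothetical project-local header: derive a CDNA generation number
// from the per-arch macros the compiler already defines.
#if defined(__gfx940__) || defined(__gfx941__) || defined(__gfx942__)
#define MYPROJ_CDNA_VERSION 3
#elif defined(__gfx90a__)
#define MYPROJ_CDNA_VERSION 2
#elif defined(__gfx908__)
#define MYPROJ_CDNA_VERSION 1
#else
// Host compilation pass, or a target this header does not know about.
#define MYPROJ_CDNA_VERSION 0
#endif

// Example consumer: report which code path was selected at compile time.
constexpr int cdna_code_path() {
#if MYPROJ_CDNA_VERSION >= 3
    return 3;  // Code path for CDNA3 and later
#elif MYPROJ_CDNA_VERSION >= 2
    return 2;  // Code path for CDNA2
#elif MYPROJ_CDNA_VERSION >= 1
    return 1;  // Code path for CDNA1
#else
    return 0;  // Generic fallback (no CDNA target detected)
#endif
}
```

Mapping unknown targets to 0 rather than to the oldest generation keeps a host pass or a non-CDNA GPU from silently taking a CDNA-specific path.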

Operating System

No response

GPU

No response

ROCm Component

No response

@mangupta mangupta transferred this issue from ROCm/HIP Apr 9, 2024
yxsamliu added a commit to yxsamliu/llvm-project that referenced this issue Apr 10, 2024
If a processor belongs to CDNA generation, pre-define macro
`__AMDGCN_CDNA_VERSION__` as an integer.

Fixes: ROCm#59
@yxsamliu yxsamliu self-assigned this Apr 10, 2024
@yxsamliu

There are some concerns about introducing a macro for CDNA version.

Using #if __has_builtin(...) may be a better way to determine whether a feature is available (https://clang.llvm.org/docs/LanguageExtensions.html#feature-checking-macros). It tests for the feature itself rather than the architecture, so it works for all GPUs, including future generations.
