rocFFT and hipFFT examples (part I) #141

Beanavil · 2024-07-16T09:56:26Z

This pull request contains the first batch of the new rocFFT and hipFFT examples. Added samples:

rocFFT
- callback
- multi_gpu

hipFFT
- plan
  - plan_d2z
  - plan_z2z

malcolmroberts · 2024-08-02T21:06:58Z

Libraries/hipFFT/plan_d2z/CMakeLists.txt

+        install(IMPORTED_RUNTIME_ARTIFACTS roc::rocfft)
+    elseif(GPU_RUNTIME STREQUAL "CUDA")
+        find_package(CUDAToolkit REQUIRED)
+        install(IMPORTED_RUNTIME_ARTIFACTS CUDA::cusolver)


Pretty sure you're looking for cufft here?

Beanavil · 2024-08-07

Yep, fixed this!

malcolmroberts · 2024-08-02T21:10:55Z

Is there any CI about this compiling and the examples actually running?

evetsso · 2024-08-02T23:36:44Z

What CI is there for these examples?

Beanavil · 2024-08-05T07:57:07Z

Hi @malcolmroberts @evetsso, AFAIK the only CI in place for the examples in GitHub is the one for linting (.github/workflows/linting.yml). Additionally, we (StreamHPC) have our own internal CI, where we build and test the examples. I think, but I'm not 100% sure, that there is also an internal CI on AMD's side; perhaps @dgaliffiAMD can provide more details.

dgaliffiAMD · 2024-08-05T11:49:37Z

Hi @malcolmroberts,

It is as @Beanavil says, the external CIs are just for linters. The basic GitHub runners couldn't complete the build; they ran out of disk space trying to install ROCm.
Internally, changes are in code review to add the rocm-examples to the build and test pipelines.
There are efforts to have the entire ROCm stack built through an Azure pipeline. This CI will be available externally and when it is ready rocm-examples will be included.

Thanks,
David

evetsso · 2024-08-06T15:21:21Z

Libraries/hipFFT/plan_z2z/CMakeLists.txt

+        install(IMPORTED_RUNTIME_ARTIFACTS roc::rocfft)
+    elseif(GPU_RUNTIME STREQUAL "CUDA")
+        find_package(CUDAToolkit REQUIRED)
+        install(IMPORTED_RUNTIME_ARTIFACTS CUDA::cusolver)


Same thing that Malcolm pointed out - this should be cufft.

evetsso · 2024-08-06T15:43:02Z

Libraries/rocFFT/callback/main.cpp

+    // Prepare callback
+    load_callback_data callback_data_host;
+    callback_data_host.filter = callback_filter_dev;
+    callback_data_host.scale  = 1.0 / static_cast<double>(N);
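
For context on the `1.0 / N` factor in the snippet above: rocFFT and hipFFT transforms are unnormalized, so a forward transform followed by an inverse returns the input multiplied by N, and `1/N` is the usual factor that undoes this. The sketch below is a host-only illustration using a naive O(N^2) DFT. It does not call rocFFT, and `naive_dft` and `round_trip` are hypothetical helper names that do not appear in this PR.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(N^2) unnormalized DFT (illustration only, not rocFFT).
// sign = -1 is the forward transform, sign = +1 the inverse.
// Neither direction applies a 1/N factor.
std::vector<cplx> naive_dft(const std::vector<cplx>& in, int sign)
{
    assert(sign == 1 || sign == -1);
    const std::size_t n      = in.size();
    const double      two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(n);
    for(std::size_t k = 0; k < n; ++k)
    {
        cplx sum(0.0, 0.0);
        for(std::size_t j = 0; j < n; ++j)
        {
            const double angle
                = sign * two_pi * static_cast<double>((j * k) % n) / static_cast<double>(n);
            sum += in[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Forward transform, inverse transform, then multiply every element by `scale`.
std::vector<cplx> round_trip(const std::vector<cplx>& in, double scale)
{
    std::vector<cplx> out = naive_dft(naive_dft(in, -1), +1);
    for(cplx& v : out)
    {
        v *= scale;
    }
    return out;
}
```

Calling `round_trip(x, 1.0)` on a length-N vector yields `N * x` up to rounding, while `round_trip(x, 1.0 / N)` recovers `x`.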


rocFFT has an explicit API to perform result scaling: https://rocm.docs.amd.com/projects/rocFFT/en/latest/how-to/working-with-rocfft.html#result-scaling

hipFFT also exposes an API to do this.

This API is expected to perform better than callbacks, though I realize that the rocFFT repository does not currently have an example to demonstrate its usage. The explicit API is quicker because the compiler is able to understand and optimize the extra scaling multiplication - with callback functions the runtime only receives an opaque function pointer and the compiler cannot optimize the code as well.
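
The performance argument above can be pictured with a host-side analogy. This is only a sketch under stated assumptions: it is plain C++ rather than device code, and `element_callback`, `scale_hook`, `write_with_hook` and `write_with_scale` are hypothetical names, not rocFFT API. In the first path the loop only sees a function pointer; in the second the multiplication is part of the loop body, where the compiler can inline and vectorize it. In a single small translation unit like this one, an optimizing compiler may still see through the pointer, so the sketch shows the structural difference rather than a measured speedup.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-element hook that the "library" loop only receives as a pointer at run time.
using element_callback = double (*)(double value, const void* user_data);

// Hypothetical user hook: multiply by the factor stored behind user_data.
double scale_hook(double value, const void* user_data)
{
    return value * *static_cast<const double*>(user_data);
}

// Callback path: one indirect call per element through an opaque pointer.
void write_with_hook(std::vector<double>& data, element_callback hook, const void* user_data)
{
    assert(hook != nullptr);
    for(double& v : data)
    {
        v = hook(v, user_data);
    }
}

// Explicit path: the scaling is a plain multiplication inside the loop.
void write_with_scale(std::vector<double>& data, double scale)
{
    for(double& v : data)
    {
        v *= scale;
    }
}
```

Both functions produce the same values; only the way the scaling reaches the loop differs.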

AFAICT result scaling is the most common use case for callback functions in the API. I think it would make sense to have an example demonstrating use of the explicit result scaling API, and then this callback example can be repurposed for a different operation.

I don't know which operation would be best though - aside from result scaling, I'm unaware of a commonly-used operation that people would want to do in a callback function.

Beanavil · 2024-08-07

Oh I see, we hadn't noticed this functionality before. Perhaps, given that the callback applies a filter and a scaling factor, we can keep the filtering and just use the scaling-specific API you mentioned for the scaling part. I think that would still make sense as a use case for callbacks, and then we also get to showcase roc/hipFFT's scaling API.

evetsso · 2024-08-07

I guess that's reasonable, since rocFFT has no API for filtering currently. The only issue is that we wouldn't really see the performance benefit of using the result scaling API, since callbacks are still in use. We hope to have a better way to express filtering (and other spectral operations) eventually, but it'll take a while to get there.

Beanavil · 2024-08-09

Forgot to answer here, the example should be updated already!
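
The split discussed in this thread (filtering stays in the load callback, scaling moves to the result-scaling API) does not change the numerical result, because the transform is linear: scaling each input element by `s` gives the same output as scaling each output element by `s`. The following is a minimal host-only sketch of that equivalence. It assumes a naive DFT in place of rocFFT, and `load_element` and `dft_with_callback` are hypothetical stand-ins; the struct only mirrors the field names visible in the diff, with the type of `filter` assumed to be `const double*`.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical callback payload modelled on the fields in the diff above.
struct load_callback_data
{
    const double* filter; // per-element filter weights (type assumed)
    double        scale;  // 1.0 when scaling is handled outside the callback
};

// Host-side stand-in for a load callback: filter and scale one input element.
cplx load_element(const cplx* input, std::size_t offset, const load_callback_data& cb)
{
    assert(cb.filter != nullptr);
    return input[offset] * (cb.filter[offset] * cb.scale);
}

// Naive unnormalized forward DFT that reads its input through load_element and
// multiplies every output by result_scale (a stand-in for explicit result scaling).
std::vector<cplx> dft_with_callback(const std::vector<cplx>&  in,
                                    const load_callback_data& cb,
                                    double                    result_scale)
{
    const std::size_t n      = in.size();
    const double      two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(n);
    for(std::size_t k = 0; k < n; ++k)
    {
        cplx sum(0.0, 0.0);
        for(std::size_t j = 0; j < n; ++j)
        {
            const double angle
                = -two_pi * static_cast<double>((j * k) % n) / static_cast<double>(n);
            sum += load_element(in.data(), j, cb) * std::polar(1.0, angle);
        }
        out[k] = sum * result_scale;
    }
    return out;
}
```

With the same filter, `dft_with_callback(x, {filter, s}, 1.0)` and `dft_with_callback(x, {filter, 1.0}, s)` agree up to rounding, which is why the scale can be taken out of the callback without changing what the example computes.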

evetsso

Looks OK to me.

One general question I have about these samples - sometimes, the main source code file is called main.hip, and sometimes it's main.cpp. Is there any system behind this choice?

It seems to me that a source file that contains HIP code (i.e. __global__ or __device__) can be justified to be named .hip. Otherwise, .cpp makes more sense. But it doesn't really matter in the end since we're using set_source_file_properties in all the CMakeLists.txt anyway to override the language.

Out of all these examples, really only the callback one contains any HIP code.

Snektron · 2024-08-12T07:43:25Z

"It seems to me that a source file that contains HIP code (i.e. __global__ or __device__) can be justified to be named .hip. Otherwise, .cpp makes more sense."

If it has to be compiled as HIP code, because it contains device code, then the extension is .hip. If it can be compiled using a regular C++ compiler (at least in theory I guess, since you pointed out the language overriding thing), then the file extension is .cpp. I'm fairly sure that this is documented somewhere, but I couldn't find it now...

dgaliffiAMD · 2024-08-12T15:15:11Z

Hi @evetsso and @Snektron, it looks like everything is approved, but I still see a couple of unresolved conversations. I just wanted to double check that this PR is okay to merge. Thank you.

Beanavil · 2024-08-22T13:12:05Z

@evetsso sorry for the late answer, I was off these past few days. I hadn't noticed the extensions mixup, it should be fixed in the latest commits.

BTW: looks like the Azure pipeline is failing, but the logs show that the errors are caused by CMake not finding some of the HIP libraries, which I guess is an issue with the pipeline setup (?). On our end the build and test are successful, so I think there should be no problem merging this.

danielsu-amd · 2024-08-22T14:18:36Z

Hi @Beanavil, yes the Azure pipeline is missing some ROCm components that are required by this PR, I will be adding those into the CI.

You can view a successful build (with all dependencies included) of this PR here: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=6032&view=results

Beanavil requested a review from malcolmroberts July 16, 2024 09:56

Beanavil requested review from a team and dgaliffiAMD as code owners July 16, 2024 09:56

Beanavil force-pushed the fft-callback-plan-multigpu branch 2 times, most recently from 2be7d9c to 74845d4 Compare July 16, 2024 10:06

Beanavil force-pushed the fft-callback-plan-multigpu branch from 74845d4 to 86de806 Compare July 30, 2024 07:41

NB4444 and others added 7 commits July 30, 2024 07:48

Resolve "rocFFT callback Example"

5c5eebc

feat: add hipFFT plan examples

d8e58ea

Resolve "rocFFT multi_gpu Example"

fab8f00

Fixed CMake linting

a224fa0

Added rocFFT callback and multi_gpu VS files

73ddf8a

Resolve "Generate VS files from external meta-data repository"

d4bebeb

Fixed Markdown linting

9fb4ec8

Beanavil force-pushed the fft-callback-plan-multigpu branch from 86de806 to 9fb4ec8 Compare July 30, 2024 07:48

malcolmroberts reviewed Aug 2, 2024

malcolmroberts requested review from af-ayala, Kevonosdiaz, eng-flavio-teixeira and evetsso August 2, 2024 21:16

evetsso reviewed Aug 6, 2024

Fixed installed target in hipFFT plan examples

ace561a

malcolmroberts requested a review from amd-jmacaran August 7, 2024 16:53

Added explicit result scaling

f49412e

evetsso self-requested a review August 9, 2024 15:23

evetsso approved these changes Aug 9, 2024

Beanavil added 2 commits August 21, 2024 07:37

Renamed hipFFT's main programs to use C++ extension

ba9d7a7

Renamed rocFFT/callback's main program to use HIP extension

980bc73

danielsu-amd mentioned this pull request Aug 22, 2024

External CI: add hipBLASLt + roc/hipFFT to rocm-examples ROCm/ROCm#3634

Merged

dgaliffiAMD merged commit b6e6ecc into ROCm:develop Aug 26, 2024
5 of 8 checks passed
