Reducing compile time of JAX HEALPix (I)FFT implementations #171

matt-graham · 2023-12-01T10:36:24Z

Related to #140. This doesn't completely remove the loops in the JAX HEALPix FFT and IFFT implementations, but it does reduce the number of unrolled operations and therefore the compile time. Unfortunately, the optimizations make the code a bit less readable and less directly tied to the NumPy implementations.

I've tested this locally against the tests added in #170, which pass. We would probably want to merge #170 first, so I'm marking this as a draft until it is merged and we can rebase on top of it.

Compared to the previous implementations, this tries to vectorize operations as much as possible by processing together the data for $\theta$ rings of the same size: all of the equatorial rings, and the pairs of polar rings of equal size.

The big gain is in vectorizing the operations on all the equally sized equatorial rings together, as this replaces around 2 * nside unrolled loop iterations with one set of vectorized operations. Processing the pairs of polar rings together gives a smaller but still helpful reduction in compile time.
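As a rough illustration of the idea (not the code in this PR), the sketch below applies a plain FFT to each ring of a HEALPix map in ring ordering, first with one FFT per ring and then with the equatorial rings batched into a single call and the polar rings processed in north/south pairs. The function names are hypothetical, and the sketch assumes the standard HEALPix ring sizes and ignores the phase shifts and other per-ring details of the real transforms.

```python
import jax
import jax.numpy as jnp


def ring_sizes(nside):
    """Pixels per iso-latitude ring, north to south (standard HEALPix layout)."""
    polar = [4 * (t + 1) for t in range(nside - 1)]
    equatorial = [4 * nside] * (2 * nside + 1)
    return polar + equatorial + polar[::-1]


def ring_ffts_looped(f, nside):
    """One FFT per ring: 4 * nside - 1 FFT calls unrolled when traced."""
    out, start = [], 0
    for n in ring_sizes(nside):
        out.append(jnp.fft.fft(f[start:start + n]))
        start += n
    return jnp.concatenate(out)


def ring_ffts_batched(f, nside):
    """Same output using one batched FFT for the equatorial rings and
    one FFT call per pair of equally sized polar rings."""
    n_cap = 2 * nside * (nside - 1)      # pixels in one polar cap
    n_eq_rings = 2 * nside + 1           # equatorial rings, 4 * nside pixels each
    eq_end = n_cap + n_eq_rings * 4 * nside
    north, equatorial, south = f[:n_cap], f[n_cap:eq_end], f[eq_end:]

    # All equatorial rings have the same size, so they reshape into a 2D array.
    eq_fft = jnp.fft.fft(equatorial.reshape(n_eq_rings, 4 * nside), axis=-1)

    north_out, south_out = [], []
    for t in range(nside - 1):
        n = 4 * (t + 1)
        n_start = 2 * t * (t + 1)                  # offset of ring t in the north cap
        s_start = n_cap - 2 * (t + 1) * (t + 2)    # matching ring in the south cap
        pair = jnp.stack([north[n_start:n_start + n], south[s_start:s_start + n]])
        pair_fft = jnp.fft.fft(pair, axis=-1)
        north_out.append(pair_fft[0])
        south_out.append(pair_fft[1])

    # South cap rings are stored in decreasing size order, so reverse them.
    return jnp.concatenate(north_out + [eq_fft.reshape(-1)] + south_out[::-1])


nside = 4
f = jax.random.normal(jax.random.PRNGKey(0), (12 * nside**2,))
looped = ring_ffts_looped(f, nside)
batched = ring_ffts_batched(f, nside)
```

In this sketch the looped version issues 4 * nside - 1 FFT calls while the batched version issues nside, but a Python loop over the polar ring sizes remains, which is why the overall scaling with nside is unchanged.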

codecov · 2023-12-01T10:45:56Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (4ef9c67) 91.63% compared to head (84fbf0b) 91.64%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #171   +/-   ##
=======================================
  Coverage   91.63%   91.64%           
=======================================
  Files          22       22           
  Lines        2510     2512    +2     
=======================================
+ Hits         2300     2302    +2     
  Misses        210      210


CosmoMatt

Great, @matt-graham, this should help for now. As you say, this approach still has the same scaling due to the outer loops, but it should cut compile time by at least a factor of 2, likely a little more.
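One way to check a compile-time estimate like this is to time JAX's ahead-of-time lowering and compilation directly. The sketch below is a minimal example under stated assumptions: `compile_seconds`, `looped` and `batched` are hypothetical helpers rather than functions from this repository, and the toy input stands in for the equatorial rings at nside = 32. Timings depend heavily on the backend and JAX version, so no particular ratio should be expected.

```python
import time

import jax
import jax.numpy as jnp


def compile_seconds(fn, *args):
    """Wall-clock time to trace, lower and compile fn for the given arguments."""
    start = time.perf_counter()
    jax.jit(fn).lower(*args).compile()
    return time.perf_counter() - start


def looped(rows):
    # One FFT per row; the Python loop is unrolled during tracing.
    return jnp.stack([jnp.fft.fft(rows[i]) for i in range(rows.shape[0])])


def batched(rows):
    # A single batched FFT over all rows.
    return jnp.fft.fft(rows, axis=-1)


# Toy stand-in for the 2 * nside + 1 equatorial rings at nside = 32.
rows = jnp.ones((65, 128))
t_looped = compile_seconds(looped, rows)
t_batched = compile_seconds(batched, rows)
print(f"looped: {t_looped:.3f} s, batched: {t_batched:.3f} s")
```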

Reducing compile time of JAX HEALPix (I)FFT implementations

84fbf0b

matt-graham force-pushed the mmg/healpix-fft-compile-optimizations branch from 9f43f3b to 84fbf0b Compare December 4, 2023 14:45

matt-graham marked this pull request as ready for review December 4, 2023 14:45

matt-graham requested a review from CosmoMatt December 4, 2023 14:45

CosmoMatt approved these changes Dec 4, 2023


jasonmcewen approved these changes Dec 4, 2023


matt-graham merged commit aff7f27 into main Dec 4, 2023
3 checks passed

matt-graham deleted the mmg/healpix-fft-compile-optimizations branch December 4, 2023 16:56
