Refactor AArch64 Interpolation Filter 16x16 implementation (#431)
* Move InterpolationFilter{ARM.h => _neon.cpp}

Since this header is only used in one place and would not share any code
with an eventual SVE implementation, simply move it to a .cpp file
similar to MCTF.cpp.

* Refactor simdFilter16xX_N8_neon

The use of the vsrcv temporary array rather than simple local variables
meant that LLVM emitted unnecessary load/store instructions in the inner
loops. Refactoring this to make the dependency between loop iterations
more explicit allows for much nicer generated code.
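
As a rough illustration of the pattern described above (not the actual
simdFilter16xX_N8_neon code), the sketch below contrasts a vertical 8-tap
filter that keeps its sliding window of rows in a temporary array with one
that keeps it in named local variables. Everything here is hypothetical:
Row is a scalar stand-in for the NEON registers holding one 16-sample row,
and the function names, signatures and lack of rounding/shift are
assumptions made only for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int kWidth = 16;  // 16 samples per row, as in a 16xX block
constexpr int kTaps  = 8;   // 8-tap (N8) filter

// Hypothetical stand-in for the vector registers holding one row of samples.
using Row = std::array<int32_t, kWidth>;

inline Row loadRow(const int16_t* p) {
    Row r{};
    for (int x = 0; x < kWidth; ++x) r[x] = p[x];
    return r;
}

// Array-based window: the rows live in a temporary array that is shifted
// on every iteration, so the loop-carried values are hidden behind
// indexed accesses to memory.
void filterVerArray(const int16_t* src, std::ptrdiff_t stride,
                    int32_t* dst, int height, const int16_t* coeff) {
    assert(height >= 0);
    Row window[kTaps];
    for (int k = 0; k < kTaps - 1; ++k) window[k] = loadRow(src + k * stride);
    for (int y = 0; y < height; ++y) {
        window[kTaps - 1] = loadRow(src + (y + kTaps - 1) * stride);
        for (int x = 0; x < kWidth; ++x) {
            int32_t sum = 0;
            for (int k = 0; k < kTaps; ++k) sum += coeff[k] * window[k][x];
            dst[y * kWidth + x] = sum;
        }
        for (int k = 0; k < kTaps - 1; ++k) window[k] = window[k + 1];
    }
}

// Local-variable window: each row has its own name, and the values carried
// into the next iteration are spelled out as plain assignments.
void filterVerLocals(const int16_t* src, std::ptrdiff_t stride,
                     int32_t* dst, int height, const int16_t* coeff) {
    assert(height >= 0);
    Row r0 = loadRow(src + 0 * stride);
    Row r1 = loadRow(src + 1 * stride);
    Row r2 = loadRow(src + 2 * stride);
    Row r3 = loadRow(src + 3 * stride);
    Row r4 = loadRow(src + 4 * stride);
    Row r5 = loadRow(src + 5 * stride);
    Row r6 = loadRow(src + 6 * stride);
    for (int y = 0; y < height; ++y) {
        const Row r7 = loadRow(src + (y + 7) * stride);  // only new load
        for (int x = 0; x < kWidth; ++x) {
            dst[y * kWidth + x] =
                coeff[0] * r0[x] + coeff[1] * r1[x] +
                coeff[2] * r2[x] + coeff[3] * r3[x] +
                coeff[4] * r4[x] + coeff[5] * r5[x] +
                coeff[6] * r6[x] + coeff[7] * r7[x];
        }
        // Explicit dependency between iterations: slide the window by one row.
        r0 = r1; r1 = r2; r2 = r3; r3 = r4; r4 = r5; r5 = r6; r6 = r7;
    }
}
```

Both functions compute the same result. Whether a given compiler emits
fewer loads and stores for the second form depends on the compiler
version and flags, so the generated assembly should be inspected rather
than assumed.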

Running a video encoding job on a Neoverse V2 machine using the
--preset=fast setting shows a ~1.8% improvement in reported FPS.
georges-arm authored Oct 18, 2024
1 parent 7acfaba commit 2f25b8a
Showing 2 changed files with 287 additions and 300 deletions.
299 changes: 0 additions & 299 deletions source/Lib/CommonLib/arm/InterpolationFilterARM.h

This file was deleted.
