Refactor AArch64 Interpolation Filter 16x16 implementation (#431)
* Move InterpolationFilter{ARM.h => _neon.cpp}

  Since this header is only used in one place and would not share any code with an eventual SVE implementation, simply move it to a .cpp file similar to MCTF.cpp.

* Refactor simdFilter16xX_N8_neon

  The use of the vsrcv temporary array rather than simple local variables meant that LLVM emitted an unnecessary number of load/store instructions in the inner loops. Refactoring this to make the dependency between loop iterations more explicit allows for much nicer generated code.

  Running a video encoding job on a Neoverse V2 machine using the --preset=fast setting shows a ~1.8% improvement in reported FPS.
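The array-versus-locals change described above can be illustrated with a small portable sketch. This is not the code from the commit: the function names `filterWithArray` and `filterWithLocals`, the `Row` type, and the coefficients are hypothetical, and a plain `int32_t` stands in for a NEON vector register so the example builds on any platform. It shows an 8-tap vertical filter written once with a temporary window array and once with named locals that are rotated at the end of each iteration, which is one way to make the loop-carried dependency explicit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a vector register (e.g. one row of 16-bit samples).
using Row = int32_t;
using Coeffs = std::array<int32_t, 8>;

// Variant 1: the sliding window lives in a temporary array that is shifted
// down by one element after every output row.
std::vector<int32_t> filterWithArray(const std::vector<Row>& src, const Coeffs& c) {
    std::vector<int32_t> out;
    if (src.size() < 8) return out;
    Row window[8];
    for (int i = 0; i < 7; ++i) window[i] = src[i];
    for (std::size_t y = 7; y < src.size(); ++y) {
        window[7] = src[y];  // newest row
        int32_t sum = 0;
        for (int i = 0; i < 8; ++i) sum += c[i] * window[i];
        out.push_back(sum);
        for (int i = 0; i < 7; ++i) window[i] = window[i + 1];  // shift window
    }
    return out;
}

// Variant 2: the same window held in named locals. Only one new row is
// loaded per iteration; the other seven are carried over from the previous
// iteration by rotating the locals.
std::vector<int32_t> filterWithLocals(const std::vector<Row>& src, const Coeffs& c) {
    std::vector<int32_t> out;
    if (src.size() < 8) return out;
    Row s0 = src[0], s1 = src[1], s2 = src[2], s3 = src[3];
    Row s4 = src[4], s5 = src[5], s6 = src[6];
    for (std::size_t y = 7; y < src.size(); ++y) {
        const Row s7 = src[y];  // the only load of source data in the loop
        out.push_back(c[0] * s0 + c[1] * s1 + c[2] * s2 + c[3] * s3 +
                      c[4] * s4 + c[5] * s5 + c[6] * s6 + c[7] * s7);
        // Rotate: each local takes the value of its successor.
        s0 = s1; s1 = s2; s2 = s3; s3 = s4; s4 = s5; s5 = s6; s6 = s7;
    }
    return out;
}
```

Both variants compute the same result. Whether a given compiler keeps the array version entirely in registers depends on the compiler and the surrounding code, so any difference in generated code should be checked by inspecting the assembly rather than assumed.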