Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework setSIMDExtension to compile with partial SIMD enabled #455

Merged

Conversation

georges-arm
Copy link
Contributor

@georges-arm georges-arm commented Oct 28, 2024

There are existing options to enable both Arm native and SIMDe-based intrinsics implementations, however if only one of these is enabled then compilation for Arm targets currently fails.

Fix this by adjusting the existing #ifdef guards and adding new ones to cover the previously failing option combinations. Also stop checking REAL_TARGET_ARM since it does not distinguish between which SIMD targets are enabled.

Additionally the existing Neon code in RdCostARM.h depends on being able to call into the existing SIMDe-based x86 kernels. If VVENC_ENABLE_X86_SIMD is disabled in the CMake then these kernels are unavailable and compilation
fails. Until native Neon kernels are added for the missing functions, simply disable the optimised RdCost Neon code unless the SIMDe kernels are also available.

@K-os
Copy link
Contributor

K-os commented Oct 28, 2024

Thank you for attempting to clean up the compilation mess.

I just looked into it and realized that the ARM-SIMD code cannot work without having TARGET_SIMD_X86 enabled, because it explicitly calls some X86-SIMD implementations (xCalcHAD*_SSE for non-square block sizes). This is something that came in after the first NEON implementations, and I did not know this when I said it was supposed to work in my comment on #453. I checked back with colleagues and we can define that TARGET_SIMD_X86 is a requirement for TARGET_SIMD_ARM.

So, even with your changes applied, the code does not compile when TARGET_SIMD_ARM is enabled, but TARGET_SIMD_X86 is not. Maybe we should explicitly catch that in the configure step.

@georges-arm
Copy link
Contributor Author

Thank you for attempting to clean up the compilation mess.

I just looked into it and realized that the ARM-SIMD code cannot work without having TARGET_SIMD_X86 enabled, because it explicitly calls some X86-SIMD implementations (xCalcHAD*_SSE for non-square block sizes). This is something that came in after the first NEON implementations, and I did not know this when I said it was supposed to work in my comment on #453. I checked back with colleagues and we can define that TARGET_SIMD_X86 is a requirement for TARGET_SIMD_ARM.

So, even with your changes applied, the code does not compile when TARGET_SIMD_ARM is enabled, but TARGET_SIMD_X86 is not. Maybe we should explicitly catch that in the configure step.

Thanks for pointing out the problems with the xCalcHAD functions, I now realise that CMake cache variables meant that my existing build directories wasn't actually picking up my attempts to recompile with the SIMDe kernels disabled.

I think it is still possible to avoid the dependency between the two options: we can simply disable the RdCost optimisations if TARGET_SIMD_X86 is not enabled. This is similar to how we already require AArch64 for these kernels to be used. I've put up a patch to that effect (and swapped the existing __ARM_ARCH >= 8 check for REAL_TARGET_AARCH64 now that we have it). Let me know what you think about this approach!

I also have a patch to remove the AArch64-only requirement for the RdCost kernels such that it also compiles on 32-bit Arm platforms, but I will put that up as a separate PR once this is merged so it doesn't conflict.

Copy link
Contributor

@K-os K-os left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks good.

There are existing options to enable both Arm native and SIMDe-based
intrinsics implementations, however if only one of these is enabled then
compilation for Arm targets currently fails.

Fix this by adjusting the existing #ifdef guards and adding new ones to
cover the previously failing option combinations. Also stop checking
REAL_TARGET_ARM since it does not distinguish between which SIMD targets
are enabled.
The existing Neon code in RdCostARM.h depends on being able to call into
the existing SIMDe-based x86 kernels. If VVENC_ENABLE_X86_SIMD is
disabled in the CMake then these kernels are unavailable and compilation
fails.

Until native Neon kernels are added for the missing functions, simply
disable the optimised RdCost Neon code unless the SIMDe kernels are also
available.
@adamjw24 adamjw24 merged commit f26d463 into fraunhoferhhi:master Oct 29, 2024
8 checks passed
@georges-arm georges-arm deleted the geoste01/arm-mixed-simd-fixes branch October 29, 2024 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants