KHMBB16 #131

Open
iollmann opened this issue Feb 10, 2022 · 0 comments

iollmann commented Feb 10, 2022

Similar instructions in other vector ISAs (SQRDMULH, PMULHRSW, and vmhraddshs) all round their products by adding 2**14 before right shifting. It would be helpful if this design did the same, to better enable software portability. This kind of instruction is typically one of the better-performing multiplies available.

As a bit of a history note, these instructions are used for fixed-point multiplication by values in [-1,1), which is exactly what you want in DCTs (JPEG) and other DFT-like algorithms. They can also be helpful in some image blend modes and filters, particularly those involving luminance calculation and Lanczos resampling, and I can only imagine that there are audio applications as well, given the prevalence of FFT/mDCT in that space. The rounding helps reduce accumulated error and allows the codec to better conform to its behavior on other platforms. Without it, we may expect some modest darkening of the image or quieting of the sound, or be forced to use some other instruction that runs at half the multiplication throughput in order to provide results comparable to other platforms. The rounding also makes the result more symmetric. Otherwise the right shift rounds towards -Inf and is asymmetric about 0, which is not good for sinusoids.
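To make the difference concrete, here is a minimal scalar sketch in C of one 16-bit lane, comparing a truncating high multiply with the rounding form described above. The function names are hypothetical and are not taken from this design or from any of the ISAs mentioned. The sketch assumes Q15 operands, a final shift of 15, and a compiler whose `>>` on negative signed integers is an arithmetic shift. It does not model saturation, so the -32768 * -32768 corner case is left out.

```c
#include <stdint.h>

/* Hypothetical helper: Q15 multiply that simply drops the low 15 bits.
   Assumes >> on a negative int32_t is an arithmetic shift, so the
   result is rounded towards -Inf. */
int16_t q15_mul_trunc(int16_t a, int16_t b) {
    int32_t p = (int32_t)a * (int32_t)b;   /* exact Q30 product */
    return (int16_t)(p >> 15);
}

/* Hypothetical helper: Q15 multiply that adds 2**14 (half of the
   discarded range) before the shift, i.e. round to nearest.
   No saturation: -32768 * -32768 is not handled here. */
int16_t q15_mul_round(int16_t a, int16_t b) {
    int32_t p = (int32_t)a * (int32_t)b;   /* exact Q30 product */
    return (int16_t)((p + (1 << 14)) >> 15);
}
```

With b = 8192 (0.25 in Q15), `q15_mul_trunc(1, 8192)` gives 0 but `q15_mul_trunc(-1, 8192)` gives -1, which is the bias towards -Inf. `q15_mul_round` returns 0 for both inputs.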
