riscv: optimise memcpy when misalign #71

Open
wants to merge 2 commits into
base: lpi4a

ixgbe01 commented Jan 15, 2024

Currently, memcpy checks whether the src and dest pointers are co-aligned, that is, whether they share the same offset within a machine word. If they are, it first aligns both pointers to a word boundary and then copies word by word. If src and dest are not co-aligned, however, it falls back to a byte-wise copy.
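
As a rough illustration of the behaviour described above, here is a minimal C model. It is only a sketch: the real routine is RISC-V assembly, the function name `memcpy_coaligned_only` is hypothetical, and fixed-size `memcpy` calls stand in for single word loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the pre-patch strategy (not the kernel code). */
static void *memcpy_coaligned_only(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    const size_t W = sizeof(unsigned long);

    /* Co-aligned: both pointers have the same offset within a word. */
    if ((((uintptr_t)d ^ (uintptr_t)s) & (W - 1)) == 0) {
        /* Copy leading bytes until both pointers reach a word boundary. */
        while (n > 0 && ((uintptr_t)d & (W - 1)) != 0) {
            *d++ = *s++;
            n--;
        }
        /* Word-by-word copy; each memcpy pair models one load and one store. */
        while (n >= W) {
            unsigned long w;
            memcpy(&w, s, W);
            memcpy(d, &w, W);
            d += W;
            s += W;
            n -= W;
        }
    }

    /* Tail bytes, or the whole buffer when the pointers are not co-aligned. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}
```

In this model, a copy whose pointers differ in their low address bits never enters the word loop, which is the slow path the patch targets.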

This patch optimises memcpy for the misaligned case. It first aligns the destination pointer to a word boundary, regardless of whether src and dest are co-aligned. If they are, a word-wise copy is performed. If they are not, it loads two adjacent words from src and uses shifts to assemble a full machine word.

In my testing, this speeds up memcpy by 4x to 5x when src and dest are not co-aligned.
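
The shift-based approach described above can be sketched in C as follows. This is not the patch itself (which is RISC-V assembly); the names `load_word`, `store_word` and `memcpy_shift_merge` are hypothetical, the shift directions assume a little-endian machine such as RISC-V, and the loop bound of two words is a conservative choice made here so that the sketch never reads outside the source buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Helpers modelling one word load/store (hypothetical names). */
static unsigned long load_word(const unsigned char *p)
{
    unsigned long w;
    memcpy(&w, p, sizeof w);
    return w;
}

static void store_word(unsigned char *p, unsigned long w)
{
    memcpy(p, &w, sizeof w);
}

/* Sketch only; assumes little-endian byte order. */
static void *memcpy_shift_merge(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    const size_t W = sizeof(unsigned long);

    /* Step 1: byte-copy until dest is word-aligned, whatever src looks like. */
    while (n > 0 && ((uintptr_t)d & (W - 1)) != 0) {
        *d++ = *s++;
        n--;
    }

    size_t off = (uintptr_t)s & (W - 1);   /* src misalignment in bytes */

    if (off == 0) {
        /* Co-aligned: plain word-wise copy. */
        while (n >= W) {
            store_word(d, load_word(s));
            d += W;
            s += W;
            n -= W;
        }
    } else if (n >= 2 * W) {
        const unsigned shr = 8u * (unsigned)off;
        const unsigned shl = 8u * (unsigned)(W - off);
        const unsigned char *a = s + (W - off);   /* next aligned src word */
        unsigned long prev = 0;

        /* Seed prev with the W - off valid bytes at their in-word positions,
           without reading memory that lies before src. */
        memcpy((unsigned char *)&prev + off, s, W - off);

        while (n >= 2 * W) {
            unsigned long next = load_word(a);    /* aligned load */
            /* High bytes of prev plus low bytes of next form one dest word. */
            store_word(d, (prev >> shr) | (next << shl));
            prev = next;
            a += W;
            d += W;
            s += W;
            n -= W;
        }
    }

    /* Tail: whatever is left is copied byte by byte. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}
```

Each loop iteration performs one aligned load and one aligned store per destination word, instead of one load and one store per byte, which is where the gain in the misaligned case would be expected to come from.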

ixgbe01 (Author) commented Jan 15, 2024

How to test?

memcpy.S link: https://github.com/revyos/thead-kernel/blob/lpi4a/arch/riscv/lib/memcpy.S
Original and modified memcpy.S source code:
memcpyS.zip

memcpy.c
https://github.com/ARM-software/optimized-routines/blob/master/string/bench/memcpy.c

header file
https://github.com/ARM-software/optimized-routines/blob/master/string/include/benchlib.h
https://github.com/ARM-software/optimized-routines/blob/master/string/include/stringlib.h

mkdir memcpy_test
mv memcpy.S memcpy.c benchlib.h stringlib.h memcpy_test

cd memcpy_test
gcc memcpy.c memcpy.S -static -O0
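
The benchmark above measures throughput only. Before comparing numbers, it may be worth checking that the routine under test still copies correctly for every combination of source and destination alignment. A minimal sketch of such a check follows; the names `memcpy_fn` and `count_memcpy_mismatches`, the buffer sizes and the fill pattern are all assumptions made for illustration and are not part of the benchmark sources.

```c
#include <stddef.h>
#include <string.h>

/* Signature shared by memcpy and any replacement under test. */
typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Returns the number of (src offset, dest offset, length) cases in which
   fn disagrees with a byte-by-byte reference copy or writes out of range. */
static unsigned long count_memcpy_mismatches(memcpy_fn fn)
{
    enum { MAX_OFF = 8, MAX_LEN = 256, PAD = 16,
           BUF = MAX_OFF + MAX_LEN + PAD };
    static unsigned char src[BUF], dst[BUF], ref[BUF];
    unsigned long bad = 0;

    for (size_t i = 0; i < BUF; i++)
        src[i] = (unsigned char)(i * 31u + 7u);

    for (size_t so = 0; so <= MAX_OFF; so++) {
        for (size_t dof = 0; dof <= MAX_OFF; dof++) {
            for (size_t n = 0; n <= MAX_LEN; n++) {
                /* Guard pattern exposes writes before or after the range. */
                memset(dst, 0xA5, BUF);
                memset(ref, 0xA5, BUF);
                for (size_t i = 0; i < n; i++)
                    ref[dof + i] = src[so + i];

                void *ret = fn(dst + dof, src + so, n);
                if (ret != (void *)(dst + dof) || memcmp(dst, ref, BUF) != 0)
                    bad++;
            }
        }
    }
    return bad;
}
```

Calling `count_memcpy_mismatches(memcpy)` exercises the C library version; to exercise the assembly routine, pass whatever symbol the adapted memcpy.S exports, which is not shown in this thread.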

original memcpy.S
29f7068cc51466ac788ea14ac5db68f

modified memcpy.S
6518c344f2a6eb2b4297c5b4dfdd776

ixgbe01 (Author) commented Jan 16, 2024

riscv: __asm_copy_to-from_user: Optimize unaligned memory access
Refer to: torvalds/linux@ca6eaaa

Icenowy (Contributor) commented Jan 18, 2024

Why not keep the original commit message?

ixgbe01 (Author) commented Jan 18, 2024

> Why not keep the original commit message?

I'm not very familiar with the rules for submitting code here.
