Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve memory read/write #123

Merged
merged 2 commits into from
Apr 11, 2023
Merged

Conversation

qwe661234
Copy link
Collaborator

@qwe661234 qwe661234 commented Apr 7, 2023

After analyzing the data collected during the execution of the Dhrystone benchmark, we found that the primary performance bottleneck lies in memory read/write operations. To address this issue, we modified the implementation of memory read/write by eliminating unnecessary checks and replacing slow operations with more efficient ones. Additionally, we integrated a simple memory pool and resolved the issue of disabled unaligned memory access, the user can add argument --misalign to enable misaligned memory access.

After analyzing the performance results from running Dhrystone, we observed a significant improvement following our enhancements to the memory I/O.

Test 66200d0 improvement Speedup
Dhrystone 815 DMIPS 1245 DMIPS +52.7%

See: #122

src/riscv_private.h Outdated Show resolved Hide resolved
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the misaligned memory access is optional, we don't have to check it for the internals of emulator. Instead, we can register the callback functions while initializing the emulator.

See async.h and async.c for details.

/* API gateway */
struct __ASYNC_API__ Async = {
    .create = async_create,
    .signal = async_signal,
    .wait = async_wait,
    .finish = async_finish,
    .run = async_run,
};

Once you prepare the functions specialized for both misaligned memory and strictly-aligned memory access, the proper I/O implementations can be hooked up. Consequently, you don't have to check allow_ misalign all the time.

src/io.c Outdated Show resolved Hide resolved
src/main.c Outdated Show resolved Hide resolved
@jserv
Copy link
Contributor

jserv commented Apr 7, 2023

Test 66200d0 Proposed Speedup
Dhrystone 815 DMIPS 1200 DMIPS +47.2%

What else do you consider it would be beneficial to employ in place of Spike for faster interpreting?

@qwe661234 qwe661234 closed this Apr 7, 2023
@sysprog21 sysprog21 deleted a comment from qwe661234 Apr 7, 2023
@qwe661234 qwe661234 reopened this Apr 7, 2023
@qwe661234
Copy link
Collaborator Author

qwe661234 commented Apr 7, 2023

Since the misaligned memory access is optional, we don't have to check it for the internals of emulator. Instead, we can register the callback functions while initializing the emulator.

See async.h and async.c for details.

/* API gateway */
struct __ASYNC_API__ Async = {
    .create = async_create,
    .signal = async_signal,
    .wait = async_wait,
    .finish = async_finish,
    .run = async_run,
};

Once you prepare the functions specialized for both misaligned memory and strictly-aligned memory access, the proper I/O implementations can be hooked up. Consequently, you don't have to check allow_ misalign all the time.

But the memory I/O implemention for for both misaligned memory and strictly-aligned memory access is the same, the only difference is that we need to handle exception if misaligned memory access is disabled.

@jserv
Copy link
Contributor

jserv commented Apr 7, 2023

the memory I/O implemention for for both misaligned memory and strictly-aligned memory access is the same, the only difference is that we need to handle exception if misaligned memory access is disabled.

Then, you can hook exception handlers which are aware of misaligned memory access by means of runtime dispatch.

@qwe661234
Copy link
Collaborator Author

qwe661234 commented Apr 7, 2023

the memory I/O implemention for for both misaligned memory and strictly-aligned memory access is the same, the only difference is that we need to handle exception if misaligned memory access is disabled.

Then, you can hook exception handlers which are aware of misaligned memory access by means of runtime dispatch.

Does this means send a callback function to memory read/ write? Because we don't need load/store handler if the memory misaligned access is enabled.

@jserv
Copy link
Contributor

jserv commented Apr 7, 2023

Then, you can hook exception handlers which are aware of misaligned memory access by means of runtime dispatch.
Does this means send a callback function to memory read/ write?

Sort of. In particular, we can shrink the code below:

/* SW: Store Word */
RVOP(sw, {
    const uint32_t addr = rv->X[ir->rs1] + ir->imm;
    if (!rv->is_misaliged_enabled && unlikely(addr & 3)) {
        rv->compressed = false;
        rv_except_store_misaligned(rv, addr);
        return false;
    ...

by replacing with something like the following:

/* SW: Store Word */
RVOP(sw, {
    const uint32_t addr = rv->X[ir->rs1] + ir->imm;
    RV_EXC_MISAGLIN_HANDLER(); /* some macro */

Meanwhile, we can hide the implementation details about allow_ misalign checks.

@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch 2 times, most recently from 81c0276 to d54dd15 Compare April 7, 2023 17:42
src/riscv.h Outdated Show resolved Hide resolved
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from d54dd15 to 5fa2782 Compare April 8, 2023 06:02
src/emulate.c Outdated Show resolved Hide resolved
src/emulate.c Outdated Show resolved Hide resolved
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from 5fa2782 to 21a4036 Compare April 8, 2023 15:55
@qwe661234 qwe661234 requested a review from jserv April 8, 2023 15:56
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from 21a4036 to 17931d9 Compare April 8, 2023 15:58
src/emulate.c Outdated Show resolved Hide resolved
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integrate memory pool in conjunction of proposed memory operation changes.

@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch 2 times, most recently from 78aa320 to a159260 Compare April 9, 2023 12:33
src/emulate.c Outdated Show resolved Hide resolved
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from a159260 to 57069ea Compare April 9, 2023 13:16
src/tlsf.c Fixed Show fixed Hide fixed
src/tlsf.c Fixed Show fixed Hide fixed
src/tlsf.c Fixed Show fixed Hide fixed
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch 2 times, most recently from be35ed3 to 6936a4f Compare April 9, 2023 15:18
src/mpool.c Show resolved Hide resolved
src/mpool.c Show resolved Hide resolved
src/io.c Outdated Show resolved Hide resolved
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch 5 times, most recently from 95b68af to 3785d16 Compare April 10, 2023 12:00
@qwe661234 qwe661234 requested a review from jserv April 10, 2023 12:38
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Append Close #122 at the end of git commit messages.

src/mpool.h Outdated Show resolved Hide resolved
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch 2 times, most recently from e7898e9 to 9a64da0 Compare April 10, 2023 13:47
@qwe661234 qwe661234 requested a review from jserv April 10, 2023 14:07
src/mpool.h Outdated
*/
#pragma once

#include <stdlib.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only <stddef.h> should be included here.

Copy link
Collaborator Author

@qwe661234 qwe661234 Apr 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gcc-12 compiler shows the warning when I only include <stddef.h>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so.

gcc-12 -c mpool.h -Wall

It compiles without warnings. The actual header inclusion should appear in C source files.

src/mpool.c Outdated Show resolved Hide resolved
src/io.c Outdated Show resolved Hide resolved
src/io.c Outdated Show resolved Hide resolved
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rephrase the git commit message as following:

After analyzing the data collected during the execution of the Dhrystone benchmark, it was discovered that the primary performance bottleneck lies in memory read/write operations. To address this issue, modifications were made to the implementation of memory read/write operations, eliminating unnecessary checks and replacing slow operations with more efficient ones. Additionally, a simple memory pool was integrated. It is now possible for the user to enable misaligned memory access by launching with option "--misalign".

The enhancements made to the memory I/O resulted in a significant improvement, as observed through performance results obtained by running Dhrystone.

@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from 9a64da0 to c54ba4f Compare April 10, 2023 17:04
src/mpool.h Show resolved Hide resolved
src/io.c Outdated Show resolved Hide resolved
src/mpool.c Show resolved Hide resolved
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from c54ba4f to 1a6af78 Compare April 11, 2023 12:39
src/emulate.c Outdated Show resolved Hide resolved
After analyzing the data collected during the execution of the Dhrystone
benchmark, it was discovered that the primary performance bottleneck lies
in memory read/write operations. To address this issue, modifications were
made to the implementation of memory read/write operations, eliminating
unnecessary checks and replacing slow operations with more efficient ones.
Additionally, a simple memory pool was integrated. It is now possible for
the user to enable misaligned memory access by launching with option "--misalign".

The enhancements made to the memory I/O resulted in a significant improvement,
as observed through performance results obtained by running Dhrystone.

|    Test    |    66200d0   |   improvement   |Speedup|
|------------+--------------+-----------------+-------|
| Dhrystone  | 815 DMIPS    |   1245 DMIPS    | +52.7%|

Close sysprog21#122
@qwe661234 qwe661234 force-pushed the wip/imporve_memory_io branch from 1a6af78 to 78112c3 Compare April 11, 2023 13:02
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Append a test item which triggers option --misalign in GitHub Actions. That is, we shall regularly validate the case when misaligned memory access is allowed for CI pipeline.

@jserv jserv merged commit 797a961 into sysprog21:master Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants