Apply branch prediction for indirect jump #269

qwe661234 · 2023-11-19T08:46:08Z

Previously, emulating an indirect jump ended with a block cache lookup, and the overhead of that lookup was substantial. To reduce it, this change introduces a branch history table that records the targets of previous indirect jumps. Because the table holds only a small number of entries, looking up a target in it is much cheaper than a block cache lookup.

As the measurements below show, the branch history table improves the result on all four benchmarks.

Metric	original	proposed
Dhrystone	2932.3 DMIPS	2985.2 DMIPS
CoreMark	2231 iter/s	2236 iter/s
Stream	76.04 sec	75.299 sec
Nqueens	4.069 sec	3.933 sec

See: #268
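The mechanism described above can be sketched roughly as follows. This is a minimal illustration and not the code from this pull request: the names `bht_t`, `bht_lookup`, `bht_update`, the capacity `BHT_SIZE`, and the round-robin replacement policy are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BHT_SIZE 16 /* hypothetical capacity; the description only says "limited" */

/* Hypothetical branch history table attached to one indirect jump. */
typedef struct {
    uint32_t pc[BHT_SIZE];  /* previously seen jump target addresses */
    void *target[BHT_SIZE]; /* translated block for each address, NULL if unused */
    uint32_t next;          /* slot that the next update overwrites */
} bht_t;

/* Scan the small table linearly; return the cached block or NULL on a miss. */
static inline void *bht_lookup(const bht_t *t, uint32_t pc)
{
    for (size_t i = 0; i < BHT_SIZE; i++) {
        if (t->target[i] != NULL && t->pc[i] == pc)
            return t->target[i];
    }
    return NULL;
}

/* Record a target after the slower lookup resolved it (round-robin eviction). */
static inline void bht_update(bht_t *t, uint32_t pc, void *block)
{
    assert(block != NULL);
    t->pc[t->next] = pc;
    t->target[t->next] = block;
    t->next = (t->next + 1) % BHT_SIZE;
}
```

In this sketch a caller would try `bht_lookup()` first and only fall back to the block cache, followed by `bht_update()`, when it returns NULL.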

jserv · 2023-11-19T08:49:29Z

Can you check the strategies and implementations at https://github.com/bucaps/marss-riscv/tree/master/src/riscvsim/bpu as well?

RinHizakura · 2023-11-19T15:46:36Z

I'm not sure whether the history table has a significant effect on performance. It looks like the history table is a cache of the block map, isn't it?

According to my understanding, block_find() should be O(1) in the average case, and entries in the block cache should only be evicted when block_map_clear() is called. Since block_map_clear() should happen rarely, the probability of a miss on the block map would be low.

qwe661234 · 2023-11-19T16:03:22Z

> I'm not sure whether the history table has a significant effect on performance. It looks like the history table is a cache of the block map, isn't it?

> According to my understanding, block_find() should be O(1) in the average case, and entries in the block cache should only be evicted when block_map_clear() is called. Since block_map_clear() should happen rarely, the probability of a miss on the block map would be low.

The original design is block_find().

RinHizakura · 2023-11-19T16:16:28Z

> The original design is block_find().

Sure, I know it is the original design.

No disrespect, I am just curious what "overhead" means for the original design. I assume the original design has O(1) complexity on average, and that its extra cost should only arise on a lookup miss in the map. Given these properties, I can't see how the history table improves on it. Also, although the metrics show the proposed design is better, I suspect the improvement is still within the margin of error.
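To put the reported numbers in proportion, the relative changes can be computed directly from the table in the pull request description. The sketch below does only that arithmetic; the helper names are made up for the example. It gives roughly +1.8% for Dhrystone, +0.2% for CoreMark, and reductions of about 1.0% for Stream and 3.3% for Nqueens. Whether these fall within run-to-run noise cannot be decided from the thread, since no variance figures are given.

```c
#include <stdio.h>

/* Relative gain in percent for a higher-is-better score (DMIPS, iter/s). */
static inline double gain_pct(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Relative reduction in percent for a lower-is-better time (seconds). */
static inline double reduction_pct(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Print the relative change for the four rows reported in the description. */
static inline void print_relative_changes(void)
{
    printf("Dhrystone: +%.2f%%\n", gain_pct(2932.3, 2985.2));
    printf("CoreMark:  +%.2f%%\n", gain_pct(2231.0, 2236.0));
    printf("Stream:    -%.2f%%\n", reduction_pct(76.04, 75.299));
    printf("Nqueens:   -%.2f%%\n", reduction_pct(4.069, 3.933));
}
```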

qwe661234 · 2023-11-19T16:28:53Z

> > The original design is block_find().

> Sure, I know it is the original design.

> No disrespect, I am just curious what "overhead" means for the original design. I assume the original design has O(1) complexity on average, and that its extra cost should only arise on a lookup miss in the map. Also, although the metrics show the proposed design is better, I suspect the improvement is still within the margin of error.

In block_find(), a lookup needs a hash function, a comparison, and the overhead of a function call. The branch history table costs extra memory, but a lookup needs only a few comparisons, with no hash function and no function call.

Actually, the branch history table is designed to further improve JIT code generation for indirect jumps; its benefit for the interpreter is not substantial.
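The cost difference described here can be illustrated with a rough sketch: a small table checked inline first, with a hashed map as the fallback. None of this is rv32emu's actual code. `map_find` is a stand-in for block_find(), and the multiplicative hash, linear probing, and table sizes are assumptions chosen to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_CAPACITY 1024 /* hypothetical; must stay 2^10 to match hash_pc() */
#define HIST_SIZE 4       /* hypothetical history table size */

typedef struct { uint32_t pc; } block_t;
typedef struct { block_t *slot[MAP_CAPACITY]; } block_map_t;
typedef struct {
    uint32_t pc[HIST_SIZE];
    block_t *blk[HIST_SIZE];
    uint32_t next;
} history_t;

/* Multiplicative hash reduced to 10 bits (an assumption, not rv32emu's hash). */
static inline uint32_t hash_pc(uint32_t pc)
{
    return (uint32_t) (pc * 2654435761u) >> 22;
}

/* Insert with linear probing; returns 0 on success, -1 if the map is full. */
static int map_insert(block_map_t *m, block_t *b)
{
    uint32_t i = hash_pc(b->pc);
    for (size_t n = 0; n < MAP_CAPACITY; n++) {
        if (m->slot[i] == NULL) {
            m->slot[i] = b;
            return 0;
        }
        i = (i + 1) & (MAP_CAPACITY - 1);
    }
    return -1;
}

/* Slow path: function call, hash computation, then probing comparisons. */
static block_t *map_find(const block_map_t *m, uint32_t pc)
{
    uint32_t i = hash_pc(pc);
    for (size_t n = 0; n < MAP_CAPACITY; n++) {
        block_t *b = m->slot[i];
        if (b == NULL)
            return NULL;
        if (b->pc == pc)
            return b;
        i = (i + 1) & (MAP_CAPACITY - 1);
    }
    return NULL;
}

/* Fast path first: a few inline comparisons. Fall back to the map on a miss
 * and remember the result. *fast_hit reports which path resolved the target. */
static inline block_t *resolve_indirect(history_t *h, const block_map_t *m,
                                        uint32_t pc, int *fast_hit)
{
    for (size_t i = 0; i < HIST_SIZE; i++) {
        if (h->blk[i] != NULL && h->pc[i] == pc) {
            *fast_hit = 1;
            return h->blk[i];
        }
    }
    *fast_hit = 0;
    block_t *b = map_find(m, pc);
    if (b != NULL) {
        h->pc[h->next] = pc;
        h->blk[h->next] = b;
        h->next = (h->next + 1) % HIST_SIZE;
    }
    return b;
}
```

The first jump to a given target takes the slow path and fills the history table; repeated jumps to the same target are then resolved by the inline comparisons alone.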

RinHizakura · 2023-11-19T16:34:52Z

> In block_find(), a lookup needs a hash function, a comparison, and the overhead of a function call. The branch history table costs extra memory, but a lookup needs only a few comparisons, with no hash function and no function call.

> Actually, the branch history table is designed to further improve JIT code generation for indirect jumps; its benefit for the interpreter is not substantial.

Got it. Possibly because the history table mechanism is still simple in interpreter mode, I can't see its effect on the emulator yet. I also assumed that the cost of the hash function is low enough to ignore, but this might not be true.

Thanks for your kind reply!

jserv · 2023-11-20T16:45:41Z

> Can you check the strategies and implementations at https://github.com/bucaps/marss-riscv/tree/master/src/riscvsim/bpu as well?

The strength of the two-level adaptive predictor is that it quickly adapts to and predicts repetitive patterns. This technique is used in many contemporary microprocessors, and rv32emu could potentially benefit from adopting it. Let's assess its performance using the reference code.
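For reference, the general idea of a two-level adaptive predictor can be sketched as below. This is not taken from marss-riscv or rv32emu. It is a minimal global-history variant in which the first level is a shift register of recent outcomes and the second level is a table of 2-bit saturating counters indexed by that history. The names and the 4-bit history length are assumptions. It predicts taken or not-taken outcomes only; predicting indirect jump targets would need additional machinery that the sketch does not cover.

```c
#include <stdint.h>

#define HIST_BITS 4                /* hypothetical history length */
#define PHT_SIZE (1u << HIST_BITS) /* one counter per history pattern */

typedef struct {
    uint8_t history;           /* level 1: last HIST_BITS outcomes, newest in bit 0 */
    uint8_t counter[PHT_SIZE]; /* level 2: 2-bit saturating counters, values 0..3 */
} two_level_t;

/* Predict taken (1) when the counter selected by the history is 2 or 3. */
static inline int tl_predict(const two_level_t *p)
{
    return p->counter[p->history] >= 2;
}

/* Train the selected counter with the real outcome, then shift the history. */
static inline void tl_update(two_level_t *p, int taken)
{
    uint8_t *c = &p->counter[p->history];
    if (taken) {
        if (*c < 3)
            (*c)++;
    } else {
        if (*c > 0)
            (*c)--;
    }
    p->history =
        (uint8_t) (((p->history << 1) | (taken ? 1u : 0u)) & (PHT_SIZE - 1));
}
```

With a repeating taken, taken, not-taken pattern, each position in the pattern is preceded by a distinct 4-bit history, so after a short warm-up every position selects its own counter and the predictor follows the pattern without misses.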

jserv

Fix typo ("proposed") in git commit message. Update the numbers if you have recent measurements.

github-advanced-security bot found potential problems Nov 19, 2023

jserv requested a review from RinHizakura November 19, 2023 12:52

jserv reviewed Nov 19, 2023

qwe661234 force-pushed the branch_predictor branch 2 times, most recently from 3d2bcff to fa3514c Compare November 20, 2023 08:39

jserv reviewed Nov 20, 2023

jserv reviewed Nov 20, 2023

qwe661234 force-pushed the branch_predictor branch from fa3514c to a5a4222 Compare November 20, 2023 12:50

RinHizakura approved these changes Nov 20, 2023

jserv reviewed Nov 20, 2023

jserv requested changes Nov 20, 2023

qwe661234 force-pushed the branch_predictor branch from a5a4222 to 561c4c7 Compare November 21, 2023 09:11

qwe661234 force-pushed the branch_predictor branch from 561c4c7 to b24431c Compare November 21, 2023 09:12

qwe661234 requested a review from jserv November 21, 2023 09:12

jserv merged commit fb2ece9 into sysprog21:master Nov 21, 2023
21 checks passed

vestata pushed a commit to vestata/rv32emu that referenced this pull request Jan 24, 2025

Merge pull request sysprog21#269 from qwe661234/branch_predictor

56f59b8

Apply branch prediction for indirect jump
