Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* fix performance regression

  Currently if I run benchmarks as of e2320de I get the following result:

  ```
  hot/explicit_avx2_double
                          time:   [1.1531 ms 1.1539 ms 1.1548 ms]
                          thrpt:  [5.8529 GiB/s 5.8577 GiB/s 5.8620 GiB/s]
  ```

  If I disassemble the function (though only on the actual binary, not with `cargo-show-asm`, annoyingly), I see out-of-line calls into iterator stuff, and probably a bunch of associated dumps of registers to memory and whatnot.

  ```
  (lldb) disas
  bench-1f3b1b5d384583ba`_$LT$pixelfmt..uyvy_to_i420..ExplicitAvx2DoubleBlock$u20$as$u20$pixelfmt..uyvy_to_i420..RowProcessor$GT$::process::h94bb8a0388c0d852:
  ...
      0x5555555c0f3b <+155>: vzeroupper
      0x5555555c0f3e <+158>: callq  0x5555555c0a80 ; core::array::drain::drain_array_with::hce9734ca2363f2b8
  ->  0x5555555c0f43 <+163>: movq   0x48(%rsp), %rdx
      0x5555555c0f48 <+168>: movq   %rbx, %r11
      0x5555555c0f4b <+171>: movq   %r12, %r9
  ```

  This doesn't happen at the previous commit.

  With a slight refactor, this `callq` goes away and performance returns.

  ```
  hot/explicit_avx2_double
                          time:   [72.312 µs 72.370 µs 72.436 µs]
                          thrpt:  [93.312 GiB/s 93.398 GiB/s 93.472 GiB/s]
                   change:
                          time:   [-93.730% -93.717% -93.703%] (p = 0.00 < 0.05)
                          thrpt:  [+1488.2% +1491.6% +1494.8%]
                          Performance has improved.
  ```

  I didn't record the Rust version I was using when preparing that change, so I'm unsure if I just missed the benchmark change or if it's dependent on Rust version.

  This is not entirely satisfying; I wanted to at least do a test on `cargo-show-asm` output that there are no unexpected `call` instructions, but it doesn't show them anyway...

* prettier loops

  Not sure why I used `loop { if ... { break } }` rather than `while` to begin with, but easy enough to fix. No performance impact.
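The "no unexpected `call` instructions" test mentioned above could be sketched roughly as follows. This is only an illustration, not code from this commit: `find_call_lines` and `is_call_mnemonic` are hypothetical names, and the sketch assumes the disassembly is already available as text in an x86-64 listing format like the lldb output quoted in the message (comments introduced by `;` or `#`). How to obtain such text reliably is exactly the part the message says is unresolved.

```rust
/// Hypothetical helper: true if `tok` is an x86 call mnemonic
/// (Intel `call`, or AT&T `callq` / `calll`).
fn is_call_mnemonic(tok: &str) -> bool {
    matches!(tok, "call" | "callq" | "calll")
}

/// Hypothetical helper: returns every line of `asm` whose instruction
/// part (the text before a `;` or `#` comment) contains a call mnemonic.
/// Assumes one instruction per line, as in lldb or assembler listings.
fn find_call_lines(asm: &str) -> Vec<&str> {
    asm.lines()
        .filter(|line| {
            // Ignore the trailing comment so that symbol names in it
            // cannot produce false positives.
            let code = line
                .split(|c: char| c == ';' || c == '#')
                .next()
                .unwrap_or("");
            code.split_whitespace().any(is_call_mnemonic)
        })
        .map(str::trim)
        .collect()
}
```

A test could then assert that `find_call_lines(disassembly)` is empty for the hot function, or compare the result against an allowlist of expected callees. Note that the returned lines keep their comments, so the symbolized target (such as `drain_array_with`) stays visible in a failure message.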
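For reference, the `loop { if ... { break } }` to `while` rewrite described in the second item is a purely mechanical transformation when the exit check is the first statement in the loop body. The functions below are hypothetical stand-ins (they are not the row-processing code from this repository) and only illustrate that the two shapes are equivalent:

```rust
/// Hypothetical example: sums complete 4-element blocks using the
/// `loop { if ... { break } }` shape; leftover elements are ignored.
fn sum_blocks_loop(data: &[u32]) -> u32 {
    let mut i = 0;
    let mut total = 0u32;
    loop {
        if i + 4 > data.len() {
            break;
        }
        total += data[i..i + 4].iter().sum::<u32>();
        i += 4;
    }
    total
}

/// The same logic with the exit condition negated and hoisted into `while`.
fn sum_blocks_while(data: &[u32]) -> u32 {
    let mut i = 0;
    let mut total = 0u32;
    while i + 4 <= data.len() {
        total += data[i..i + 4].iter().sum::<u32>();
        i += 4;
    }
    total
}
```

Because the condition is evaluated at the same point in both versions, the compiler sees the same control flow either way, which is consistent with the "no performance impact" observation.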
Showing 1 changed file with 49 additions and 53 deletions.