Improve Benchmark Selection (#217)
* Try slightly different benchmarks

* Try out different sum sizes

* Sum over sin of data

* reintroduce sum

* Fix typo

* Improve numbers reported

* Improve discussion in benchmark readme and CI

* Make link to benchmarking readme clickable

* Improve benchmark discussion

* Update .github/workflows/CI.yml

Co-authored-by: Hong Ge <3279477+yebai@users.noreply.github.com>

---------

Co-authored-by: Hong Ge <3279477+yebai@users.noreply.github.com>
willtebbutt and yebai authored Aug 8, 2024
1 parent 1ccdafb commit 55cb4cf
Showing 3 changed files with 13 additions and 5 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/CI.yml
@@ -122,6 +122,6 @@ jobs:
         uses: peter-evans/create-or-update-comment@v4
         with:
           issue-number: ${{ github.event.pull_request.number }}
-          body: "Performance Ratio:\nWarning: results are very approximate!\n```\n${{ steps.read-file.outputs.table }}\n```"
+          body: "Performance Ratio:\nRatio of time to compute gradient and time to compute function.\nWarning: results are very approximate! See [here](https://github.com/compintell/Tapir.jl/tree/main/bench#inter-framework-benchmarking) for more context.\n```\n${{ steps.read-file.outputs.table }}\n```"
           comment-id: ${{ steps.fc.outputs.comment-id }}
           edit-mode: replace
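The ratio named in the new comment body (time to compute the gradient divided by time to compute the function) can be illustrated with a small sketch. This is not how the CI job computes it: `min_time` is a hypothetical helper, and a hand-written gradient stands in for an AD framework so that the block needs nothing beyond Base.

```julia
# Hypothetical helper, not part of Tapir.jl or its benchmark suite.
# Returns the smallest wall-clock time (in seconds) of `f(x)` over `n` runs.
function min_time(f, x; n=20)
    f(x)  # warm-up call so that compilation time is excluded
    return minimum(@elapsed(f(x)) for _ in 1:n)
end

# The function being benchmarked.
primal(x) = sum(sin, x)

# Hand-written gradient of sum(sin, x); an assumption standing in for
# whatever an AD framework would produce.
gradient(x) = cos.(x)

x = randn(1_000)
ratio = min_time(gradient, x) / min_time(primal, x)
println("performance ratio ≈ ", ratio)
```

Timings taken this way are noisy, which is consistent with the warning in the comment body that the results are very approximate.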
6 changes: 6 additions & 0 deletions bench/README.md
@@ -43,6 +43,12 @@ plot_ratio_histogram!(df)
## Inter-framework Benchmarking

This comprises a small suite of functions that we AD using `Tapir.jl`, `Zygote.jl`, `ReverseDiff.jl`, and `Enzyme.jl`.
+The primary purpose of this suite of benchmarks is to ensure that we regularly compare the performance of a range of reverse-mode AD frameworks on a set of problems that are known to stretch them in various ways.
+For any given function in the suite, some frameworks might have rules for it, and some might not.
+For example, `Zygote.jl` achieves good performance on any of the test cases only because it has many rules.
+For this reason, we include hand-written versions of `sum` and `map`, on which `Zygote.jl` achieves poor performance.
+`ReverseDiff.jl` also has this property, although to a lesser extent than `Zygote.jl`.
+
 This suite of benchmarks is also run as part of CI, and the output is recorded in two ways:
 1. a table of results is posted as a comment on the PR
 1. the table and a corresponding graph are stored as GitHub Actions artifacts, which can be retrieved by going to the "Checks" tab of your PR and clicking on the artifact button.
10 changes: 6 additions & 4 deletions bench/run_benchmarks.jl
@@ -42,12 +42,12 @@ should_run_benchmark(args...) = true
# Test out the performance of a hand-written sum function, so we can be confident that there
# is no rule. Note that ReverseDiff has a (seemingly not fantastic) hand-written rule for
# sum.
-function _sum(x::AbstractArray{<:Real})
+function _sum(f::F, x::AbstractArray{<:Real}) where {F}
     y = 0.0
     n = 0
     while n < length(x)
         n += 1
-        y += x[n]
+        y += f(x[n])
     end
     return y
 end
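A quick sanity check on the new signature, assuming nothing beyond Base: `_sum(identity, x)` should agree with `sum(x)`, and `_sum(sin, x)` with `sum(sin, x)`. The definition is copied from the diff above so that the block is self-contained; the input vector and the tolerance are arbitrary choices made for illustration.

```julia
# New version of `_sum` from this commit: applies `f` to each element
# before accumulating, using a plain loop so that no AD rule applies.
function _sum(f::F, x::AbstractArray{<:Real}) where {F}
    y = 0.0
    n = 0
    while n < length(x)
        n += 1
        y += f(x[n])
    end
    return y
end

# Deterministic input (an illustrative choice, not taken from the suite).
x = collect(range(0.0, 1.0; length=1_000))

# The hand-written loop and Base's `sum` add in a different order, so
# compare with a tolerance rather than with `==`.
same_plain = isapprox(_sum(identity, x), sum(x); atol=1e-8)
same_sin = isapprox(_sum(sin, x), sum(sin, x); atol=1e-8)
println((same_plain, same_sin))
```

Passing `identity` recovers the behaviour of the old one-argument `_sum`, which is how the `_sum_1000` test case in this commit uses it.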
@@ -137,8 +137,10 @@ an array.
"""
 function generate_inter_framework_tests()
     return Any[
-        ("sum", (sum, randn(100))),
-        ("_sum", (_sum, randn(100))),
+        ("sum_1000", (sum, randn(1_000))),
+        ("_sum_1000", (x -> _sum(identity, x), randn(1_000))),
+        ("sum_sin_1000", (x -> sum(sin, x), randn(1_000))),
+        ("_sum_sin_1000", (x -> _sum(sin, x), randn(1_000))),
         ("kron_sum", (_kron_sum, randn(20, 20), randn(40, 40))),
         ("kron_view_sum", (_kron_view_sum, randn(40, 30), randn(40, 40))),
         ("naive_map_sin_cos_exp", (_naive_map_sin_cos_exp, randn(10, 10))),
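Each entry in the list above is a `(name, (f, args...))` tuple. As a rough sketch of how such entries could be consumed, the loop below splits each tuple into the function and its arguments and calls it. How the real benchmark runner does this is not shown in the diff, so the loop and the `results` dictionary are assumptions made for illustration.

```julia
# Two entries in the same shape as those in generate_inter_framework_tests.
tests = Any[
    ("sum_1000", (sum, randn(1_000))),
    ("sum_sin_1000", (x -> sum(sin, x), randn(1_000))),
]

# Hypothetical consumer: evaluate each test function on its arguments.
results = Dict{String,Float64}()
for (name, spec) in tests
    f, args = spec[1], spec[2:end]  # first element is the function, the rest are its arguments
    results[name] = f(args...)
end

println(sort(collect(keys(results))))
```

Keeping the arguments in the tuple next to the function lets entries with different numbers of arguments, such as the two-matrix `kron_sum` case, share the same loop.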
