
perf: cut allocations #691

Merged (7 commits) on Dec 7, 2023

@das7pad das7pad commented Aug 5, 2022

Summary of Changes

Hello!

I found a few "low hanging" allocations that can be deferred until needed or
even skipped entirely.
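
As a rough illustration of "deferring an allocation until needed" (not the code from this PR), the sketch below allocates a vars map only on the first write. The `match` type and its methods are hypothetical names made up for this example; the only Go behavior it relies on is that reading from a nil map is safe.

```go
package main

import "fmt"

// match is a hypothetical stand-in for a per-request match result.
type match struct {
	vars map[string]string // stays nil until the first variable is stored
}

// setVar allocates the map lazily, on the first write only.
func (m *match) setVar(key, value string) {
	if m.vars == nil {
		m.vars = make(map[string]string)
	}
	m.vars[key] = value
}

// getVar is safe on a nil map: reading a missing key yields "".
func (m *match) getVar(key string) string {
	return m.vars[key]
}

func main() {
	// Static route: no variables are stored, so the map is never allocated.
	static := &match{}
	fmt.Println(static.vars == nil, static.getVar("id"))

	// Dynamic route: the first setVar call pays for the allocation.
	dynamic := &match{}
	dynamic.setVar("id", "42")
	fmt.Println(dynamic.vars == nil, dynamic.getVar("id"))
}
```

A static route never calls `setVar`, so it never pays for the map.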

With all the optimizations combined, we can see the best improvement for
simple, static routes (like a /status endpoint) that do not read the Route
from the request context via CurrentRoute and do not populate any vars.
On these routes, we can process requests with a single allocation for the
RouteMatch object. Previously, these routes needed 9 allocations.
For these routes, the processing overhead (ns/op) in mux dropped by about 75%,
which is a speedup of roughly 4x.
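
For readers who want to check the arithmetic, here is a small sketch of how a relative reduction maps to a speedup factor. The helper names are made up for this example. The inputs are the rounded MuxSimple/omit_route_from_ctx values from the tables below (655ns to 146ns, 9 to 1 allocs/op), so the time delta can differ slightly from what benchstat reports.

```go
package main

import "fmt"

// deltaPercent returns the relative change from before to after in percent,
// e.g. -75 for a value that dropped to a quarter of its old size.
func deltaPercent(before, after float64) float64 {
	return (after - before) / before * 100
}

// speedup returns how many times faster "after" is compared to "before".
func speedup(before, after float64) float64 {
	return before / after
}

func main() {
	// Rounded ns/op values taken from the benchmark tables.
	fmt.Printf("time:   %.2f%% (%.1fx)\n", deltaPercent(655, 146), speedup(655, 146))
	// Allocations per operation: 9 before, 1 after.
	fmt.Printf("allocs: %.2f%%\n", deltaPercent(9, 1))
}
```

A 75% reduction leaves a quarter of the original time, hence the factor of 4.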

Other routes can expect to see a double-digit percentage reduction in both
processing overhead (ns/op) and allocations as well.
These are driven by merging the context population into a single operation,
eliminating two of ten allocations.
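
To make the idea of a merged context population more concrete, here is a minimal sketch using only the standard library. It is not the implementation from this PR: the key names, the `matchInfo` struct and the string-typed route are assumptions for illustration. The point is that every `context.WithValue` plus `Request.WithContext` pair creates a new context and a new shallow copy of the request, so doing it once instead of twice saves work.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type ctxKey int

const (
	varsKey ctxKey = iota
	routeKey
	matchKey
)

// matchInfo bundles everything a handler might look up later.
type matchInfo struct {
	vars  map[string]string
	route string // hypothetical: a route name instead of a real Route value
}

// populateSeparately stores vars and route one after the other. Each step
// creates a new context and a new shallow copy of the request.
func populateSeparately(r *http.Request, vars map[string]string, route string) *http.Request {
	r = r.WithContext(context.WithValue(r.Context(), varsKey, vars))
	return r.WithContext(context.WithValue(r.Context(), routeKey, route))
}

// populateMerged stores both values behind a single key, so the context
// and the request are derived only once.
func populateMerged(r *http.Request, vars map[string]string, route string) *http.Request {
	info := &matchInfo{vars: vars, route: route}
	return r.WithContext(context.WithValue(r.Context(), matchKey, info))
}

// varsFrom reads the vars back from a request populated by populateMerged.
func varsFrom(r *http.Request) map[string]string {
	if info, ok := r.Context().Value(matchKey).(*matchInfo); ok {
		return info.vars
	}
	return nil
}

// demoRequest builds a request for the examples below.
func demoRequest() *http.Request {
	return httptest.NewRequest(http.MethodGet, "/users/42", nil)
}

func main() {
	vars := map[string]string{"id": "42"}

	merged := populateMerged(demoRequest(), vars, "user")
	fmt.Println(varsFrom(merged)["id"])

	separate := populateSeparately(demoRequest(), vars, "user")
	fmt.Println(separate.Context().Value(routeKey))
}
```

Both variants expose the same information to handlers; the merged one just derives the request once.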

(Eliminating that last allocation for the RouteMatch in the best case
requires significant refactoring to maintain full backwards compatibility.
Something for another day.)

Each commit message contains benchmark results that showcase a particular
(micro) optimization: reduced allocations and, in a few cases, notable direct
CPU time savings.
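
For anyone unfamiliar with how ns/op and allocs/op are obtained, the sketch below measures a static route with `testing.Benchmark`. It uses the standard library `http.ServeMux` as a stand-in for the router under test, and the helper names are made up for this example; the benchmarks in this repository are regular `go test -bench` benchmarks instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing"
)

// staticMux returns a stdlib router with a single static route. It is only
// a stand-in for the router under test.
func staticMux() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) {})
	return mux
}

// benchmarkStaticRoute measures routing a request for a static path through
// the given handler and records allocations alongside the timing.
func benchmarkStaticRoute(h http.Handler) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		req := httptest.NewRequest(http.MethodGet, "/status", nil)
		w := httptest.NewRecorder()
		b.ReportAllocs()
		b.ResetTimer() // exclude the setup above from the measurement
		for i := 0; i < b.N; i++ {
			h.ServeHTTP(w, req)
		}
	})
}

func main() {
	res := benchmarkStaticRoute(staticMux())
	fmt.Printf("%d ns/op, %d B/op, %d allocs/op\n",
		res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

The absolute numbers depend on the machine and Go version, which is why the results below always compare two branches on the same host.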

I also ran longer benchmarks with 100 repetitions in multiple settings on
different generations of (server) CPUs.
First, the full set of benchmarks in this repository; second, the
popular benchmarks at https://github.com/julienschmidt/go-http-routing-benchmark.

All but the last change are entirely "free", as in they do not cut features for
gains in performance. The last change, omitting the Route from the
context, is behind an optional flag that users can opt into when they do not
read the Route from the request context.
The flag is stored locally in each Router, so users can enable or disable it
on Subrouters individually.
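
The per-router scoping can be pictured with the following stripped-down sketch. The `router` type, its field and the `subrouter` method are hypothetical and only model the flag; that a subrouter starts with a copy of its parent's setting is an assumption of this sketch, not a statement about the real API.

```go
package main

import "fmt"

// router is a hypothetical, stripped-down router that only models the flag.
type router struct {
	omitRouteFromContext bool
}

// subrouter copies the parent's setting at creation time (an assumption of
// this sketch); later changes on either side do not affect the other.
func (r *router) subrouter() *router {
	return &router{omitRouteFromContext: r.omitRouteFromContext}
}

func main() {
	root := &router{}
	api := root.subrouter()
	api.omitRouteFromContext = true // opt in for the hot API routes only

	fmt.Println(root.omitRouteFromContext, api.omitRouteFromContext) // false true
}
```

Handlers under `root` can keep reading the Route from the request context, while handlers under `api` give that up in exchange for fewer allocations.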

Benchmark results

I added all the new benchmarks onto a baseline branch for comparing the
performance of the changes, tip is 0eba4f5.

I'm running these tests on "shared" compute instances (and my laptop), so
expect some noise (and frequency scaling on the i7).

mux project benchmarks

You can reproduce these benchmarks using docker, pinned to CPU 1:

docker run --rm --pull always -v /logs:/logs --cpuset-cpus 1 -d golang:1.18 bash -exc '
  git clone https://github.com/das7pad/mux.git &&
  cd mux &&
  for branch in baseline perf-cut-allocations; do
    git checkout "$branch" &&
    go test -benchmem -bench . -count 100 -timeout 1h > "/logs/$branch-all.txt"
  done
  go install golang.org/x/perf/cmd/benchstat@latest
  benchstat /logs/baseline-all.txt /logs/perf-cut-allocations-all.txt > /logs/compare-all.txt
'

Modern Xeon E, 3.4 GHz
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz

name                                 old time/op    new time/op    delta
Mux                                    1.09µs ± 5%    0.96µs ± 6%  -12.33%  (p=0.000 n=90+90)
MuxSimple/default                       654ns ± 5%     333ns ± 4%  -49.06%  (p=0.000 n=90+94)
MuxSimple/omit_route_from_ctx           655ns ± 5%     146ns ± 3%  -77.78%  (p=0.000 n=93+95)
MuxAlternativeInRegexp                 1.58µs ± 5%    1.30µs ± 3%  -17.75%  (p=0.000 n=92+90)
ManyPathVariables                      1.82µs ± 6%    1.65µs ± 5%   -9.73%  (p=0.000 n=94+93)
PopulateContext/no_populated_vars       665ns ± 5%     339ns ± 5%  -48.97%  (p=0.000 n=95+96)
PopulateContext/empty_var               910ns ± 5%     763ns ± 6%  -16.16%  (p=0.000 n=84+87)
PopulateContext/populated_vars          974ns ±10%     807ns ± 7%  -17.13%  (p=0.000 n=96+89)
PopulateContext/omit_route_/static      664ns ± 5%     148ns ± 3%  -77.66%  (p=0.000 n=94+89)
PopulateContext/omit_route_/dynamic     915ns ± 6%     741ns ± 6%  -18.98%  (p=0.000 n=90+94)
_findQueryKey/0                         158ns ± 2%     146ns ± 2%   -7.07%  (p=0.000 n=91+88)
_findQueryKey/1                         198ns ± 4%     196ns ± 5%   -0.56%  (p=0.001 n=92+95)
_findQueryKey/2                         710ns ± 4%     705ns ± 4%   -0.80%  (p=0.000 n=94+94)
_findQueryKey/3                         804ns ± 4%     798ns ± 5%   -0.82%  (p=0.000 n=90+94)
_findQueryKey/4                        3.85ns ± 2%    3.85ns ± 3%     ~     (p=0.224 n=89+93)
_findQueryKeyGoLib/0                    664ns ± 4%     668ns ± 5%   +0.67%  (p=0.030 n=97+94)
_findQueryKeyGoLib/1                    359ns ± 6%     359ns ± 6%     ~     (p=0.830 n=94+93)
_findQueryKeyGoLib/2                   2.24µs ± 3%    2.23µs ± 4%   -0.37%  (p=0.041 n=95+93)
_findQueryKeyGoLib/3                   2.94µs ± 4%    2.92µs ± 3%   -0.68%  (p=0.001 n=90+94)
_findQueryKeyGoLib/4                   3.63ns ± 4%    3.62ns ± 3%   -0.38%  (p=0.014 n=91+88)

name                                 old alloc/op   new alloc/op   delta
Mux                                    1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
MuxSimple/default                      1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
MuxSimple/omit_route_from_ctx          1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
MuxAlternativeInRegexp                 2.62kB ± 0%    1.82kB ± 0%  -30.49%  (p=0.000 n=100+100)
ManyPathVariables                      1.53kB ± 1%    1.13kB ± 0%  -26.45%  (p=0.000 n=99+93)
PopulateContext/no_populated_vars      1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
PopulateContext/empty_var              1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
PopulateContext/populated_vars         1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
PopulateContext/omit_route_/static     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
PopulateContext/omit_route_/dynamic    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
_findQueryKey/0                         0.00B          0.00B          ~     (all equal)
_findQueryKey/1                         40.0B ± 0%     40.0B ± 0%     ~     (all equal)
_findQueryKey/2                          483B ± 0%      483B ± 0%     ~     (all equal)
_findQueryKey/3                          543B ± 0%      543B ± 0%     ~     (all equal)
_findQueryKey/4                         0.00B          0.00B          ~     (all equal)
_findQueryKeyGoLib/0                     864B ± 0%      864B ± 0%     ~     (all equal)
_findQueryKeyGoLib/1                     432B ± 0%      432B ± 0%     ~     (all equal)
_findQueryKeyGoLib/2                   1.54kB ± 0%    1.54kB ± 0%     ~     (all equal)
_findQueryKeyGoLib/3                   1.98kB ± 0%    1.98kB ± 0%     ~     (all equal)
_findQueryKeyGoLib/4                    0.00B          0.00B          ~     (all equal)

name                                 old allocs/op  new allocs/op  delta
Mux                                      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
MuxSimple/default                        9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
MuxSimple/omit_route_from_ctx            9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
MuxAlternativeInRegexp                   20.0 ± 0%      16.0 ± 0%  -20.00%  (p=0.000 n=100+100)
ManyPathVariables                        14.0 ± 0%      12.0 ± 0%  -14.29%  (p=0.000 n=100+100)
PopulateContext/no_populated_vars        9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
PopulateContext/empty_var                11.0 ± 0%       9.0 ± 0%  -18.18%  (p=0.000 n=100+100)
PopulateContext/populated_vars           10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
PopulateContext/omit_route_/static       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
PopulateContext/omit_route_/dynamic      11.0 ± 0%       8.0 ± 0%  -27.27%  (p=0.000 n=100+100)
_findQueryKey/0                          0.00           0.00          ~     (all equal)
_findQueryKey/1                          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
_findQueryKey/2                          10.0 ± 0%      10.0 ± 0%     ~     (all equal)
_findQueryKey/3                          11.0 ± 0%      11.0 ± 0%     ~     (all equal)
_findQueryKey/4                          0.00           0.00          ~     (all equal)
_findQueryKeyGoLib/0                     8.00 ± 0%      8.00 ± 0%     ~     (all equal)
_findQueryKeyGoLib/1                     4.00 ± 0%      4.00 ± 0%     ~     (all equal)
_findQueryKeyGoLib/2                     24.0 ± 0%      24.0 ± 0%     ~     (all equal)
_findQueryKeyGoLib/3                     28.0 ± 0%      28.0 ± 0%     ~     (all equal)
_findQueryKeyGoLib/4                     0.00           0.00          ~     (all equal)

Older Xeon Gold, 2.3 GHz

(The actual CPU is an Intel(R) Xeon(R) Gold 5122 CPU @ 2.30GHz; the identifier reported below is generic.)

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel Xeon Processor (Skylake, IBRS) 

name                                 old time/op    new time/op    delta
Mux                                    1.61µs ± 6%    1.43µs ± 6%  -11.38%  (p=0.000 n=97+96)
MuxSimple/default                       979ns ± 8%     508ns ± 7%  -48.15%  (p=0.000 n=93+98)
MuxSimple/omit_route_from_ctx          1.00µs ±12%    0.22µs ± 4%  -77.76%  (p=0.000 n=99+96)
MuxAlternativeInRegexp                 2.40µs ± 9%    2.03µs ±12%  -15.15%  (p=0.000 n=94+98)
ManyPathVariables                      2.77µs ± 8%    2.60µs ±12%   -6.17%  (p=0.000 n=94+99)
PopulateContext/no_populated_vars      1.03µs ±11%    0.54µs ±13%  -47.57%  (p=0.000 n=94+98)
PopulateContext/empty_var              1.35µs ± 6%    1.23µs ±11%   -9.01%  (p=0.000 n=95+96)
PopulateContext/populated_vars         1.40µs ± 8%    1.30µs ±14%   -6.98%  (p=0.000 n=97+99)
PopulateContext/omit_route_/static      989ns ± 7%     239ns ±10%  -75.78%  (p=0.000 n=99+99)
PopulateContext/omit_route_/dynamic    1.34µs ± 7%    1.21µs ±15%  -10.04%  (p=0.000 n=95+99)
_findQueryKey/0                         242ns ± 8%     228ns ± 8%   -6.00%  (p=0.000 n=97+95)
_findQueryKey/1                         304ns ± 7%     309ns ± 9%   +1.60%  (p=0.002 n=97+97)
_findQueryKey/2                        1.08µs ± 8%    1.14µs ±10%   +5.28%  (p=0.000 n=99+97)
_findQueryKey/3                        1.20µs ± 6%    1.23µs ± 9%   +2.03%  (p=0.000 n=92+99)
_findQueryKey/4                        5.86ns ± 4%    5.89ns ± 5%   +0.56%  (p=0.041 n=93+96)
_findQueryKeyGoLib/0                    992ns ± 7%    1067ns ±11%   +7.59%  (p=0.000 n=94+98)
_findQueryKeyGoLib/1                    528ns ± 6%     571ns ±11%   +8.04%  (p=0.000 n=97+100)
_findQueryKeyGoLib/2                   3.46µs ± 6%    3.60µs ±10%   +4.18%  (p=0.000 n=96+95)
_findQueryKeyGoLib/3                   4.54µs ± 6%    4.75µs ±11%   +4.62%  (p=0.000 n=92+98)
_findQueryKeyGoLib/4                   5.50ns ± 4%    5.62ns ± 7%   +2.14%  (p=0.000 n=91+96)

name                                 old alloc/op   new alloc/op   delta
Mux                                    1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
MuxSimple/default                      1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
MuxSimple/omit_route_from_ctx          1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
MuxAlternativeInRegexp                 2.62kB ± 0%    1.82kB ± 0%  -30.49%  (p=0.000 n=100+100)
ManyPathVariables                      1.52kB ± 0%    1.12kB ± 0%  -26.50%  (p=0.000 n=92+97)
PopulateContext/no_populated_vars      1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
PopulateContext/empty_var              1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
PopulateContext/populated_vars         1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
PopulateContext/omit_route_/static     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
PopulateContext/omit_route_/dynamic    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
_findQueryKey/0                         0.00B          0.00B          ~     (all equal)
_findQueryKey/1                         40.0B ± 0%     40.0B ± 0%     ~     (all equal)
_findQueryKey/2                          483B ± 0%      483B ± 0%     ~     (all equal)
_findQueryKey/3                          543B ± 0%      543B ± 0%     ~     (all equal)
_findQueryKey/4                         0.00B          0.00B          ~     (all equal)
_findQueryKeyGoLib/0                     864B ± 0%      864B ± 0%     ~     (all equal)
_findQueryKeyGoLib/1                     432B ± 0%      432B ± 0%     ~     (all equal)
_findQueryKeyGoLib/2                   1.54kB ± 0%    1.54kB ± 0%     ~     (all equal)
_findQueryKeyGoLib/3                   1.98kB ± 0%    1.98kB ± 0%     ~     (all equal)
_findQueryKeyGoLib/4                    0.00B          0.00B          ~     (all equal)

name                                 old allocs/op  new allocs/op  delta
Mux                                      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
MuxSimple/default                        9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
MuxSimple/omit_route_from_ctx            9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
MuxAlternativeInRegexp                   20.0 ± 0%      16.0 ± 0%  -20.00%  (p=0.000 n=100+100)
ManyPathVariables                        14.0 ± 0%      12.0 ± 0%  -14.29%  (p=0.000 n=100+100)
PopulateContext/no_populated_vars        9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
PopulateContext/empty_var                11.0 ± 0%       9.0 ± 0%  -18.18%  (p=0.000 n=100+100)
PopulateContext/populated_vars           10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
PopulateContext/omit_route_/static       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
PopulateContext/omit_route_/dynamic      11.0 ± 0%       8.0 ± 0%  -27.27%  (p=0.000 n=100+100)
_findQueryKey/0                          0.00           0.00          ~     (all equal)
_findQueryKey/1                          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
_findQueryKey/2                          10.0 ± 0%      10.0 ± 0%     ~     (all equal)
_findQueryKey/3                          11.0 ± 0%      11.0 ± 0%     ~     (all equal)
_findQueryKey/4                          0.00           0.00          ~     (all equal)
_findQueryKeyGoLib/0                     8.00 ± 0%      8.00 ± 0%     ~     (all equal)
_findQueryKeyGoLib/1                     4.00 ± 0%      4.00 ± 0%     ~     (all equal)
_findQueryKeyGoLib/2                     24.0 ± 0%      24.0 ± 0%     ~     (all equal)
_findQueryKeyGoLib/3                     28.0 ± 0%      28.0 ± 0%     ~     (all equal)
_findQueryKeyGoLib/4                     0.00           0.00          ~     (all equal)

Older Xeon E, 2.4 GHz
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz

name                                 old time/op    new time/op    delta
Mux                                    1.81µs ±13%    1.57µs ±10%  -13.50%  (p=0.000 n=99+97)
MuxSimple/default                      1.09µs ± 8%    0.55µs ±10%  -49.51%  (p=0.000 n=96+95)
MuxSimple/omit_route_from_ctx          1.08µs ±10%    0.25µs ± 8%  -76.66%  (p=0.000 n=95+94)
MuxAlternativeInRegexp                 2.59µs ±12%    2.14µs ±13%  -17.06%  (p=0.000 n=95+92)
ManyPathVariables                      3.11µs ± 9%    2.78µs ±10%  -10.62%  (p=0.000 n=96+99)
PopulateContext/no_populated_vars      1.09µs ± 9%    0.56µs ± 7%  -48.63%  (p=0.000 n=94+93)
PopulateContext/empty_var              1.48µs ± 9%    1.28µs ±12%  -13.52%  (p=0.000 n=97+96)
PopulateContext/populated_vars         1.56µs ±11%    1.34µs ±10%  -13.84%  (p=0.000 n=96+97)
PopulateContext/omit_route_/static     1.08µs ± 9%    0.26µs ± 9%  -75.89%  (p=0.000 n=98+97)
PopulateContext/omit_route_/dynamic    1.50µs ±11%    1.20µs ± 9%  -19.96%  (p=0.000 n=94+93)
_findQueryKey/0                         254ns ± 8%     243ns ± 8%   -4.44%  (p=0.000 n=96+91)
_findQueryKey/1                         336ns ± 7%     336ns ± 8%     ~     (p=0.920 n=96+95)
_findQueryKey/2                        1.25µs ±12%    1.27µs ±14%   +1.55%  (p=0.043 n=97+98)
_findQueryKey/3                        1.37µs ± 8%    1.38µs ± 9%   +0.97%  (p=0.030 n=97+95)
_findQueryKey/4                        5.93ns ± 6%    5.97ns ± 7%     ~     (p=0.074 n=97+97)
_findQueryKeyGoLib/0                   1.07µs ± 9%    1.13µs ±10%   +4.99%  (p=0.000 n=96+95)
_findQueryKeyGoLib/1                    581ns ± 8%     586ns ± 9%     ~     (p=0.089 n=96+99)
_findQueryKeyGoLib/2                   3.93µs ±12%    3.80µs ±10%   -3.31%  (p=0.000 n=96+91)
_findQueryKeyGoLib/3                   5.18µs ±10%    5.06µs ± 8%   -2.27%  (p=0.000 n=98+99)
_findQueryKeyGoLib/4                   6.52ns ± 6%    6.49ns ± 4%     ~     (p=0.232 n=96+94)

name                                 old alloc/op   new alloc/op   delta
Mux                                    1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
MuxSimple/default                      1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
MuxSimple/omit_route_from_ctx          1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
MuxAlternativeInRegexp                 2.62kB ± 0%    1.82kB ± 0%  -30.49%  (p=0.000 n=100+100)
ManyPathVariables                      1.52kB ± 0%    1.12kB ± 1%  -26.58%  (p=0.000 n=92+96)
PopulateContext/no_populated_vars      1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
PopulateContext/empty_var              1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
PopulateContext/populated_vars         1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
PopulateContext/omit_route_/static     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
PopulateContext/omit_route_/dynamic    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
_findQueryKey/0                         0.00B          0.00B          ~     (all equal)
_findQueryKey/1                         40.0B ± 0%     40.0B ± 0%     ~     (all equal)
_findQueryKey/2                          483B ± 0%      483B ± 0%     ~     (all equal)
_findQueryKey/3                          543B ± 0%      543B ± 0%     ~     (all equal)
_findQueryKey/4                         0.00B          0.00B          ~     (all equal)
_findQueryKeyGoLib/0                     864B ± 0%      864B ± 0%     ~     (all equal)
_findQueryKeyGoLib/1                     432B ± 0%      432B ± 0%     ~     (all equal)
_findQueryKeyGoLib/2                   1.54kB ± 0%    1.54kB ± 0%     ~     (all equal)
_findQueryKeyGoLib/3                   1.98kB ± 0%    1.98kB ± 0%     ~     (all equal)
_findQueryKeyGoLib/4                    0.00B          0.00B          ~     (all equal)

name                                 old allocs/op  new allocs/op  delta
Mux                                      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
MuxSimple/default                        9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
MuxSimple/omit_route_from_ctx            9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
MuxAlternativeInRegexp                   20.0 ± 0%      16.0 ± 0%  -20.00%  (p=0.000 n=100+100)
ManyPathVariables                        14.0 ± 0%      12.0 ± 0%  -14.29%  (p=0.000 n=100+100)
PopulateContext/no_populated_vars        9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
PopulateContext/empty_var                11.0 ± 0%       9.0 ± 0%  -18.18%  (p=0.000 n=100+100)
PopulateContext/populated_vars           10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
PopulateContext/omit_route_/static       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
PopulateContext/omit_route_/dynamic      11.0 ± 0%       8.0 ± 0%  -27.27%  (p=0.000 n=100+100)
_findQueryKey/0                          0.00           0.00          ~     (all equal)
_findQueryKey/1                          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
_findQueryKey/2                          10.0 ± 0%      10.0 ± 0%     ~     (all equal)
_findQueryKey/3                          11.0 ± 0%      11.0 ± 0%     ~     (all equal)
_findQueryKey/4                          0.00           0.00          ~     (all equal)
_findQueryKeyGoLib/0                     8.00 ± 0%      8.00 ± 0%     ~     (all equal)
_findQueryKeyGoLib/1                     4.00 ± 0%      4.00 ± 0%     ~     (all equal)
_findQueryKeyGoLib/2                     24.0 ± 0%      24.0 ± 0%     ~     (all equal)
_findQueryKeyGoLib/3                     28.0 ± 0%      28.0 ± 0%     ~     (all equal)
_findQueryKeyGoLib/4                     0.00           0.00          ~     (all equal)


Popular go-http-routing-benchmark

I pushed three branches for comparison to my fork:
https://github.com/das7pad/go-http-routing-benchmark

  • before, this is the baseline branch mentioned above
  • after, this is the PR revision
  • after-omit-route, like after with the OmitRouteFromContext flag enabled

You can reproduce these benchmarks using docker, pinned to CPU 1:

docker run --rm --pull always -v /logs:/logs --cpuset-cpus 1 -d golang:1.18 bash -exc '
  git clone https://github.com/das7pad/go-http-routing-benchmark.git &&
  cd go-http-routing-benchmark &&
  for branch in before after after-omit-route; do
    git checkout "$branch" &&
    go test -benchmem -bench Gorilla -count 100 -timeout 1h > "/logs/$branch.txt"
  done
  go install golang.org/x/perf/cmd/benchstat@latest
  benchstat /logs/before.txt /logs/after.txt > /logs/compare-before-vs-after.txt
  benchstat /logs/before.txt /logs/after-omit-route.txt > /logs/compare-before-vs-after-omit-route.txt
'

Modern Xeon E, 3.4 GHz

Before vs After with omit Route flag enabled

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz

name                     old time/op    new time/op    delta
GorillaMux_Param           1.86µs ±13%    1.37µs ± 6%  -26.08%  (p=0.000 n=94+96)
GorillaMux_Param5          2.61µs ± 4%    2.20µs ± 4%  -15.92%  (p=0.000 n=89+95)
GorillaMux_Param20         6.13µs ± 3%    3.98µs ± 4%  -35.07%  (p=0.000 n=90+94)
GorillaMux_ParamWrite      1.87µs ± 5%    1.43µs ± 8%  -23.51%  (p=0.000 n=95+92)
GorillaMux_GithubStatic    3.77µs ± 5%    2.43µs ± 4%  -35.40%  (p=0.000 n=96+96)
GorillaMux_GithubParam     5.68µs ± 6%    5.21µs ± 5%   -8.32%  (p=0.000 n=93+93)
GorillaMux_GithubAll       2.86ms ± 6%    2.71ms ± 4%   -5.44%  (p=0.000 n=99+96)
GorillaMux_GPlusStatic     1.31µs ± 5%    0.21µs ± 4%  -84.29%  (p=0.000 n=92+95)
GorillaMux_GPlusParam      2.37µs ± 3%    1.96µs ± 2%  -17.42%  (p=0.000 n=96+94)
GorillaMux_GPlus2Params    4.43µs ± 3%    3.90µs ± 3%  -11.89%  (p=0.000 n=91+90)
GorillaMux_GPlusAll        37.1µs ± 6%    28.9µs ± 5%  -22.26%  (p=0.000 n=93+92)
GorillaMux_ParseStatic     1.55µs ± 5%    0.42µs ± 4%  -72.78%  (p=0.000 n=95+91)
GorillaMux_ParseParam      1.88µs ± 7%    1.41µs ± 7%  -24.66%  (p=0.000 n=94+93)
GorillaMux_Parse2Params    2.21µs ± 4%    1.74µs ± 6%  -21.16%  (p=0.000 n=94+96)
GorillaMux_ParseAll        71.8µs ± 4%    51.3µs ± 4%  -28.51%  (p=0.000 n=93+94)
GorillaMux_StaticAll        771µs ± 4%     492µs ± 3%  -36.17%  (p=0.000 n=95+90)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_Param5          1.38kB ± 0%    0.93kB ± 0%  -32.56%  (p=0.000 n=100+100)
GorillaMux_Param20         3.48kB ± 0%    2.10kB ± 0%  -39.86%  (p=0.000 n=100+80)
GorillaMux_ParamWrite      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_GithubStatic    1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_GithubParam     1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_GithubAll        258kB ± 0%     149kB ± 0%  -42.37%  (p=0.000 n=86+99)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_GPlusParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_GPlusAll        16.5kB ± 0%     9.7kB ± 0%  -41.43%  (p=0.000 n=100+100)
GorillaMux_ParseStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_ParseParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_Parse2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_ParseAll        31.1kB ± 0%    14.4kB ± 0%  -53.88%  (p=0.000 n=100+100)
GorillaMux_StaticAll        158kB ± 0%       8kB ± 0%  -95.24%  (p=0.000 n=100+100)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Param5            10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Param20           12.0 ± 0%       7.0 ± 0%  -41.67%  (p=0.000 n=100+100)
GorillaMux_ParamWrite        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GithubStatic      9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_GithubParam       10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GithubAll        1.99k ± 0%     1.21k ± 0%  -39.57%  (p=0.000 n=100+100)
GorillaMux_GPlusStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_GPlusParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GPlusAll           128 ± 0%        79 ± 0%  -38.28%  (p=0.000 n=100+100)
GorillaMux_ParseStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_ParseParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Parse2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_ParseAll           250 ± 0%       122 ± 0%  -51.20%  (p=0.000 n=100+100)
GorillaMux_StaticAll        1.41k ± 0%     0.16k ± 0%  -88.89%  (p=0.000 n=100+100)

Before vs After

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz

name                     old time/op    new time/op    delta
GorillaMux_Param           1.86µs ±13%    1.47µs ±11%  -21.05%  (p=0.000 n=94+93)
GorillaMux_Param5          2.61µs ± 4%    2.25µs ± 2%  -13.64%  (p=0.000 n=89+90)
GorillaMux_Param20         6.13µs ± 3%    4.05µs ± 4%  -33.96%  (p=0.000 n=90+91)
GorillaMux_ParamWrite      1.87µs ± 5%    1.47µs ± 4%  -21.03%  (p=0.000 n=95+96)
GorillaMux_GithubStatic    3.77µs ± 5%    3.16µs ± 3%  -16.11%  (p=0.000 n=96+95)
GorillaMux_GithubParam     5.68µs ± 6%    5.31µs ± 6%   -6.57%  (p=0.000 n=93+94)
GorillaMux_GithubAll       2.86ms ± 6%    2.72ms ± 4%   -4.97%  (p=0.000 n=99+95)
GorillaMux_GPlusStatic     1.31µs ± 5%    0.66µs ± 6%  -49.50%  (p=0.000 n=92+95)
GorillaMux_GPlusParam      2.37µs ± 3%    2.04µs ± 4%  -13.87%  (p=0.000 n=96+90)
GorillaMux_GPlus2Params    4.43µs ± 3%    3.98µs ± 5%  -10.21%  (p=0.000 n=91+92)
GorillaMux_GPlusAll        37.1µs ± 6%    30.6µs ± 6%  -17.65%  (p=0.000 n=93+94)
GorillaMux_ParseStatic     1.55µs ± 5%    0.90µs ± 4%  -41.88%  (p=0.000 n=95+91)
GorillaMux_ParseParam      1.88µs ± 7%    1.47µs ± 6%  -21.53%  (p=0.000 n=94+95)
GorillaMux_Parse2Params    2.21µs ± 4%    1.80µs ± 5%  -18.26%  (p=0.000 n=94+90)
GorillaMux_ParseAll        71.8µs ± 4%    57.9µs ± 3%  -19.30%  (p=0.000 n=93+97)
GorillaMux_StaticAll        771µs ± 4%     664µs ± 4%  -13.87%  (p=0.000 n=95+95)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_Param5          1.38kB ± 0%    0.98kB ± 0%  -29.07%  (p=0.000 n=100+100)
GorillaMux_Param20         3.48kB ± 0%    2.14kB ± 0%  -38.48%  (p=0.000 n=100+83)
GorillaMux_ParamWrite      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_GithubStatic    1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_GithubParam     1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_GithubAll        258kB ± 0%     173kB ± 0%  -33.02%  (p=0.000 n=86+100)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_GPlusParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_GPlusAll        16.5kB ± 0%    11.1kB ± 0%  -32.82%  (p=0.000 n=100+100)
GorillaMux_ParseStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_ParseParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_Parse2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_ParseAll        31.1kB ± 0%    19.6kB ± 0%  -37.02%  (p=0.000 n=100+100)
GorillaMux_StaticAll        158kB ± 0%      78kB ± 0%  -50.79%  (p=0.000 n=100+100)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Param5            10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Param20           12.0 ± 0%       8.0 ± 0%  -33.33%  (p=0.000 n=100+100)
GorillaMux_ParamWrite        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GithubStatic      9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_GithubParam       10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GithubAll        1.99k ± 0%     1.48k ± 0%  -25.78%  (p=0.000 n=100+100)
GorillaMux_GPlusStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_GPlusParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GPlusAll           128 ± 0%        96 ± 0%  -25.00%  (p=0.000 n=100+100)
GorillaMux_ParseStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_ParseParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Parse2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_ParseAll           250 ± 0%       168 ± 0%  -32.80%  (p=0.000 n=100+100)
GorillaMux_StaticAll        1.41k ± 0%     0.63k ± 0%  -55.56%  (p=0.000 n=100+100)

Older Xeon Gold, 2.3 GHz

Before vs After with omit Route flag enabled

(The actual CPU is an Intel(R) Xeon(R) Gold 5122 CPU @ 2.30GHz; the identifier reported below is generic.)

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel Xeon Processor (Skylake, IBRS)

name                     old time/op    new time/op    delta
GorillaMux_Param           3.02µs ±14%    2.24µs ±12%  -25.60%  (p=0.000 n=100+99)
GorillaMux_Param5          4.20µs ±12%    3.45µs ±10%  -17.84%  (p=0.000 n=95+97)
GorillaMux_Param20         9.86µs ±12%    6.36µs ± 9%  -35.54%  (p=0.000 n=96+93)
GorillaMux_ParamWrite      2.97µs ± 7%    2.21µs ± 5%  -25.40%  (p=0.000 n=96+96)
GorillaMux_GithubStatic    5.82µs ± 6%    3.62µs ± 5%  -37.87%  (p=0.000 n=99+98)
GorillaMux_GithubParam     8.81µs ± 7%    7.90µs ± 5%  -10.37%  (p=0.000 n=97+95)
GorillaMux_GithubAll       3.93ms ± 4%    3.62ms ± 6%   -7.72%  (p=0.000 n=97+99)
GorillaMux_GPlusStatic     2.04µs ± 7%    0.33µs ± 9%  -83.77%  (p=0.000 n=98+99)
GorillaMux_GPlusParam      3.72µs ± 6%    3.14µs ±13%  -15.44%  (p=0.000 n=99+99)
GorillaMux_GPlus2Params    6.89µs ± 8%    6.57µs ± 9%   -4.70%  (p=0.000 n=92+98)
GorillaMux_GPlusAll        56.1µs ± 6%    46.5µs ±11%  -17.14%  (p=0.000 n=95+98)
GorillaMux_ParseStatic     2.43µs ± 6%    0.67µs ±11%  -72.49%  (p=0.000 n=96+99)
GorillaMux_ParseParam      2.91µs ± 8%    2.29µs ±12%  -21.07%  (p=0.000 n=98+97)
GorillaMux_Parse2Params    3.44µs ± 8%    2.84µs ± 8%  -17.53%  (p=0.000 n=98+95)
GorillaMux_ParseAll         111µs ± 5%      80µs ±11%  -27.91%  (p=0.000 n=96+98)
GorillaMux_StaticAll       1.13ms ± 4%    0.71ms ± 5%  -37.32%  (p=0.000 n=97+97)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_Param5          1.38kB ± 0%    0.93kB ± 0%  -32.56%  (p=0.000 n=100+100)
GorillaMux_Param20         3.48kB ± 0%    2.10kB ± 0%  -39.86%  (p=0.000 n=100+79)
GorillaMux_ParamWrite      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_GithubStatic    1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_GithubParam     1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_GithubAll        258kB ± 0%     149kB ± 0%  -42.37%  (p=0.000 n=80+96)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_GPlusParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_GPlusAll        16.5kB ± 0%     9.7kB ± 0%  -41.43%  (p=0.000 n=100+100)
GorillaMux_ParseStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_ParseParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_Parse2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_ParseAll        31.1kB ± 0%    14.4kB ± 0%  -53.88%  (p=0.000 n=100+100)
GorillaMux_StaticAll        158kB ± 0%       8kB ± 0%  -95.24%  (p=0.000 n=100+100)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Param5            10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Param20           12.0 ± 0%       7.0 ± 0%  -41.67%  (p=0.000 n=100+100)
GorillaMux_ParamWrite        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GithubStatic      9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_GithubParam       10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GithubAll        1.99k ± 0%     1.21k ± 0%  -39.57%  (p=0.000 n=100+100)
GorillaMux_GPlusStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_GPlusParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GPlusAll           128 ± 0%        79 ± 0%  -38.28%  (p=0.000 n=100+100)
GorillaMux_ParseStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_ParseParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Parse2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_ParseAll           250 ± 0%       122 ± 0%  -51.20%  (p=0.000 n=100+100)
GorillaMux_StaticAll        1.41k ± 0%     0.16k ± 0%  -88.89%  (p=0.000 n=100+100)

Before vs After

(The actual CPU model is Intel(R) Xeon(R) Gold 5122 CPU @ 2.30GHz.)

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel Xeon Processor (Skylake, IBRS) 

name                     old time/op    new time/op    delta
GorillaMux_Param           3.02µs ±14%    2.27µs ± 8%  -24.70%  (p=0.000 n=100+98)
GorillaMux_Param5          4.20µs ±12%    3.48µs ± 5%  -17.29%  (p=0.000 n=95+98)
GorillaMux_Param20         9.86µs ±12%    6.41µs ± 6%  -35.05%  (p=0.000 n=96+92)
GorillaMux_ParamWrite      2.97µs ± 7%    2.31µs ± 6%  -22.12%  (p=0.000 n=96+98)
GorillaMux_GithubStatic    5.82µs ± 6%    4.69µs ± 5%  -19.52%  (p=0.000 n=99+96)
GorillaMux_GithubParam     8.81µs ± 7%    7.98µs ± 4%   -9.48%  (p=0.000 n=97+96)
GorillaMux_GithubAll       3.93ms ± 4%    3.64ms ± 5%   -7.18%  (p=0.000 n=97+94)
GorillaMux_GPlusStatic     2.04µs ± 7%    1.05µs ±12%  -48.53%  (p=0.000 n=98+94)
GorillaMux_GPlusParam      3.72µs ± 6%    3.09µs ± 5%  -16.76%  (p=0.000 n=99+96)
GorillaMux_GPlus2Params    6.89µs ± 8%    6.18µs ± 5%  -10.28%  (p=0.000 n=92+95)
GorillaMux_GPlusAll        56.1µs ± 6%    46.0µs ± 5%  -17.98%  (p=0.000 n=95+93)
GorillaMux_ParseStatic     2.43µs ± 6%    1.44µs ± 5%  -40.52%  (p=0.000 n=96+95)
GorillaMux_ParseParam      2.91µs ± 8%    2.30µs ± 5%  -20.93%  (p=0.000 n=98+98)
GorillaMux_Parse2Params    3.44µs ± 8%    2.82µs ± 7%  -18.04%  (p=0.000 n=98+94)
GorillaMux_ParseAll         111µs ± 5%      91µs ±13%  -18.12%  (p=0.000 n=96+98)
GorillaMux_StaticAll       1.13ms ± 4%    0.95ms ± 9%  -16.38%  (p=0.000 n=97+96)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_Param5          1.38kB ± 0%    0.98kB ± 0%  -29.07%  (p=0.000 n=100+100)
GorillaMux_Param20         3.48kB ± 0%    2.14kB ± 0%  -38.49%  (p=0.000 n=100+100)
GorillaMux_ParamWrite      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_GithubStatic    1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_GithubParam     1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_GithubAll        258kB ± 0%     173kB ± 0%  -33.02%  (p=0.000 n=80+100)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_GPlusParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_GPlusAll        16.5kB ± 0%    11.1kB ± 0%  -32.82%  (p=0.000 n=100+100)
GorillaMux_ParseStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_ParseParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_Parse2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_ParseAll        31.1kB ± 0%    19.6kB ± 0%  -37.02%  (p=0.000 n=100+100)
GorillaMux_StaticAll        158kB ± 0%      78kB ± 0%  -50.79%  (p=0.000 n=100+95)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Param5            10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Param20           12.0 ± 0%       8.0 ± 0%  -33.33%  (p=0.000 n=100+100)
GorillaMux_ParamWrite        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GithubStatic      9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_GithubParam       10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GithubAll        1.99k ± 0%     1.48k ± 0%  -25.78%  (p=0.000 n=100+100)
GorillaMux_GPlusStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_GPlusParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GPlusAll           128 ± 0%        96 ± 0%  -25.00%  (p=0.000 n=100+100)
GorillaMux_ParseStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_ParseParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Parse2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_ParseAll           250 ± 0%       168 ± 0%  -32.80%  (p=0.000 n=100+100)
GorillaMux_StaticAll        1.41k ± 0%     0.63k ± 0%  -55.56%  (p=0.000 n=100+100)

Older Xeon E, 2.4 GHz

Before vs After with omit Route flag enabled

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz

name                     old time/op    new time/op    delta
GorillaMux_Param           3.31µs ±11%    2.44µs ±11%  -26.34%  (p=0.000 n=95+92)
GorillaMux_Param5          4.68µs ± 9%    3.96µs ±14%  -15.22%  (p=0.000 n=97+100)
GorillaMux_Param20         11.1µs ±10%     7.2µs ± 9%  -35.03%  (p=0.000 n=100+97)
GorillaMux_ParamWrite      3.37µs ±10%    2.56µs ± 8%  -24.21%  (p=0.000 n=95+94)
GorillaMux_GithubStatic    6.90µs ± 7%    3.85µs ± 8%  -44.16%  (p=0.000 n=96+91)
GorillaMux_GithubParam     9.85µs ± 6%    8.89µs ±11%   -9.74%  (p=0.000 n=98+96)
GorillaMux_GithubAll       5.15ms ±10%    4.77ms ± 7%   -7.36%  (p=0.000 n=96+96)
GorillaMux_GPlusStatic     2.38µs ± 8%    0.36µs ± 7%  -84.98%  (p=0.000 n=95+94)
GorillaMux_GPlusParam      4.30µs ±11%    3.33µs ±10%  -22.60%  (p=0.000 n=97+95)
GorillaMux_GPlus2Params    7.80µs ± 8%    6.82µs ±10%  -12.55%  (p=0.000 n=98+94)
GorillaMux_GPlusAll        65.8µs ±10%    50.5µs ±10%  -23.24%  (p=0.000 n=97+98)
GorillaMux_ParseStatic     2.98µs ±10%    0.70µs ± 9%  -76.68%  (p=0.000 n=99+94)
GorillaMux_ParseParam      3.31µs ± 9%    2.55µs ± 9%  -23.11%  (p=0.000 n=96+95)
GorillaMux_Parse2Params    3.92µs ± 8%    3.18µs ±12%  -18.86%  (p=0.000 n=97+96)
GorillaMux_ParseAll         125µs ± 9%      90µs ±12%  -28.23%  (p=0.000 n=99+96)
GorillaMux_StaticAll       1.32ms ± 6%    0.79ms ± 6%  -40.41%  (p=0.000 n=99+95)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_Param5          1.38kB ± 0%    0.93kB ± 0%  -32.56%  (p=0.000 n=100+100)
GorillaMux_Param20         3.48kB ± 0%    2.10kB ± 0%  -39.86%  (p=0.000 n=100+79)
GorillaMux_ParamWrite      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_GithubStatic    1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_GithubParam     1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_GithubAll        258kB ± 0%     149kB ± 0%  -42.37%  (p=0.000 n=79+86)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_GPlusParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_GPlusAll        16.5kB ± 0%     9.7kB ± 0%  -41.43%  (p=0.000 n=100+100)
GorillaMux_ParseStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=100+100)
GorillaMux_ParseParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=100+100)
GorillaMux_Parse2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=100+100)
GorillaMux_ParseAll        31.1kB ± 0%    14.4kB ± 0%  -53.88%  (p=0.000 n=100+100)
GorillaMux_StaticAll        158kB ± 0%       8kB ± 0%  -95.24%  (p=0.000 n=100+100)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Param5            10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Param20           12.0 ± 0%       7.0 ± 0%  -41.67%  (p=0.000 n=100+100)
GorillaMux_ParamWrite        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GithubStatic      9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_GithubParam       10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GithubAll        1.99k ± 0%     1.21k ± 0%  -39.57%  (p=0.000 n=100+100)
GorillaMux_GPlusStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_GPlusParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_GPlusAll           128 ± 0%        79 ± 0%  -38.28%  (p=0.000 n=100+100)
GorillaMux_ParseStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=100+100)
GorillaMux_ParseParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_Parse2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=100+100)
GorillaMux_ParseAll           250 ± 0%       122 ± 0%  -51.20%  (p=0.000 n=100+100)
GorillaMux_StaticAll        1.41k ± 0%     0.16k ± 0%  -88.89%  (p=0.000 n=100+100)

Before vs After

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz

name                     old time/op    new time/op    delta
GorillaMux_Param           3.31µs ±11%    2.59µs ±12%  -21.76%  (p=0.000 n=95+96)
GorillaMux_Param5          4.68µs ± 9%    3.80µs ± 7%  -18.79%  (p=0.000 n=97+92)
GorillaMux_Param20         11.1µs ±10%     7.2µs ± 9%  -35.19%  (p=0.000 n=100+99)
GorillaMux_ParamWrite      3.37µs ±10%    2.68µs ± 7%  -20.69%  (p=0.000 n=95+95)
GorillaMux_GithubStatic    6.90µs ± 7%    5.06µs ± 7%  -26.57%  (p=0.000 n=96+91)
GorillaMux_GithubParam     9.85µs ± 6%    8.83µs ± 9%  -10.42%  (p=0.000 n=98+98)
GorillaMux_GithubAll       5.15ms ±10%    4.71ms ± 9%   -8.50%  (p=0.000 n=96+97)
GorillaMux_GPlusStatic     2.38µs ± 8%    1.19µs ±10%  -50.24%  (p=0.000 n=95+100)
GorillaMux_GPlusParam      4.30µs ±11%    3.40µs ± 5%  -21.08%  (p=0.000 n=97+96)
GorillaMux_GPlus2Params    7.80µs ± 8%    6.96µs ± 8%  -10.71%  (p=0.000 n=98+98)
GorillaMux_GPlusAll        65.8µs ±10%    52.2µs ±10%  -20.65%  (p=0.000 n=97+96)
GorillaMux_ParseStatic     2.98µs ±10%    1.61µs ±10%  -46.04%  (p=0.000 n=99+100)
GorillaMux_ParseParam      3.31µs ± 9%    2.63µs ± 7%  -20.46%  (p=0.000 n=96+91)
GorillaMux_Parse2Params    3.92µs ± 8%    3.23µs ± 6%  -17.58%  (p=0.000 n=97+93)
GorillaMux_ParseAll         125µs ± 9%     102µs ± 9%  -18.79%  (p=0.000 n=99+96)
GorillaMux_StaticAll       1.32ms ± 6%    1.07ms ± 7%  -18.78%  (p=0.000 n=99+95)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_Param5          1.38kB ± 0%    0.98kB ± 0%  -29.07%  (p=0.000 n=100+100)
GorillaMux_Param20         3.48kB ± 0%    2.14kB ± 0%  -38.49%  (p=0.000 n=100+100)
GorillaMux_ParamWrite      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_GithubStatic    1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_GithubParam     1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_GithubAll        258kB ± 0%     173kB ± 0%  -33.02%  (p=0.000 n=79+96)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_GPlusParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_GPlusAll        16.5kB ± 0%    11.1kB ± 0%  -32.82%  (p=0.000 n=100+100)
GorillaMux_ParseStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=100+100)
GorillaMux_ParseParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=100+100)
GorillaMux_Parse2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=100+100)
GorillaMux_ParseAll        31.1kB ± 0%    19.6kB ± 0%  -37.02%  (p=0.000 n=100+100)
GorillaMux_StaticAll        158kB ± 0%      78kB ± 0%  -50.79%  (p=0.000 n=100+100)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Param5            10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Param20           12.0 ± 0%       8.0 ± 0%  -33.33%  (p=0.000 n=100+100)
GorillaMux_ParamWrite        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GithubStatic      9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_GithubParam       10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GithubAll        1.99k ± 0%     1.48k ± 0%  -25.78%  (p=0.000 n=100+100)
GorillaMux_GPlusStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_GPlusParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GPlus2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_GPlusAll           128 ± 0%        96 ± 0%  -25.00%  (p=0.000 n=100+100)
GorillaMux_ParseStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=100+100)
GorillaMux_ParseParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_Parse2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=100+100)
GorillaMux_ParseAll           250 ± 0%       168 ± 0%  -32.80%  (p=0.000 n=100+100)
GorillaMux_StaticAll        1.41k ± 0%     0.63k ± 0%  -55.56%  (p=0.000 n=100+100)

Older i7, frequency scaling around 3.4 GHz, n=10

Sorry, only 10 iterations each. I do not want to hear the fan for too long :)

Before vs After with omit Route flag enabled

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz

name                     old time/op    new time/op    delta
GorillaMux_Param           1.80µs ± 2%    1.41µs ± 3%  -21.64%  (p=0.000 n=10+10)
GorillaMux_Param5          2.68µs ± 0%    2.34µs ± 1%  -12.43%  (p=0.000 n=10+10)
GorillaMux_Param20         5.92µs ± 0%    4.02µs ± 0%  -32.10%  (p=0.000 n=10+9)
GorillaMux_ParamWrite      1.89µs ± 1%    1.47µs ± 0%  -22.22%  (p=0.000 n=9+8)
GorillaMux_GithubStatic    4.05µs ± 1%    2.80µs ± 0%  -31.04%  (p=0.000 n=9+10)
GorillaMux_GithubParam     6.30µs ± 1%    5.90µs ± 1%   -6.27%  (p=0.000 n=10+10)
GorillaMux_GithubAll       3.14ms ± 1%    2.87ms ± 2%   -8.42%  (p=0.000 n=10+10)
GorillaMux_GPlusStatic     1.29µs ± 1%    0.23µs ± 1%  -82.37%  (p=0.000 n=10+10)
GorillaMux_GPlusParam      2.52µs ± 0%    2.07µs ± 1%  -17.69%  (p=0.000 n=10+10)
GorillaMux_GPlus2Params    4.92µs ± 0%    4.49µs ± 1%   -8.71%  (p=0.000 n=10+10)
GorillaMux_GPlusAll        39.0µs ± 1%    31.5µs ± 1%  -19.28%  (p=0.000 n=10+10)
GorillaMux_ParseStatic     1.58µs ± 1%    0.47µs ± 1%  -70.01%  (p=0.000 n=10+10)
GorillaMux_ParseParam      1.89µs ± 0%    1.46µs ± 1%  -22.70%  (p=0.000 n=10+10)
GorillaMux_Parse2Params    2.31µs ± 2%    1.87µs ± 0%  -18.96%  (p=0.000 n=10+8)
GorillaMux_ParseAll        74.3µs ± 1%    55.4µs ± 0%  -25.48%  (p=0.000 n=10+10)
GorillaMux_StaticAll        797µs ± 0%     561µs ± 1%  -29.66%  (p=0.000 n=10+10)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=10+10)
GorillaMux_Param5          1.38kB ± 0%    0.93kB ± 0%  -32.56%  (p=0.000 n=10+10)
GorillaMux_Param20         3.48kB ± 0%    2.09kB ± 0%  -39.87%  (p=0.000 n=10+10)
GorillaMux_ParamWrite      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=10+10)
GorillaMux_GithubStatic    1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=10+10)
GorillaMux_GithubParam     1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=10+10)
GorillaMux_GithubAll        258kB ± 0%     149kB ± 0%  -42.37%  (p=0.000 n=10+9)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=10+10)
GorillaMux_GPlusParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=10+10)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=10+10)
GorillaMux_GPlusAll        16.5kB ± 0%     9.7kB ± 0%  -41.43%  (p=0.000 n=10+10)
GorillaMux_ParseStatic     1.01kB ± 0%    0.05kB ± 0%  -95.24%  (p=0.000 n=10+10)
GorillaMux_ParseParam      1.31kB ± 0%    0.86kB ± 0%  -34.15%  (p=0.000 n=10+10)
GorillaMux_Parse2Params    1.33kB ± 0%    0.88kB ± 0%  -33.73%  (p=0.000 n=10+10)
GorillaMux_ParseAll        31.1kB ± 0%    14.4kB ± 0%  -53.88%  (p=0.000 n=10+10)
GorillaMux_StaticAll        158kB ± 0%       8kB ± 0%  -95.24%  (p=0.000 n=10+10)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_Param5            10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_Param20           12.0 ± 0%       7.0 ± 0%  -41.67%  (p=0.000 n=10+10)
GorillaMux_ParamWrite        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_GithubStatic      9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=10+10)
GorillaMux_GithubParam       10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_GithubAll        1.99k ± 0%     1.21k ± 0%  -39.57%  (p=0.000 n=10+10)
GorillaMux_GPlusStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=10+10)
GorillaMux_GPlusParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_GPlus2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_GPlusAll           128 ± 0%        79 ± 0%  -38.28%  (p=0.000 n=10+10)
GorillaMux_ParseStatic       9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.000 n=10+10)
GorillaMux_ParseParam        10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_Parse2Params      10.0 ± 0%       7.0 ± 0%  -30.00%  (p=0.000 n=10+10)
GorillaMux_ParseAll           250 ± 0%       122 ± 0%  -51.20%  (p=0.000 n=10+10)
GorillaMux_StaticAll        1.41k ± 0%     0.16k ± 0%  -88.89%  (p=0.000 n=10+10)

Before vs After

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz

name                     old time/op    new time/op    delta
GorillaMux_Param           1.80µs ± 2%    1.38µs ± 2%  -23.29%  (p=0.000 n=10+10)
GorillaMux_Param5          2.68µs ± 0%    2.29µs ± 1%  -14.51%  (p=0.000 n=10+10)
GorillaMux_Param20         5.92µs ± 0%    3.95µs ± 1%  -33.32%  (p=0.000 n=10+10)
GorillaMux_ParamWrite      1.89µs ± 1%    1.46µs ± 2%  -22.59%  (p=0.000 n=9+10)
GorillaMux_GithubStatic    4.05µs ± 1%    3.41µs ± 1%  -15.89%  (p=0.000 n=9+9)
GorillaMux_GithubParam     6.30µs ± 1%    5.80µs ± 1%   -7.91%  (p=0.000 n=10+10)
GorillaMux_GithubAll       3.14ms ± 1%    2.89ms ± 1%   -7.79%  (p=0.000 n=10+10)
GorillaMux_GPlusStatic     1.29µs ± 1%    0.65µs ± 1%  -49.87%  (p=0.000 n=10+10)
GorillaMux_GPlusParam      2.52µs ± 0%    2.14µs ± 1%  -15.03%  (p=0.000 n=10+10)
GorillaMux_GPlus2Params    4.92µs ± 0%    4.46µs ± 1%   -9.40%  (p=0.000 n=10+10)
GorillaMux_GPlusAll        39.0µs ± 1%    33.0µs ± 1%  -15.53%  (p=0.000 n=10+10)
GorillaMux_ParseStatic     1.58µs ± 1%    0.93µs ± 1%  -41.36%  (p=0.000 n=10+10)
GorillaMux_ParseParam      1.89µs ± 0%    1.51µs ± 2%  -20.23%  (p=0.000 n=10+9)
GorillaMux_Parse2Params    2.31µs ± 2%    1.92µs ± 1%  -16.95%  (p=0.000 n=10+10)
GorillaMux_ParseAll        74.3µs ± 1%    60.9µs ± 1%  -17.99%  (p=0.000 n=10+10)
GorillaMux_StaticAll        797µs ± 0%     699µs ± 0%  -12.25%  (p=0.000 n=10+10)

name                     old alloc/op   new alloc/op   delta
GorillaMux_Param           1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=10+10)
GorillaMux_Param5          1.38kB ± 0%    0.98kB ± 0%  -29.07%  (p=0.000 n=10+10)
GorillaMux_Param20         3.48kB ± 0%    2.14kB ± 0%  -38.48%  (p=0.000 n=10+10)
GorillaMux_ParamWrite      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=10+10)
GorillaMux_GithubStatic    1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=10+10)
GorillaMux_GithubParam     1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=10+10)
GorillaMux_GithubAll        258kB ± 0%     173kB ± 0%  -33.02%  (p=0.000 n=10+10)
GorillaMux_GPlusStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=10+10)
GorillaMux_GPlusParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=10+10)
GorillaMux_GPlus2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=10+10)
GorillaMux_GPlusAll        16.5kB ± 0%    11.1kB ± 0%  -32.82%  (p=0.000 n=10+10)
GorillaMux_ParseStatic     1.01kB ± 0%    0.50kB ± 0%  -50.79%  (p=0.000 n=10+10)
GorillaMux_ParseParam      1.31kB ± 0%    0.91kB ± 0%  -30.49%  (p=0.000 n=10+10)
GorillaMux_Parse2Params    1.33kB ± 0%    0.93kB ± 0%  -30.12%  (p=0.000 n=10+10)
GorillaMux_ParseAll        31.1kB ± 0%    19.6kB ± 0%  -37.02%  (p=0.000 n=10+10)
GorillaMux_StaticAll        158kB ± 0%      78kB ± 0%  -50.79%  (p=0.000 n=10+10)

name                     old allocs/op  new allocs/op  delta
GorillaMux_Param             10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_Param5            10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_Param20           12.0 ± 0%       8.0 ± 0%  -33.33%  (p=0.000 n=10+10)
GorillaMux_ParamWrite        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_GithubStatic      9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=10+10)
GorillaMux_GithubParam       10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_GithubAll        1.99k ± 0%     1.48k ± 0%  -25.78%  (p=0.000 n=10+10)
GorillaMux_GPlusStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=10+10)
GorillaMux_GPlusParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_GPlus2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_GPlusAll           128 ± 0%        96 ± 0%  -25.00%  (p=0.000 n=10+10)
GorillaMux_ParseStatic       9.00 ± 0%      4.00 ± 0%  -55.56%  (p=0.000 n=10+10)
GorillaMux_ParseParam        10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_Parse2Params      10.0 ± 0%       8.0 ± 0%  -20.00%  (p=0.000 n=10+10)
GorillaMux_ParseAll           250 ± 0%       168 ± 0%  -32.80%  (p=0.000 n=10+10)
GorillaMux_StaticAll        1.41k ± 0%     0.63k ± 0%  -55.56%  (p=0.000 n=10+10)



If you read this far, please consider running benchmarks for your own use cases
of mux and reporting back any changes. Thanks!
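
For anyone taking up that suggestion, here is a minimal sketch of measuring time and allocations for a single static route. It is an illustration, not part of this PR: `newRouter` and `benchmarkStatus` are hypothetical helper names, and the standard library's `http.ServeMux` stands in for a mux router so the snippet runs without extra dependencies. Swap in your own router and routes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing"
)

// newRouter builds the handler under test. Assumption: http.ServeMux is only
// a stand-in here; replace it with your own router setup.
func newRouter() http.Handler {
	m := http.NewServeMux()
	m.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	return m
}

// benchmarkStatus repeatedly serves GET /status and reports timing and
// allocation statistics for the handler.
func benchmarkStatus(h http.Handler) testing.BenchmarkResult {
	req := httptest.NewRequest(http.MethodGet, "/status", nil)
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		w := httptest.NewRecorder()
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			h.ServeHTTP(w, req)
		}
	})
}

func main() {
	res := benchmarkStatus(newRouter())
	fmt.Printf("%d ns/op, %d allocs/op\n", res.NsPerOp(), res.AllocsPerOp())
}
```

For a before/after comparison, the same loop would normally live in a `BenchmarkXxx` function in a `_test.go` file, run with `go test -benchmem -bench .` on both versions and compared with benchstat.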

go.mod override

You can use the following override in your go.mod file:

replace github.com/gorilla/mux v1.8.1 => github.com/das7pad/mux v1.8.1-0.20220803193445-4e593050ec93

Optionally, you can enable the flag that skips storing the Route in the request context:

m := mux.NewRouter()
m.OmitRouteFromContext(true)
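
To see why skipping the Route in the request context saves allocations, here is a simplified model that uses only the standard library. It is not mux's actual code: `route`, `routeKey`, `withRoute`, and `measure` are hypothetical names. Attaching a value to a request takes a `context.WithValue` call plus a `Request.WithContext` copy, and each of those allocates.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing"
)

type ctxKey int

// routeKey is a hypothetical context key for the matched route.
const routeKey ctxKey = 0

// route is a hypothetical stand-in for a matched route.
type route struct{ name string }

// sink keeps results reachable so the compiler cannot elide the allocations.
var sink *http.Request

// withRoute stores the route in a derived context and returns a request copy
// that carries it.
func withRoute(r *http.Request, rt *route) *http.Request {
	return r.WithContext(context.WithValue(r.Context(), routeKey, rt))
}

// measure returns the average allocations per call with and without
// populating the request context.
func measure() (with, without float64) {
	req := httptest.NewRequest(http.MethodGet, "/status", nil)
	rt := &route{name: "status"}
	with = testing.AllocsPerRun(1000, func() { sink = withRoute(req, rt) })
	without = testing.AllocsPerRun(1000, func() { sink = req })
	return with, without
}

func main() {
	with, without := measure()
	fmt.Printf("allocs/op with route in context: %.0f, without: %.0f\n", with, without)
}
```

Handlers that never read the route back out of the context pay this cost for nothing, which is the case the flag targets.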

@amustaque97
Contributor

Thanks for the PR, @das7pad. Hopefully I can take a look at it this coming weekend.

@codecov

codecov bot commented Aug 17, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f79c3af) 78.04% compared to head (e136241) 79.06%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #691      +/-   ##
==========================================
+ Coverage   78.04%   79.06%   +1.01%     
==========================================
  Files           5        5              
  Lines         902      917      +15     
==========================================
+ Hits          704      725      +21     
+ Misses        142      136       -6     
  Partials       56       56              


@luisdavim

@das7pad looks like this project is active again, would you be able to rebase this PR?

@jackgris jackgris left a comment

LGTM. I agree with @luisdavim that it would be great to rebase this PR.

@das7pad das7pad force-pushed the perf-cut-allocations branch from 367ad61 to 1465de4 Compare September 3, 2023 20:05
@das7pad
Contributor Author

das7pad commented Sep 4, 2023

> @das7pad looks like this project is active again, would you be able to rebase this PR?

Sure, done :)

Below are the latest benchmark results.

mux project benchmarks

You can reproduce these benchmarks using docker, pinned to CPU 1:

$ docker run --rm --pull always -v /logs:/logs --cpuset-cpus 1 -d golang:1.21 bash -exc 'git clone https://github.com/das7pad/mux.git && cd mux && for branch in baseline perf-cut-allocations; do git checkout "$branch" && go test -benchmem -bench . -count 100 -timeout 1h > "/logs/$branch-all.txt"; done; go install golang.org/x/perf/cmd/benchstat@latest; benchstat /logs/baseline-all.txt /logs/perf-cut-allocations-all.txt > /logs/compare-all.txt'

Modern Xeon E, 3.4 GHz

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz
                                    │ /logs/baseline-all.txt │  /logs/perf-cut-allocations-all.txt  │
                                    │         sec/op         │   sec/op     vs base                 │
Mux                                             1057.0n ± 0%   943.4n ± 0%  -10.75% (n=100)
MuxSimple/default                                659.1n ± 0%   328.2n ± 0%  -50.20% (n=100)
MuxSimple/omit_route_from_ctx                    653.8n ± 0%   158.3n ± 0%  -75.79% (n=100)
MuxAlternativeInRegexp                           1.588µ ± 0%   1.353µ ± 0%  -14.77% (n=100)
ManyPathVariables                                1.543µ ± 0%   1.457µ ± 0%   -5.61% (n=100)
PopulateContext/no_populated_vars                665.6n ± 0%   329.6n ± 0%  -50.48% (n=100)
PopulateContext/empty_var                        929.0n ± 0%   796.1n ± 0%  -14.31% (n=100)
PopulateContext/populated_vars                   959.3n ± 1%   839.4n ± 1%  -12.51% (n=100)
PopulateContext/omit_route_/static               665.0n ± 0%   159.1n ± 1%  -76.07% (n=100)
PopulateContext/omit_route_/dynamic              923.6n ± 0%   758.3n ± 0%  -17.91% (n=100)
_findQueryKey/0                                  143.8n ± 0%   146.2n ± 0%   +1.67% (p=0.000 n=100)
_findQueryKey/1                                  227.2n ± 0%   227.8n ± 0%        ~ (p=0.058 n=100)
_findQueryKey/2                                  796.7n ± 0%   806.3n ± 1%   +1.20% (p=0.000 n=100)
_findQueryKey/3                                  888.2n ± 0%   893.9n ± 0%   +0.65% (p=0.021 n=100)
_findQueryKey/4                                  3.690n ± 0%   3.722n ± 0%   +0.85% (p=0.000 n=100)
_findQueryKeyGoLib/0                             776.5n ± 0%   752.4n ± 0%   -3.10% (n=100)
_findQueryKeyGoLib/1                             405.5n ± 0%   398.3n ± 0%   -1.76% (n=100)
_findQueryKeyGoLib/2                             2.549µ ± 0%   2.513µ ± 0%   -1.39% (p=0.000 n=100)
_findQueryKeyGoLib/3                             3.355µ ± 0%   3.295µ ± 0%   -1.79% (n=100)
_findQueryKeyGoLib/4                             3.569n ± 0%   3.724n ± 0%   +4.34% (p=0.000 n=100)
geomean                                          474.4n        368.4n       -22.34%

                                    │ /logs/baseline-all.txt │   /logs/perf-cut-allocations-all.txt    │
                                    │          B/op          │     B/op      vs base                   │
Mux                                            1024.0 ± 0%       768.0 ± 0%  -25.00% (n=100)
MuxSimple/default                               720.0 ± 0%       352.0 ± 0%  -51.11% (n=100)
MuxSimple/omit_route_from_ctx                  720.00 ± 0%       48.00 ± 0%  -93.33% (n=100)
MuxAlternativeInRegexp                        2.000Ki ± 0%     1.500Ki ± 0%  -25.00% (n=100)
ManyPathVariables                              1230.0 ± 0%       972.0 ± 0%  -20.98% (n=100)
PopulateContext/no_populated_vars               720.0 ± 0%       352.0 ± 0%  -51.11% (n=100)
PopulateContext/empty_var                      1040.0 ± 0%       784.0 ± 0%  -24.62% (n=100)
PopulateContext/populated_vars                 1024.0 ± 0%       768.0 ± 0%  -25.00% (n=100)
PopulateContext/omit_route_/static             720.00 ± 0%       48.00 ± 0%  -93.33% (n=100)
PopulateContext/omit_route_/dynamic            1040.0 ± 0%       736.0 ± 0%  -29.23% (n=100)
_findQueryKey/0                                 0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/1                                 40.00 ± 0%       40.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/2                                 483.0 ± 0%       483.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/3                                 543.0 ± 0%       543.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/4                                 0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/0                            864.0 ± 0%       864.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/1                            432.0 ± 0%       432.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/2                          1.506Ki ± 0%     1.506Ki ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/3                          1.938Ki ± 0%     1.938Ki ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/4                            0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
geomean                                                    ²                 -34.86%                 ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │ /logs/baseline-all.txt │  /logs/perf-cut-allocations-all.txt   │
                                    │       allocs/op        │ allocs/op   vs base                   │
Mux                                             8.000 ± 0%     7.000 ± 0%  -12.50% (n=100)
MuxSimple/default                               7.000 ± 0%     3.000 ± 0%  -57.14% (n=100)
MuxSimple/omit_route_from_ctx                   7.000 ± 0%     1.000 ± 0%  -85.71% (n=100)
MuxAlternativeInRegexp                          16.00 ± 0%     14.00 ± 0%  -12.50% (n=100)
ManyPathVariables                               12.00 ± 0%     11.00 ± 0%   -8.33% (n=100)
PopulateContext/no_populated_vars               7.000 ± 0%     3.000 ± 0%  -57.14% (n=100)
PopulateContext/empty_var                       9.000 ± 0%     8.000 ± 0%  -11.11% (n=100)
PopulateContext/populated_vars                  8.000 ± 0%     7.000 ± 0%  -12.50% (n=100)
PopulateContext/omit_route_/static              7.000 ± 0%     1.000 ± 0%  -85.71% (n=100)
PopulateContext/omit_route_/dynamic             9.000 ± 0%     7.000 ± 0%  -22.22% (n=100)
_findQueryKey/0                                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/1                                 3.000 ± 0%     3.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/2                                 10.00 ± 0%     10.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/3                                 11.00 ± 0%     11.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/4                                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/0                            8.000 ± 0%     8.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/1                            4.000 ± 0%     4.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/2                            24.00 ± 0%     24.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/3                            28.00 ± 0%     28.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/4                            0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
geomean                                                    ²               -27.54%                 ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Older Xeon Gold, 2.3 GHz

(The actual CPU is an Intel(R) Xeon(R) Gold 5122 CPU @ 2.30GHz, not the generic identifier reported below.)

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel Xeon Processor (Skylake, IBRS)
                                    │ /logs/baseline-all.txt │  /logs/perf-cut-allocations-all.txt  │
                                    │         sec/op         │   sec/op     vs base                 │
Mux                                              2.615µ ± 2%   2.408µ ± 3%   -7.93% (p=0.000 n=100)
MuxSimple/default                               1600.5n ± 3%   798.9n ± 2%  -50.08% (n=100)
MuxSimple/omit_route_from_ctx                   1549.5n ± 2%   399.8n ± 2%  -74.20% (n=100)
MuxAlternativeInRegexp                           3.703µ ± 2%   3.187µ ± 3%  -13.95% (n=100)
ManyPathVariables                                3.924µ ± 4%   3.591µ ± 3%   -8.50% (p=0.000 n=100)
PopulateContext/no_populated_vars               1572.5n ± 2%   825.7n ± 3%  -47.49% (n=100)
PopulateContext/empty_var                        2.237µ ± 2%   1.871µ ± 3%  -16.36% (n=100)
PopulateContext/populated_vars                   2.352µ ± 3%   1.973µ ± 4%  -16.10% (n=100)
PopulateContext/omit_route_/static              1517.5n ± 3%   394.6n ± 4%  -74.00% (n=100)
PopulateContext/omit_route_/dynamic              2.182µ ± 3%   1.776µ ± 3%  -18.63% (n=100)
_findQueryKey/0                                  337.6n ± 3%   351.6n ± 3%   +4.12% (p=0.005 n=100)
_findQueryKey/1                                  462.6n ± 1%   467.6n ± 2%        ~ (p=0.660 n=100)
_findQueryKey/2                                  1.681µ ± 2%   1.709µ ± 1%        ~ (p=0.093 n=100)
_findQueryKey/3                                  1.833µ ± 1%   1.869µ ± 2%   +1.94% (p=0.011 n=100)
_findQueryKey/4                                  9.438n ± 4%   8.826n ± 2%   -6.48% (p=0.000 n=100)
_findQueryKeyGoLib/0                             1.736µ ± 2%   1.790µ ± 3%   +3.11% (p=0.000 n=100)
_findQueryKeyGoLib/1                             876.6n ± 1%   865.2n ± 2%        ~ (p=0.156 n=100)
_findQueryKeyGoLib/2                             5.473µ ± 1%   5.481µ ± 2%        ~ (p=0.663 n=100)
_findQueryKeyGoLib/3                             7.538µ ± 1%   7.252µ ± 2%   -3.79% (p=0.000 n=100)
_findQueryKeyGoLib/4                             8.777n ± 2%   9.817n ± 4%  +11.86% (p=0.000 n=100)
geomean                                          1.098µ        863.1n       -21.38%

                                    │ /logs/baseline-all.txt │   /logs/perf-cut-allocations-all.txt    │
                                    │          B/op          │     B/op      vs base                   │
Mux                                            1024.0 ± 0%       768.0 ± 0%  -25.00% (n=100)
MuxSimple/default                               720.0 ± 0%       352.0 ± 0%  -51.11% (n=100)
MuxSimple/omit_route_from_ctx                  720.00 ± 0%       48.00 ± 0%  -93.33% (n=100)
MuxAlternativeInRegexp                        2.000Ki ± 0%     1.500Ki ± 0%  -25.00% (n=100)
ManyPathVariables                              1235.0 ± 0%       980.0 ± 0%  -20.65% (n=100)
PopulateContext/no_populated_vars               720.0 ± 0%       352.0 ± 0%  -51.11% (n=100)
PopulateContext/empty_var                      1040.0 ± 0%       784.0 ± 0%  -24.62% (n=100)
PopulateContext/populated_vars                 1024.0 ± 0%       768.0 ± 0%  -25.00% (n=100)
PopulateContext/omit_route_/static             720.00 ± 0%       48.00 ± 0%  -93.33% (n=100)
PopulateContext/omit_route_/dynamic            1040.0 ± 0%       736.0 ± 0%  -29.23% (n=100)
_findQueryKey/0                                 0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/1                                 40.00 ± 0%       40.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/2                                 483.0 ± 0%       483.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/3                                 543.0 ± 0%       543.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/4                                 0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/0                            864.0 ± 0%       864.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/1                            432.0 ± 0%       432.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/2                          1.506Ki ± 0%     1.506Ki ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/3                          1.938Ki ± 0%     1.938Ki ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/4                            0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
geomean                                                    ²                 -34.85%                 ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │ /logs/baseline-all.txt │  /logs/perf-cut-allocations-all.txt   │
                                    │       allocs/op        │ allocs/op   vs base                   │
Mux                                             8.000 ± 0%     7.000 ± 0%  -12.50% (n=100)
MuxSimple/default                               7.000 ± 0%     3.000 ± 0%  -57.14% (n=100)
MuxSimple/omit_route_from_ctx                   7.000 ± 0%     1.000 ± 0%  -85.71% (n=100)
MuxAlternativeInRegexp                          16.00 ± 0%     14.00 ± 0%  -12.50% (n=100)
ManyPathVariables                               12.00 ± 0%     11.00 ± 0%   -8.33% (n=100)
PopulateContext/no_populated_vars               7.000 ± 0%     3.000 ± 0%  -57.14% (n=100)
PopulateContext/empty_var                       9.000 ± 0%     8.000 ± 0%  -11.11% (n=100)
PopulateContext/populated_vars                  8.000 ± 0%     7.000 ± 0%  -12.50% (n=100)
PopulateContext/omit_route_/static              7.000 ± 0%     1.000 ± 0%  -85.71% (n=100)
PopulateContext/omit_route_/dynamic             9.000 ± 0%     7.000 ± 0%  -22.22% (n=100)
_findQueryKey/0                                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/1                                 3.000 ± 0%     3.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/2                                 10.00 ± 0%     10.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/3                                 11.00 ± 0%     11.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/4                                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/0                            8.000 ± 0%     8.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/1                            4.000 ± 0%     4.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/2                            24.00 ± 0%     24.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/3                            28.00 ± 0%     28.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/4                            0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
geomean                                                    ²               -27.54%                 ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Older Xeon E, 2.4 GHz

goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
                                    │ /logs/baseline-all.txt │  /logs/perf-cut-allocations-all.txt  │
                                    │         sec/op         │   sec/op     vs base                 │
Mux                                              1.698µ ± 1%   1.524µ ± 1%  -10.25% (n=100)
MuxSimple/default                               1031.5n ± 1%   523.9n ± 1%  -49.21% (n=100)
MuxSimple/omit_route_from_ctx                   1023.0n ± 0%   262.9n ± 0%  -74.31% (n=100)
MuxAlternativeInRegexp                           2.463µ ± 0%   2.143µ ± 1%  -13.01% (n=100)
ManyPathVariables                                2.619µ ± 1%   2.401µ ± 1%   -8.30% (n=100)
PopulateContext/no_populated_vars               1025.0n ± 0%   516.4n ± 0%  -49.62% (n=100)
PopulateContext/empty_var                        1.421µ ± 0%   1.257µ ± 1%  -11.58% (n=100)
PopulateContext/populated_vars                   1.481µ ± 1%   1.298µ ± 1%  -12.33% (n=100)
PopulateContext/omit_route_/static              1026.0n ± 0%   255.3n ± 1%  -75.12% (n=100)
PopulateContext/omit_route_/dynamic              1.427µ ± 1%   1.154µ ± 1%  -19.17% (n=100)
_findQueryKey/0                                  234.3n ± 0%   235.2n ± 0%        ~ (p=0.057 n=100)
_findQueryKey/1                                  373.7n ± 0%   372.4n ± 0%        ~ (p=0.317 n=100)
_findQueryKey/2                                  1.310µ ± 1%   1.308µ ± 1%        ~ (p=0.772 n=100)
_findQueryKey/3                                  1.469µ ± 1%   1.482µ ± 1%        ~ (p=0.097 n=100)
_findQueryKey/4                                  6.343n ± 0%   6.304n ± 0%   -0.61% (p=0.000 n=100)
_findQueryKeyGoLib/0                             1.180µ ± 1%   1.181µ ± 1%        ~ (p=0.674 n=100)
_findQueryKeyGoLib/1                             631.4n ± 1%   624.8n ± 1%   -1.05% (p=0.000 n=100)
_findQueryKeyGoLib/2                             4.064µ ± 1%   4.064µ ± 1%        ~ (p=0.717 n=100)
_findQueryKeyGoLib/3                             5.479µ ± 1%   5.494µ ± 0%        ~ (p=0.535 n=100)
_findQueryKeyGoLib/4                             6.323n ± 0%   6.352n ± 0%   +0.46% (p=0.015 n=100)
geomean                                          759.6n        594.1n       -21.79%

                                    │ /logs/baseline-all.txt │   /logs/perf-cut-allocations-all.txt    │
                                    │          B/op          │     B/op      vs base                   │
Mux                                            1024.0 ± 0%       768.0 ± 0%  -25.00% (n=100)
MuxSimple/default                               720.0 ± 0%       352.0 ± 0%  -51.11% (n=100)
MuxSimple/omit_route_from_ctx                  720.00 ± 0%       48.00 ± 0%  -93.33% (n=100)
MuxAlternativeInRegexp                        2.000Ki ± 0%     1.500Ki ± 0%  -25.00% (n=100)
ManyPathVariables                              1254.5 ± 2%       998.0 ± 0%  -20.45% (n=100)
PopulateContext/no_populated_vars               720.0 ± 0%       352.0 ± 0%  -51.11% (n=100)
PopulateContext/empty_var                      1040.0 ± 0%       784.0 ± 0%  -24.62% (n=100)
PopulateContext/populated_vars                 1024.0 ± 0%       768.0 ± 0%  -25.00% (n=100)
PopulateContext/omit_route_/static             720.00 ± 0%       48.00 ± 0%  -93.33% (n=100)
PopulateContext/omit_route_/dynamic            1040.0 ± 0%       736.0 ± 0%  -29.23% (n=100)
_findQueryKey/0                                 0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/1                                 40.00 ± 0%       40.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/2                                 483.0 ± 0%       483.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/3                                 543.0 ± 0%       543.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/4                                 0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/0                            864.0 ± 0%       864.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/1                            432.0 ± 0%       432.0 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/2                          1.506Ki ± 0%     1.506Ki ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/3                          1.938Ki ± 0%     1.938Ki ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/4                            0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=100) ¹
geomean                                                    ²                 -34.84%                 ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │ /logs/baseline-all.txt │  /logs/perf-cut-allocations-all.txt   │
                                    │       allocs/op        │ allocs/op   vs base                   │
Mux                                             8.000 ± 0%     7.000 ± 0%  -12.50% (n=100)
MuxSimple/default                               7.000 ± 0%     3.000 ± 0%  -57.14% (n=100)
MuxSimple/omit_route_from_ctx                   7.000 ± 0%     1.000 ± 0%  -85.71% (n=100)
MuxAlternativeInRegexp                          16.00 ± 0%     14.00 ± 0%  -12.50% (n=100)
ManyPathVariables                               12.00 ± 0%     11.00 ± 0%   -8.33% (n=100)
PopulateContext/no_populated_vars               7.000 ± 0%     3.000 ± 0%  -57.14% (n=100)
PopulateContext/empty_var                       9.000 ± 0%     8.000 ± 0%  -11.11% (n=100)
PopulateContext/populated_vars                  8.000 ± 0%     7.000 ± 0%  -12.50% (n=100)
PopulateContext/omit_route_/static              7.000 ± 0%     1.000 ± 0%  -85.71% (n=100)
PopulateContext/omit_route_/dynamic             9.000 ± 0%     7.000 ± 0%  -22.22% (n=100)
_findQueryKey/0                                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/1                                 3.000 ± 0%     3.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/2                                 10.00 ± 0%     10.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/3                                 11.00 ± 0%     11.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKey/4                                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/0                            8.000 ± 0%     8.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/1                            4.000 ± 0%     4.000 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/2                            24.00 ± 0%     24.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/3                            28.00 ± 0%     28.00 ± 0%        ~ (p=1.000 n=100) ¹
_findQueryKeyGoLib/4                            0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=100) ¹
geomean                                                    ²               -27.54%                 ²
¹ all samples are equal
² summaries must be >0 to compute geomean


Popular go-http-routing-benchmark

I pushed three branches to my fork for comparison:
https://github.com/das7pad/go-http-routing-benchmark

  • before: the baseline branch mentioned above
  • after: the PR revision
  • after-omit-route: like after, but with the OmitRouteFromContext flag enabled

You can reproduce these benchmarks using docker, pinned to CPU 1:

docker run --rm --pull always -v /logs:/logs --cpuset-cpus 1 -d golang:1.18 bash -exc '
  git clone https://github.com/das7pad/go-http-routing-benchmark.git &&
  cd go-http-routing-benchmark &&
  for branch in before after after-omit-route; do
    git checkout "$branch" &&
    go test -benchmem -bench Gorilla -count 100 -timeout 1h > "/logs/$branch.txt"
  done
  go install golang.org/x/perf/cmd/benchstat@latest
  benchstat /logs/before.txt /logs/after.txt > /logs/compare-before-vs-after.txt
  benchstat /logs/before.txt /logs/after-omit-route.txt > /logs/compare-before-vs-after-omit-route.txt
'

Modern Xeon E, 3.4 GHz

Before vs After with the OmitRouteFromContext flag enabled

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz
                        │ /logs/before.txt │  /logs/after-omit-route.txt  │
                        │      sec/op      │   sec/op     vs base         │
GorillaMux_Param               1.455µ ± 0%   1.116µ ± 0%  -23.27% (n=100)
GorillaMux_Param5              2.200µ ± 0%   1.864µ ± 0%  -15.28% (n=100)
GorillaMux_Param20             5.438µ ± 0%   3.575µ ± 0%  -34.26% (n=100)
GorillaMux_ParamWrite          1.496µ ± 0%   1.168µ ± 0%  -21.93% (n=100)
GorillaMux_GithubStatic        3.315µ ± 0%   2.384µ ± 0%  -28.08% (n=100)
GorillaMux_GithubParam         5.021µ ± 0%   4.729µ ± 0%   -5.83% (n=100)
GorillaMux_GithubAll           2.620m ± 0%   2.514m ± 0%   -4.03% (n=100)
GorillaMux_GPlusStatic         973.9n ± 0%   187.0n ± 0%  -80.80% (n=100)
GorillaMux_GPlusParam          1.915µ ± 0%   1.625µ ± 0%  -15.15% (n=100)
GorillaMux_GPlus2Params        3.786µ ± 0%   3.507µ ± 0%   -7.37% (n=100)
GorillaMux_GPlusAll            30.20µ ± 0%   24.71µ ± 0%  -18.16% (n=100)
GorillaMux_ParseStatic        1199.0n ± 0%   398.4n ± 0%  -66.77% (n=100)
GorillaMux_ParseParam          1.477µ ± 0%   1.166µ ± 0%  -21.06% (n=100)
GorillaMux_Parse2Params        1.795µ ± 0%   1.491µ ± 0%  -16.94% (n=100)
GorillaMux_ParseAll            59.11µ ± 0%   45.33µ ± 0%  -23.32% (n=100)
GorillaMux_StaticAll           662.8µ ± 1%   491.1µ ± 0%  -25.91% (n=100)
geomean                        6.957µ        4.869µ       -30.01%

                        │ /logs/before.txt │  /logs/after-omit-route.txt   │
                        │       B/op       │     B/op      vs base         │
GorillaMux_Param               1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_Param5              1088.0 ± 0%     784.0 ± 0%  -27.94% (n=100)
GorillaMux_Param20            3.087Ki ± 0%   1.905Ki ± 0%  -38.28% (n=100)
GorillaMux_ParamWrite          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_GithubStatic        720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_GithubParam         1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_GithubAll          195.0Ki ± 0%   121.8Ki ± 0%  -37.54% (n=100)
GorillaMux_GPlusStatic         720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_GPlusParam          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_GPlus2Params        1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_GPlusAll          12.484Ki ± 0%   7.906Ki ± 0%  -36.67% (n=100)
GorillaMux_ParseStatic         720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_ParseParam          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_Parse2Params        1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_ParseAll           23.08Ki ± 0%   11.77Ki ± 0%  -49.02% (n=100)
GorillaMux_StaticAll        110.395Ki ± 0%   7.359Ki ± 0%  -93.33% (n=100)
geomean                       2.688Ki        1.008Ki       -62.49%

                        │ /logs/before.txt │  /logs/after-omit-route.txt  │
                        │    allocs/op     │  allocs/op   vs base         │
GorillaMux_Param                8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Param5               8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Param20             10.000 ± 0%    6.000 ± 0%  -40.00% (n=100)
GorillaMux_ParamWrite           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GithubStatic         7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_GithubParam          8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GithubAll           1.588k ± 0%   1.038k ± 0%  -34.63% (n=100)
GorillaMux_GPlusStatic          7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_GPlusParam           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GPlus2Params         8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GPlusAll            102.00 ± 0%    68.00 ± 0%  -33.33% (n=100)
GorillaMux_ParseStatic          7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_ParseParam           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Parse2Params         8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_ParseAll             198.0 ± 0%    106.0 ± 0%  -46.46% (n=100)
GorillaMux_StaticAll           1099.0 ± 0%    157.0 ± 0%  -85.71% (n=100)
geomean                         21.46         10.11       -52.91%

Before vs After

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz
                        │ /logs/before.txt │       /logs/after.txt        │
                        │      sec/op      │   sec/op     vs base         │
GorillaMux_Param               1.455µ ± 0%   1.199µ ± 0%  -17.60% (n=100)
GorillaMux_Param5              2.200µ ± 0%   1.953µ ± 0%  -11.21% (n=100)
GorillaMux_Param20             5.438µ ± 0%   3.667µ ± 0%  -32.57% (n=100)
GorillaMux_ParamWrite          1.496µ ± 0%   1.264µ ± 0%  -15.51% (n=100)
GorillaMux_GithubStatic        3.315µ ± 0%   2.809µ ± 0%  -15.26% (n=100)
GorillaMux_GithubParam         5.021µ ± 0%   4.782µ ± 0%   -4.76% (n=100)
GorillaMux_GithubAll           2.620m ± 0%   2.510m ± 1%   -4.20% (n=100)
GorillaMux_GPlusStatic         973.9n ± 0%   489.2n ± 1%  -49.76% (n=100)
GorillaMux_GPlusParam          1.915µ ± 0%   1.707µ ± 0%  -10.86% (n=100)
GorillaMux_GPlus2Params        3.786µ ± 0%   3.575µ ± 0%   -5.57% (n=100)
GorillaMux_GPlusAll            30.20µ ± 0%   26.44µ ± 0%  -12.45% (n=100)
GorillaMux_ParseStatic        1199.0n ± 0%   707.7n ± 0%  -40.98% (n=100)
GorillaMux_ParseParam          1.477µ ± 0%   1.238µ ± 0%  -16.19% (n=100)
GorillaMux_Parse2Params        1.795µ ± 0%   1.565µ ± 0%  -12.81% (n=100)
GorillaMux_ParseAll            59.11µ ± 0%   49.76µ ± 0%  -15.82% (n=100)
GorillaMux_StaticAll           662.8µ ± 1%   577.4µ ± 0%  -12.89% (n=100)
geomean                        6.957µ        5.669µ       -18.52%

                        │ /logs/before.txt │        /logs/after.txt        │
                        │       B/op       │     B/op      vs base         │
GorillaMux_Param               1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_Param5              1088.0 ± 0%     832.0 ± 0%  -23.53% (n=100)
GorillaMux_Param20            3.087Ki ± 0%   1.952Ki ± 0%  -36.76% (n=100)
GorillaMux_ParamWrite          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_GithubStatic         720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_GithubParam         1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_GithubAll          195.0Ki ± 0%   140.3Ki ± 0%  -28.04% (n=100)
GorillaMux_GPlusStatic          720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_GPlusParam          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_GPlus2Params        1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_GPlusAll          12.484Ki ± 0%   9.016Ki ± 0%  -27.78% (n=100)
GorillaMux_ParseStatic          720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_ParseParam          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_Parse2Params        1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_ParseAll           23.08Ki ± 0%   15.48Ki ± 0%  -32.90% (n=100)
GorillaMux_StaticAll         110.39Ki ± 0%   53.97Ki ± 0%  -51.11% (n=100)
geomean                       2.688Ki        1.775Ki       -33.97%

                        │ /logs/before.txt │       /logs/after.txt        │
                        │    allocs/op     │  allocs/op   vs base         │
GorillaMux_Param                8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Param5               8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Param20             10.000 ± 0%    7.000 ± 0%  -30.00% (n=100)
GorillaMux_ParamWrite           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GithubStatic         7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_GithubParam          8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GithubAll           1.588k ± 0%   1.277k ± 0%  -19.58% (n=100)
GorillaMux_GPlusStatic          7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_GPlusParam           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GPlus2Params         8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GPlusAll            102.00 ± 0%    83.00 ± 0%  -18.63% (n=100)
GorillaMux_ParseStatic          7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_ParseParam           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Parse2Params         8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_ParseAll             198.0 ± 0%    142.0 ± 0%  -28.28% (n=100)
GorillaMux_StaticAll           1099.0 ± 0%    471.0 ± 0%  -57.14% (n=100)
geomean                         21.46         15.15       -29.40%

Older Xeon Gold, 2.3 GHz

Before vs After with the OmitRouteFromContext flag enabled

(The actual CPU is an Intel(R) Xeon(R) Gold 5122 CPU @ 2.30GHz, not the generic identifier reported below.)

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel Xeon Processor (Skylake, IBRS)
                        │ /logs/before.txt │      /logs/after-omit-route.txt      │
                        │      sec/op      │   sec/op     vs base                 │
GorillaMux_Param               3.345µ ± 3%   2.649µ ± 3%  -20.80% (n=100)
GorillaMux_Param5              5.206µ ± 2%   5.023µ ± 2%   -3.52% (p=0.046 n=100)
GorillaMux_Param20            12.866µ ± 2%   9.010µ ± 3%  -29.97% (n=100)
GorillaMux_ParamWrite          3.450µ ± 4%   2.999µ ± 2%  -13.06% (n=100)
GorillaMux_GithubStatic        7.957µ ± 2%   5.759µ ± 3%  -27.62% (n=100)
GorillaMux_GithubParam         12.54µ ± 1%   11.91µ ± 3%   -4.97% (p=0.000 n=100)
GorillaMux_GithubAll           6.019m ± 3%   5.616m ± 3%   -6.69% (p=0.000 n=100)
GorillaMux_GPlusStatic        2277.5n ± 1%   479.2n ± 3%  -78.96% (n=100)
GorillaMux_GPlusParam          5.246µ ± 2%   3.969µ ± 2%  -24.34% (n=100)
GorillaMux_GPlus2Params       10.060µ ± 3%   9.068µ ± 3%   -9.86% (p=0.000 n=100)
GorillaMux_GPlusAll            71.76µ ± 2%   58.00µ ± 3%  -19.18% (n=100)
GorillaMux_ParseStatic         2.753µ ± 3%   1.020µ ± 5%  -62.95% (n=100)
GorillaMux_ParseParam          3.557µ ± 2%   2.825µ ± 2%  -20.59% (n=100)
GorillaMux_Parse2Params        4.501µ ± 3%   3.750µ ± 2%  -16.68% (n=100)
GorillaMux_ParseAll            142.6µ ± 3%   111.9µ ± 3%  -21.50% (n=100)
GorillaMux_StaticAll           1.545m ± 2%   1.177m ± 3%  -23.80% (n=100)
geomean                        16.76µ        12.05µ       -28.12%

                        │ /logs/before.txt │  /logs/after-omit-route.txt   │
                        │       B/op       │     B/op      vs base         │
GorillaMux_Param               1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_Param5              1088.0 ± 0%     784.0 ± 0%  -27.94% (n=100)
GorillaMux_Param20            3.087Ki ± 0%   1.905Ki ± 0%  -38.28% (n=100)
GorillaMux_ParamWrite          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_GithubStatic        720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_GithubParam         1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_GithubAll          195.0Ki ± 0%   121.8Ki ± 0%  -37.54% (n=100)
GorillaMux_GPlusStatic         720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_GPlusParam          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_GPlus2Params        1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_GPlusAll          12.484Ki ± 0%   7.906Ki ± 0%  -36.67% (n=100)
GorillaMux_ParseStatic         720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_ParseParam          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_Parse2Params        1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_ParseAll           23.08Ki ± 0%   11.77Ki ± 0%  -49.02% (n=100)
GorillaMux_StaticAll        110.394Ki ± 0%   7.359Ki ± 0%  -93.33% (n=100)
geomean                       2.688Ki        1.008Ki       -62.49%

                        │ /logs/before.txt │  /logs/after-omit-route.txt  │
                        │    allocs/op     │  allocs/op   vs base         │
GorillaMux_Param                8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Param5               8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Param20             10.000 ± 0%    6.000 ± 0%  -40.00% (n=100)
GorillaMux_ParamWrite           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GithubStatic         7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_GithubParam          8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GithubAll           1.588k ± 0%   1.038k ± 0%  -34.63% (n=100)
GorillaMux_GPlusStatic          7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_GPlusParam           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GPlus2Params         8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GPlusAll            102.00 ± 0%    68.00 ± 0%  -33.33% (n=100)
GorillaMux_ParseStatic          7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_ParseParam           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Parse2Params         8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_ParseAll             198.0 ± 0%    106.0 ± 0%  -46.46% (n=100)
GorillaMux_StaticAll           1099.0 ± 0%    157.0 ± 0%  -85.71% (n=100)
geomean                         21.46         10.11       -52.91%

Before vs After

(The actual CPU is an Intel(R) Xeon(R) Gold 5122 CPU @ 2.30GHz, not the generic identifier reported below.)

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel Xeon Processor (Skylake, IBRS)
                        │ /logs/before.txt │           /logs/after.txt            │
                        │      sec/op      │   sec/op     vs base                 │
GorillaMux_Param               3.345µ ± 3%   2.933µ ± 3%  -12.29% (n=100)
GorillaMux_Param5              5.206µ ± 2%   4.460µ ± 4%  -14.33% (n=100)
GorillaMux_Param20            12.866µ ± 2%   9.037µ ± 4%  -29.76% (n=100)
GorillaMux_ParamWrite          3.450µ ± 4%   3.323µ ± 2%   -3.68% (p=0.023 n=100)
GorillaMux_GithubStatic        7.957µ ± 2%   6.610µ ± 3%  -16.93% (n=100)
GorillaMux_GithubParam         12.54µ ± 1%   11.76µ ± 5%   -6.17% (p=0.000 n=100)
GorillaMux_GithubAll           6.019m ± 3%   5.687m ± 3%   -5.52% (p=0.000 n=100)
GorillaMux_GPlusStatic         2.277µ ± 1%   1.144µ ± 2%  -49.77% (n=100)
GorillaMux_GPlusParam          5.246µ ± 2%   4.133µ ± 3%  -21.23% (n=100)
GorillaMux_GPlus2Params       10.060µ ± 3%   9.003µ ± 2%  -10.50% (p=0.000 n=100)
GorillaMux_GPlusAll            71.76µ ± 2%   64.25µ ± 3%  -10.46% (p=0.000 n=100)
GorillaMux_ParseStatic         2.753µ ± 3%   1.715µ ± 2%  -37.70% (n=100)
GorillaMux_ParseParam          3.557µ ± 2%   2.958µ ± 2%  -16.84% (n=100)
GorillaMux_Parse2Params        4.501µ ± 3%   4.009µ ± 2%  -10.92% (p=0.000 n=100)
GorillaMux_ParseAll            142.6µ ± 3%   121.4µ ± 3%  -14.90% (n=100)
GorillaMux_StaticAll           1.545m ± 2%   1.301m ± 2%  -15.80% (n=100)
geomean                        16.76µ        13.69µ       -18.32%

                        │ /logs/before.txt │        /logs/after.txt        │
                        │       B/op       │     B/op      vs base         │
GorillaMux_Param               1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_Param5              1088.0 ± 0%     832.0 ± 0%  -23.53% (n=100)
GorillaMux_Param20            3.087Ki ± 0%   1.952Ki ± 0%  -36.76% (n=100)
GorillaMux_ParamWrite          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_GithubStatic         720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_GithubParam         1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_GithubAll          195.0Ki ± 0%   140.3Ki ± 0%  -28.04% (n=100)
GorillaMux_GPlusStatic          720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_GPlusParam          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_GPlus2Params        1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_GPlusAll          12.484Ki ± 0%   9.016Ki ± 0%  -27.78% (n=100)
GorillaMux_ParseStatic          720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_ParseParam          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_Parse2Params        1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_ParseAll           23.08Ki ± 0%   15.48Ki ± 0%  -32.90% (n=100)
GorillaMux_StaticAll         110.39Ki ± 0%   53.97Ki ± 0%  -51.11% (n=100)
geomean                       2.688Ki        1.775Ki       -33.97%

                        │ /logs/before.txt │       /logs/after.txt        │
                        │    allocs/op     │  allocs/op   vs base         │
GorillaMux_Param                8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Param5               8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Param20             10.000 ± 0%    7.000 ± 0%  -30.00% (n=100)
GorillaMux_ParamWrite           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GithubStatic         7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_GithubParam          8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GithubAll           1.588k ± 0%   1.277k ± 0%  -19.58% (n=100)
GorillaMux_GPlusStatic          7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_GPlusParam           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GPlus2Params         8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GPlusAll            102.00 ± 0%    83.00 ± 0%  -18.63% (n=100)
GorillaMux_ParseStatic          7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_ParseParam           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Parse2Params         8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_ParseAll             198.0 ± 0%    142.0 ± 0%  -28.28% (n=100)
GorillaMux_StaticAll           1099.0 ± 0%    471.0 ± 0%  -57.14% (n=100)
geomean                         21.46         15.15       -29.40%

Older Xeon E, 2.4 GHz

Before vs After with omit Route flag enabled

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
                        │ /logs/before.txt │      /logs/after-omit-route.txt      │
                        │      sec/op      │   sec/op     vs base                 │
GorillaMux_Param               2.429µ ± 1%   2.092µ ± 1%  -13.84% (n=100)
GorillaMux_Param5              3.664µ ± 1%   3.230µ ± 1%  -11.85% (n=100)
GorillaMux_Param20             9.108µ ± 1%   6.534µ ± 1%  -28.27% (n=100)
GorillaMux_ParamWrite          2.477µ ± 1%   2.165µ ± 1%  -12.60% (n=100)
GorillaMux_GithubStatic        5.673µ ± 1%   3.775µ ± 1%  -33.47% (n=100)
GorillaMux_GithubParam         8.213µ ± 0%   7.861µ ± 1%   -4.29% (n=100)
GorillaMux_GithubAll           4.464m ± 1%   4.359m ± 1%   -2.34% (p=0.000 n=100)
GorillaMux_GPlusStatic        1712.0n ± 1%   334.9n ± 1%  -80.44% (n=100)
GorillaMux_GPlusParam          3.270µ ± 1%   2.849µ ± 1%  -12.86% (n=100)
GorillaMux_GPlus2Params        6.418µ ± 1%   5.953µ ± 0%   -7.25% (n=100)
GorillaMux_GPlusAll            50.79µ ± 1%   43.23µ ± 1%  -14.88% (n=100)
GorillaMux_ParseStatic        2065.0n ± 1%   662.1n ± 1%  -67.94% (n=100)
GorillaMux_ParseParam          2.544µ ± 1%   2.152µ ± 1%  -15.41% (n=100)
GorillaMux_Parse2Params        3.062µ ± 1%   2.691µ ± 1%  -12.15% (n=100)
GorillaMux_ParseAll           102.66µ ± 2%   77.35µ ± 1%  -24.66% (n=100)
GorillaMux_StaticAll          1139.7µ ± 1%   757.3µ ± 1%  -33.55% (n=100)
geomean                        11.81µ        8.455µ       -28.42%

                        │ /logs/before.txt │  /logs/after-omit-route.txt   │
                        │       B/op       │     B/op      vs base         │
GorillaMux_Param               1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_Param5              1088.0 ± 0%     784.0 ± 0%  -27.94% (n=100)
GorillaMux_Param20            3.087Ki ± 0%   1.905Ki ± 0%  -38.28% (n=100)
GorillaMux_ParamWrite          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_GithubStatic        720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_GithubParam         1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_GithubAll          195.0Ki ± 0%   121.8Ki ± 0%  -37.54% (n=100)
GorillaMux_GPlusStatic         720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_GPlusParam          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_GPlus2Params        1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_GPlusAll          12.484Ki ± 0%   7.906Ki ± 0%  -36.67% (n=100)
GorillaMux_ParseStatic         720.00 ± 0%     48.00 ± 0%  -93.33% (n=100)
GorillaMux_ParseParam          1024.0 ± 0%     720.0 ± 0%  -29.69% (n=100)
GorillaMux_Parse2Params        1040.0 ± 0%     736.0 ± 0%  -29.23% (n=100)
GorillaMux_ParseAll           23.08Ki ± 0%   11.77Ki ± 0%  -49.02% (n=100)
GorillaMux_StaticAll        110.394Ki ± 0%   7.359Ki ± 0%  -93.33% (n=100)
geomean                       2.688Ki        1.008Ki       -62.49%

                        │ /logs/before.txt │  /logs/after-omit-route.txt  │
                        │    allocs/op     │  allocs/op   vs base         │
GorillaMux_Param                8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Param5               8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Param20             10.000 ± 0%    6.000 ± 0%  -40.00% (n=100)
GorillaMux_ParamWrite           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GithubStatic         7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_GithubParam          8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GithubAll           1.588k ± 0%   1.038k ± 0%  -34.63% (n=100)
GorillaMux_GPlusStatic          7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_GPlusParam           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GPlus2Params         8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_GPlusAll            102.00 ± 0%    68.00 ± 0%  -33.33% (n=100)
GorillaMux_ParseStatic          7.000 ± 0%    1.000 ± 0%  -85.71% (n=100)
GorillaMux_ParseParam           8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_Parse2Params         8.000 ± 0%    6.000 ± 0%  -25.00% (n=100)
GorillaMux_ParseAll             198.0 ± 0%    106.0 ± 0%  -46.46% (n=100)
GorillaMux_StaticAll           1099.0 ± 0%    157.0 ± 0%  -85.71% (n=100)
geomean                         21.46         10.11       -52.91%

Before vs After

goos: linux
goarch: amd64
pkg: github.com/julienschmidt/go-http-routing-benchmark
cpu: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
                        │ /logs/before.txt │       /logs/after.txt        │
                        │      sec/op      │   sec/op     vs base         │
GorillaMux_Param               2.429µ ± 1%   2.097µ ± 1%  -13.63% (n=100)
GorillaMux_Param5              3.664µ ± 1%   3.197µ ± 1%  -12.75% (n=100)
GorillaMux_Param20             9.108µ ± 1%   6.308µ ± 1%  -30.75% (n=100)
GorillaMux_ParamWrite          2.477µ ± 1%   2.264µ ± 1%   -8.60% (n=100)
GorillaMux_GithubStatic        5.673µ ± 1%   4.556µ ± 0%  -19.70% (n=100)
GorillaMux_GithubParam         8.213µ ± 0%   7.829µ ± 0%   -4.68% (n=100)
GorillaMux_GithubAll           4.464m ± 1%   4.320m ± 1%   -3.21% (n=100)
GorillaMux_GPlusStatic        1712.0n ± 1%   869.9n ± 1%  -49.19% (n=100)
GorillaMux_GPlusParam          3.270µ ± 1%   2.920µ ± 1%  -10.69% (n=100)
GorillaMux_GPlus2Params        6.418µ ± 1%   6.161µ ± 1%   -4.00% (n=100)
GorillaMux_GPlusAll            50.79µ ± 1%   46.16µ ± 1%   -9.12% (n=100)
GorillaMux_ParseStatic         2.065µ ± 1%   1.249µ ± 1%  -39.52% (n=100)
GorillaMux_ParseParam          2.544µ ± 1%   2.253µ ± 1%  -11.44% (n=100)
GorillaMux_Parse2Params        3.062µ ± 1%   2.797µ ± 1%   -8.65% (n=100)
GorillaMux_ParseAll           102.66µ ± 2%   86.81µ ± 1%  -15.44% (n=100)
GorillaMux_StaticAll          1139.7µ ± 1%   927.5µ ± 0%  -18.62% (n=100)
geomean                        11.81µ        9.758µ       -17.38%

                        │ /logs/before.txt │        /logs/after.txt        │
                        │       B/op       │     B/op      vs base         │
GorillaMux_Param               1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_Param5              1088.0 ± 0%     832.0 ± 0%  -23.53% (n=100)
GorillaMux_Param20            3.087Ki ± 0%   1.952Ki ± 0%  -36.76% (n=100)
GorillaMux_ParamWrite          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_GithubStatic         720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_GithubParam         1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_GithubAll          195.0Ki ± 0%   140.3Ki ± 0%  -28.04% (n=100)
GorillaMux_GPlusStatic          720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_GPlusParam          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_GPlus2Params        1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_GPlusAll          12.484Ki ± 0%   9.016Ki ± 0%  -27.78% (n=100)
GorillaMux_ParseStatic          720.0 ± 0%     352.0 ± 0%  -51.11% (n=100)
GorillaMux_ParseParam          1024.0 ± 0%     768.0 ± 0%  -25.00% (n=100)
GorillaMux_Parse2Params        1040.0 ± 0%     784.0 ± 0%  -24.62% (n=100)
GorillaMux_ParseAll           23.08Ki ± 0%   15.48Ki ± 0%  -32.90% (n=100)
GorillaMux_StaticAll         110.39Ki ± 0%   53.97Ki ± 0%  -51.11% (n=100)
geomean                       2.688Ki        1.775Ki       -33.97%

                        │ /logs/before.txt │       /logs/after.txt        │
                        │    allocs/op     │  allocs/op   vs base         │
GorillaMux_Param                8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Param5               8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Param20             10.000 ± 0%    7.000 ± 0%  -30.00% (n=100)
GorillaMux_ParamWrite           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GithubStatic         7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_GithubParam          8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GithubAll           1.588k ± 0%   1.277k ± 0%  -19.58% (n=100)
GorillaMux_GPlusStatic          7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_GPlusParam           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GPlus2Params         8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_GPlusAll            102.00 ± 0%    83.00 ± 0%  -18.63% (n=100)
GorillaMux_ParseStatic          7.000 ± 0%    3.000 ± 0%  -57.14% (n=100)
GorillaMux_ParseParam           8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_Parse2Params         8.000 ± 0%    7.000 ± 0%  -12.50% (n=100)
GorillaMux_ParseAll             198.0 ± 0%    142.0 ± 0%  -28.28% (n=100)
GorillaMux_StaticAll           1099.0 ± 0%    471.0 ± 0%  -57.14% (n=100)
geomean                         21.46         15.15       -29.40%

@das7pad
Contributor Author

das7pad commented Sep 4, 2023

FWIW: The commits of this PR were cherry-picked into MinIO's fork of mux. That fork has powered the MinIO server for the past 7 months, via minio/minio#16456. This exposure gave the PR a very decent "manual test in production".
(For context: MinIO is a popular self-hosted, S3-compatible object storage server.)

@das7pad das7pad requested a review from jackgris September 4, 2023 18:43
@luisdavim

@coreydaley, any chance you could take a look and maybe get this merged? Thanks.

@coreydaley coreydaley enabled auto-merge (squash) November 13, 2023 00:58
@coreydaley coreydaley requested review from coreydaley and AlexVulaj and removed request for jackgris November 13, 2023 01:00
@coreydaley
Contributor

@AlexVulaj AFAIK this looks fine, would you mind also taking a look at it?

@AlexVulaj
Member

AlexVulaj commented Nov 13, 2023

Left a few comments. It also looks like something unrelated to this PR is triggering a flag in the security scan for Go 1.21. @coreydaley, we can likely ignore that for this PR, but we should take a deeper look into it.

auto-merge was automatically disabled November 16, 2023 18:56

Head branch was pushed to by a user without write access

das7pad and others added 7 commits December 6, 2023 23:55
Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
Save 3 allocations worth 448B per served request with no vars.
Save 2 allocations worth 400B per served request with vars.
Populating the request ctx takes roughly 1400ns before vs. roughly 200ns after.

```
$ go test -benchmem -benchtime 1000000x -bench BenchmarkVars
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkVarsOld-8       1000000              1430 ns/op             896 B/op          6 allocs/op
BenchmarkVarsEmpty-8     1000000               184.3 ns/op           448 B/op          3 allocs/op
BenchmarkVarsSet-8       1000000               221.7 ns/op           496 B/op          4 allocs/op
PASS
ok      github.com/gorilla/mux  1.863s
$ go test -benchmem -benchtime 1000000x -bench BenchmarkVars
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkVarsOld-8       1000000              1435 ns/op             896 B/op          6 allocs/op
BenchmarkVarsEmpty-8     1000000               184.3 ns/op           448 B/op          3 allocs/op
BenchmarkVarsSet-8       1000000               228.2 ns/op           496 B/op          4 allocs/op
PASS
ok      github.com/gorilla/mux  1.876s
$ go test -benchmem -benchtime 1000000x -bench BenchmarkVars
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkVarsOld-8       1000000              1390 ns/op             896 B/op          6 allocs/op
BenchmarkVarsEmpty-8     1000000               188.8 ns/op           448 B/op          3 allocs/op
BenchmarkVarsSet-8       1000000               225.8 ns/op           496 B/op          4 allocs/op
PASS
ok      github.com/gorilla/mux  1.832s
```

```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkPopulateContext
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkPopulateContext/no_populated_vars-8             5000000               570.6 ns/op           560 B/op          6 allocs/op
BenchmarkPopulateContext/empty_var-8                     5000000               872.3 ns/op           928 B/op          9 allocs/op
BenchmarkPopulateContext/populated_vars-8                5000000               861.6 ns/op           912 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  11.547s
```

```
func requestWithVarsOld(r *http.Request, vars map[string]string) *http.Request {
	ctx := context.WithValue(r.Context(), varsKey, vars)
	return r.WithContext(ctx)
}

func requestWithRouteOld(r *http.Request, route *Route) *http.Request {
	ctx := context.WithValue(r.Context(), routeKey, route)
	return r.WithContext(ctx)
}

func BenchmarkVarsOld(b *testing.B) {
	req := newRequest(http.MethodGet, "http://localhost/")
	r := new(Route)
	var vars map[string]string
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		req = requestWithVarsOld(req, vars)
		req = requestWithRouteOld(req, r)
	}
}

func BenchmarkVarsEmpty(b *testing.B) {
	req := newRequest(http.MethodGet, "http://localhost/")
	r := new(Route)
	var vars map[string]string
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		requestWithRouteAndVars(req, r, vars)
	}
}

func BenchmarkVarsSet(b *testing.B) {
	req := newRequest(http.MethodGet, "http://localhost/")
	r := new(Route)
	vars := map[string]string{"foo": "bar"}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		requestWithRouteAndVars(req, r, vars)
	}
}
```

Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
Save one allocation worth 48B per request on route w/o vars.

Before:
```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkPopulateContext
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkPopulateContext/no_populated_vars-8             5000000               570.6 ns/op           560 B/op          6 allocs/op
BenchmarkPopulateContext/empty_var-8                     5000000               872.3 ns/op           928 B/op          9 allocs/op
BenchmarkPopulateContext/populated_vars-8                5000000               861.6 ns/op           912 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  11.547s
```

After:
```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkPopulateContext
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkPopulateContext/no_populated_vars-8             5000000               530.7 ns/op           512 B/op          5 allocs/op
BenchmarkPopulateContext/empty_var-8                     5000000               969.2 ns/op           928 B/op          9 allocs/op
BenchmarkPopulateContext/populated_vars-8                5000000               944.7 ns/op           912 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  12.246s
```
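
One way to get this kind of saving is to leave the vars map nil until a matcher actually has something to store. The sketch below illustrates the idea only; `setVar` is a hypothetical helper and `RouteMatch` is reduced to a single field, so it should not be read as the change made in this commit.

```go
package main

import "fmt"

// RouteMatch is reduced to the one field relevant here.
type RouteMatch struct {
	Vars map[string]string
}

// setVar allocates the map on first use only. Routes without variables
// never call it, so their RouteMatch keeps a nil map.
func setVar(m *RouteMatch, name, value string) {
	if m.Vars == nil {
		m.Vars = make(map[string]string)
	}
	m.Vars[name] = value
}

func main() {
	static := &RouteMatch{}
	// Reading from a nil map is safe in Go and yields the zero value.
	fmt.Println(static.Vars == nil, static.Vars["id"] == "")

	dynamic := &RouteMatch{}
	setVar(dynamic, "id", "42")
	fmt.Println(dynamic.Vars["id"])
}
```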

Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
Save one allocation worth 16B per route matcher w/o named regexes/vars.
Also save one extra regex pass per route matcher w/o named regexes/vars.

Before:
```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkMuxSimple
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkMuxSimple-8     5000000               477.8 ns/op           512 B/op          5 allocs/op
PASS
ok      github.com/gorilla/mux  2.410s
```

After:
```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkMuxSimple
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkMuxSimple-8     5000000               379.3 ns/op           496 B/op          4 allocs/op
PASS
ok      github.com/gorilla/mux  1.917s
```

Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
Save 4 allocations worth 200B per request cycle with redirect.
The rewrite operation takes roughly 600ns before vs. roughly 200ns after.

```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkStrictSlash
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkStrictSlashClone-8      5000000               183.9 ns/op            64 B/op          4 allocs/op
BenchmarkStrictSlashParse-8      5000000               559.8 ns/op           264 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  3.740s

$ go test -benchmem -benchtime 5000000x -bench BenchmarkStrictSlash
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkStrictSlashClone-8      5000000               180.4 ns/op            64 B/op          4 allocs/op
BenchmarkStrictSlashParse-8      5000000               573.5 ns/op           264 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  3.788s

$ go test -benchmem -benchtime 5000000x -bench BenchmarkStrictSlash
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkStrictSlashClone-8      5000000               175.8 ns/op            64 B/op          4 allocs/op
BenchmarkStrictSlashParse-8      5000000               569.4 ns/op           264 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  3.744s
```

```
func BenchmarkStrictSlashClone(b *testing.B) {
	req := newRequest(http.MethodGet, "http://localhost/x")
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = replaceURLPath(req.URL, req.URL.Path+"/")
	}
}

func BenchmarkStrictSlashParse(b *testing.B) {
	req := newRequest(http.MethodGet, "http://localhost/x")
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		u, _ := url.Parse(req.URL.String())
		u.Path += "/"
		_ = u.String()
	}
}
```

Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
Optionally save 3 allocations worth 448B per request with no vars.

```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkMuxSimple
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkMuxSimple/default-8                             5000000               349.3 ns/op           496 B/op          4 allocs/op
BenchmarkMuxSimple/omit_route_from_ctx-8                 5000000               157.8 ns/op            48 B/op          1 allocs/op
PASS
ok      github.com/gorilla/mux  2.556s

$ go test -benchmem -benchtime 5000000x -bench BenchmarkMuxSimple
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkMuxSimple/default-8                             5000000               354.7 ns/op           496 B/op          4 allocs/op
BenchmarkMuxSimple/omit_route_from_ctx-8                 5000000               160.8 ns/op            48 B/op          1 allocs/op
PASS
ok      github.com/gorilla/mux  2.602s

$ go test -benchmem -benchtime 5000000x -bench BenchmarkMuxSimple
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkMuxSimple/default-8                             5000000               376.4 ns/op           496 B/op          4 allocs/op
BenchmarkMuxSimple/omit_route_from_ctx-8                 5000000               168.1 ns/op            48 B/op          1 allocs/op
PASS
ok      github.com/gorilla/mux  2.745s
```

```
$ go test -benchmem -benchtime 5000000x -bench BenchmarkPopulateContext
goos: linux
goarch: amd64
pkg: github.com/gorilla/mux
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkPopulateContext/no_populated_vars-8             5000000               381.6 ns/op           496 B/op          4 allocs/op
BenchmarkPopulateContext/empty_var-8                     5000000               913.6 ns/op           928 B/op          9 allocs/op
BenchmarkPopulateContext/populated_vars-8                5000000               914.0 ns/op           912 B/op          8 allocs/op
BenchmarkPopulateContext/omit_route_/static-8            5000000               168.6 ns/op            48 B/op          1 allocs/op
BenchmarkPopulateContext/omit_route_/dynamic-8           5000000               827.4 ns/op           880 B/op          8 allocs/op
PASS
ok      github.com/gorilla/mux  16.049s
```

Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
Co-authored-by: Alex Vulaj <avulaj@redhat.com>
Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
@AlexVulaj AlexVulaj force-pushed the perf-cut-allocations branch from 83efd14 to e136241 Compare December 7, 2023 04:55
@AlexVulaj AlexVulaj merged commit e44017d into gorilla:main Dec 7, 2023
10 of 12 checks passed

6 participants