Baseline change doesn't fail test on detected regression #824

Open
RimaitosLab opened this issue Nov 7, 2024 · 0 comments

Hi, I'm not sure whether this is more of a bug report or a feature request. I'm currently trying to set up a CI pipeline that performs performance regression tests (yes, I'm aware that this has to be carefully designed, and I'm using a machine dedicated solely to performance testing to improve the consistency of results).

To perform these regression tests, I first run cargo bench -- --save-baseline base on the old codebase, then update the codebase to the new state and run cargo bench -- --baseline base.

However, while the second run does detect a regression, it still returns an exit code of 0, indicating success to the CI pipeline.
I expected a non-zero exit code.

Is there any way to get criterion to return a non-zero exit code when any test has regressed?
This could be useful for CI pipelines which should prevent performance regressions.

Log:

cargo bench --features benchmark -- --baseline-lenient base ; echo $?

Finished `bench` profile [optimized] target(s) in 1.58s
Running benches/cost_calculators.rs (target/release/deps/cost_calculators-f94e90839cccec69)
circulation_coffee      time:   [230.84 µs 234.18 µs 237.60 µs]
change: [+16.071% +18.179% +20.083%] (p = 0.00 < 0.05)
Performance has regressed.
Found 175 outliers among 2500 measurements (7.00%)
88 (3.52%) high mild
87 (3.48%) high severe

circulation_shelves     time:   [199.26 µs 200.99 µs 202.83 µs]
change: [+1.1908% +2.2891% +3.2929%] (p = 0.00 < 0.05)
Performance has regressed.
Found 213 outliers among 2500 measurements (8.52%)
110 (4.40%) high mild
103 (4.12%) high severe

circulation_student_dorm
time:   [96.695 µs 97.502 µs 98.328 µs]
change: [-4.6563% -3.5723% -2.4601%] (p = 0.00 < 0.05)
Performance has improved.
Found 124 outliers among 2500 measurements (4.96%)
93 (3.72%) high mild
31 (1.24%) high severe

[...]

0
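
One possible workaround, independent of criterion itself, is to scan the report text in CI and derive the exit code from it. The sketch below is in Python and assumes the report keeps printing the literal line "Performance has regressed." in the layout shown in the log above. The names find_regressions, exit_code and the script name check_regressions.py are made up for illustration and are not part of criterion.

```python
import re
import sys
from typing import List, Optional

# Literal marker copied from the log above; assumed to stay stable.
REGRESSION_MARKER = "Performance has regressed."

# A "time:" line either carries the benchmark name in front of it, or (for
# long names) the name sits alone on the previous non-empty line.
TIME_LINE = re.compile(r"^(?P<name>.*?)\s*time:\s+\[")


def find_regressions(report: str) -> List[str]:
    """Return the names of benchmarks whose report block contains the marker."""
    regressed: List[str] = []
    current: Optional[str] = None   # benchmark whose block we are inside
    previous: Optional[str] = None  # last non-empty line seen
    for raw in report.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TIME_LINE.match(line)
        if match:
            # Empty prefix means the name was printed on the previous line.
            current = match.group("name") or previous
        elif REGRESSION_MARKER in line:
            regressed.append(current or "<unknown>")
        previous = line
    return regressed


def exit_code(report: str) -> int:
    """Map a report to a process exit code: 1 if anything regressed, else 0."""
    return 1 if find_regressions(report) else 0


# Hypothetical command-line use: python3 check_regressions.py bench.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        text = handle.read()
    for name in find_regressions(text):
        print(f"regressed: {name}", file=sys.stderr)
    sys.exit(exit_code(text))
```

In a pipeline this could look like cargo bench -- --baseline base | tee bench.log followed by python3 check_regressions.py bench.log, so the second command fails the job. This only reacts to criterion's own verdict in the printed report, so any noise in that verdict carries over unchanged.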