Enable benchmark monitoring with regression CI hook #2265

jdegoes · 2023-06-16T13:50:51Z

We need JMH-based benchmarks to be run as part of CI, with automatic failure if performance on some benchmark falls below some threshold set in configuration.

jdegoes · 2023-07-27T12:40:04Z

/bounty $750

algora-pbc · 2023-07-27T12:40:06Z

💎 $750 bounty • ZIO

Steps to solve:

Start working: Comment /attempt #2265 with your implementation plan
Submit work: Create a pull request including /claim #2265 in the PR body to claim the bounty
Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Additional opportunities:

🔴 Livestream on Algora TV while solving this bounty & earn $200 upon merge! Comment /livestream once live

Thank you for contributing to zio/zio-http!

Add a bounty • Share on socials

Attempt	Started (GMT+0)	Solution
🔴 @kitlangton	Aug 3, 2023, 11:28:32 AM	WIP
🔴 @uzmi1	Oct 28, 2023, 2:50:36 PM	WIP
🔴 @nermalcat69	Dec 6, 2023, 5:49:54 PM	WIP
🟢 @alankritdabral	Mar 24, 2024, 2:12:38 PM	WIP

kitlangton · 2023-08-03T11:28:31Z

I've been making some good progress on this in a separate repo. /attempt #2265

I'm going to make a GitHub action that will parse the JMH output and compare its performance agains past run data (serialized and stored in a separate branch). If the benchmarks fall beneath the configured threshold, it will fail CI. I'm also going to try to have it post the benchmark results as a comment on the Pull Request.

This action can be a separate zio org project if it proves useful. It should be generic enough to attach to any zio project. (Also, the action itself is written with ZIO and Scala.js—so that's fun!)

Options

Cancel my attempt

kitlangton · 2023-08-03T13:31:39Z

Making some progress:

Let me know if you have any design thoughts/questions :)

plokhotnyuk · 2023-08-03T15:03:40Z

Thank you for your serious attitude towards zio-http performance! It is already one from the most fastests contemporary Scala web-servers, just see results here and here.

Here is a couple of ideas for JMH benchmarking:

Use gc and perfasm JMH profilers to store allocation rates as auxilary metrics and disassembled code generated by JIT for hottest places together with throughput results.
Use JMH visualizer to easily see main and auxilary metrics together with their confidence range and compare them interactively using references to raw .json files on GitHub, like here

Also, my 2 cents for HTTP-server benchmarking:

Measure latency for different combinations of fixed throughput rates and numbers opened of connections using wrk2. Here is a great talk of @giltene about understanding latency and measuring application responsiveness.
Use async-profiler during benchmarking to see what is happening under the hood. It's shows almost all what is happening under the hood: JVM, C++, kernel stack frames, virtual and interface calls (vtable and itable). If results are stored to .jfr format then they could be converted to Netflix's flamescope format and then be browsed interactively with 10ms granularity to observe different mode of server working (warming up, GC-ing, etc.).

kitlangton · 2023-08-04T16:18:01Z

UPDATE: I've created the following JMH Benchmark Action repository.

One bigger concern is that these benchmarks take a good deal of time to run—even on a relatively powerful M2 Mac. Doing this in CI, even configured with fewer iterations/forks, will take a good deal of time. One option would be to only run this action if there's a benchmark label added to the PR? Then, the maintainers can opt-in to benchmarks when it seems relevant to the work being done. (This is something that can be done with a few lines in the workflow yaml, so it doesn't have to be part of the main action)

UPDATE: And, as usual, the complexity unfurls itself as you approach the end. It turns out it's not as simple to merely "comment on a pull request" as it first appeared (more info here: #2369). But I have spotted a workaround.

Another thought from that PR: There's a lot of variance in certain very high ops/s benchmarks, so I should probably take the standard deviation into account when attempting to identify a regression, instead of just naively comparing the final scores.

kitlangton · 2023-08-04T20:38:51Z

/attempt #2265

Options

Cancel my attempt

kitlangton · 2023-08-05T13:36:14Z

Alrighty. A summary of open design questions:

How to report the results:
- Comment on PR: The naive way of commenting on a PR from within a workflow is rife with difficulty stemming from security issues. (https://securitylab.github.com/research/github-actions-preventing-pwn-requests/). After doing some research, it seems the safest way of achieving the PR comment summary would be to use 2 workflows, one for running benchmarks and another for commenting (this approach is described in the linked article). Example:
- Job Summary: Alternatively, would posting a job summary be a simpler and more efficient solution, avoiding the need for two separate workflows and the use of artifacts? Example:
Determining what counts as a regression
- There's a lot of variance in certain very high ops/s benchmarks, so I should probably take the standard deviation into account when attempting to identify a regression, instead of just naively comparing the final scores.
When should we run the benchmark workflow?
- Running every benchmark on every PR commit, even with a modest number of iterations, will chew through CI hours. Perhaps we could opt in to running benchmarks by having the workflow check for a Benchmark label. Another option, would be to explicitly run the benchmarks by having it watch for a comment like "/benchmark" or something, posted by a maintainer.

Ahmadkhan02 · 2023-10-05T19:43:24Z

@kitlangton are you still on this or can I make an attempt?

uzmi1 · 2023-10-28T14:44:58Z

/claim #2502

uzmi1 · 2023-10-28T14:50:35Z

/attempt #2265

Options

Cancel my attempt

uzmi1 · 2023-10-29T03:12:39Z

Hi Jdegoes- check solution-
Bug Description:
The current implementation lacks benchmark monitoring, and there is no CI hook for regression testing. This creates a gap in performance monitoring, potentially leading to undetected regressions and performance issues. The absence of benchmark monitoring makes it challenging to identify changes that negatively impact system performance.

Impact:

Undetected Performance Regressions:

Without benchmark monitoring, performance regressions may go unnoticed, leading to degraded system performance.
Missing Continuous Integration (CI) Hook:

Lack of a CI hook for regression testing means changes in the codebase may not undergo performance testing during the CI/CD pipeline.
Steps to Reproduce:

Inspect Current Monitoring Setup:

Observe the absence of benchmark monitoring in the current system.
Verify that there is no CI hook for regression testing related to performance.
Attempt to Enable Benchmark Monitoring:

Explore the system configuration or relevant scripts to enable benchmark monitoring.
Check for existing CI hooks related to performance.
Verify Implementation:

Execute benchmark monitoring after attempting to enable it.
Check if the CI hook triggers regression testing for performance-related changes.
Expected Behaviour:
1.Benchmark Monitoring Enabled:

After the task is completed, benchmark monitoring should be active, capturing relevant performance metrics.
CI Hook for Regression Testing:
A CI hook should be in place to trigger regression testing for performance-related changes in the codebase.
Suggested Solution:

Benchmark Monitoring:

Integrate a suitable benchmark monitoring tool or solution into the system configuration.
Configure the monitoring tool to capture relevant performance metrics.
CI Hook for Regression Testing:

Implement a CI hook that triggers regression testing for performance-related changes.
Integrate the CI hook into the existing CI/CD pipeline.
Code Implementation Example:

Example CI/CD Configuration (GitLab CI)

stages:

test

benchmark:
stage: test
script:
- ./run_benchmarks.sh
Recommendation:

Ensure the selected benchmark monitoring tool aligns with system requirements.
Regularly review and update the benchmark metrics being monitored to reflect evolving performance expectations.
Reported by:
Uzma Qureshi

Proof of Concept
simple proof of concept (PoC) to enable benchmark monitoring. Note that this is a generic example, and you may need to customize it based on your specific environment and the benchmark monitoring tool you choose.

Assuming you are using a Unix-like system and want to integrate Apache Benchmark (ab) for benchmarking, here's a basic script:

run_benchmarks.sh:

#!/bin/bash

Set variables

TARGET_URL="http://your-api-endpoint.com/"
BENCHMARK_RESULTS_FILE="benchmark_results.txt"

Run Apache Benchmark (ab)

ab -n 100 -c 10 $TARGET_URL > $BENCHMARK_RESULTS_FILE

Print benchmark results

cat $BENCHMARK_RESULTS_FILE

This script does the following:

It sends 100 requests (-n 100) with a concurrency of 10 (-c 10) to the specified API endpoint ($TARGET_URL).
The benchmark results are saved in a file named benchmark_results.txt.
The script then prints the benchmark results to the console.
Remember to replace "http://your-api-endpoint.com/" with the actual URL you want to benchmark.

algora-pbc · 2023-11-04T14:50:40Z

@uzmi1: Reminder that in 7 days the bounty will become up for grabs, so please submit a pull request before then 🙏

uzmi1 · 2023-11-05T14:44:33Z

/claim #2265

algora-pbc · 2023-11-11T14:50:42Z

The bounty is up for grabs! Everyone is welcome to /attempt #2265 🙌

nermalcat69 · 2023-12-06T17:49:52Z

/attempt #2265

Options

Cancel my attempt

algora-pbc · 2023-12-13T17:49:58Z

@nermalcat69: Reminder that in 7 days the bounty will become up for grabs, so please submit a pull request before then 🙏

algora-pbc · 2023-12-20T17:49:57Z

The bounty is up for grabs! Everyone is welcome to /attempt #2265 🙌

alankritdabral · 2024-03-24T14:12:14Z

After digging little bit i found few flaws in the build :

The path to benchmark files is outdated which results in no jmh bechmarks in ci.yml .
The jdk version 8 in ci.yml results error during jmh run.
The UtilBenchmark runs in avgt mode unlike others so using grep "thrpt" will throw error.

alankritdabral · 2024-03-24T14:12:36Z

/attempt 2265

Algora profile	Completed bounties	Tech	Active attempts	Options
@alankritdabral	2 bounties from 2 projects			Cancel attempt

alankritdabral · 2024-03-24T14:59:14Z

Currently, we're running each benchmark in parallel for both the current branch and the base branch, which doubles the time required. The approach I'm considering is to run the base benchmark with each push to the main branch and save its artifact. During a pull request run, we'll execute the benchmarks for the current branch, download the base artifacts, compare the current benchmarks with the base benchmarks using a shell script, and upload the results simultaneously. If the benchmarks exceed a certain threshold, we will break the CI.

I will divide the task into two PRs.

Firstly, run benchmarks on push to the main branch only and save them as cache. chore: cache jmh benchmarks #2750
Secondly, run benchmarks on each pull request and compare its results with the base benchmarks and show the difference. chore: compare jmh benchmarks #2751

alankritdabral · 2024-03-29T08:07:04Z

@jdegoes I have created a pull request for the current issue pull_request.
Hope you find this pr useful. 😄 :

alankritdabral · 2024-04-05T15:27:08Z

hey @jdegoes #2751 would completely close this issue as i have i have divided the solution in two pr as stated in the above comment can you review; it its in working state.

jdegoes added the enhancement New feature or request label Jun 16, 2023

algora-pbc bot added the 💎 Bounty label Jul 27, 2023

kitlangton mentioned this issue Aug 4, 2023

Add Benchmark CI #2364

Closed

kitlangton mentioned this issue Aug 4, 2023

Add benchmark CI #2368

Closed

This was referenced Oct 11, 2023

Client Performance Not Great OOTB #2117

Closed

Move the performance-tests into separate units jamesward/zio-http-client-perf#5

Open

This was referenced Mar 29, 2024

chore: cache jmh benchmarks #2750

Merged

chore: compare jmh benchmarks #2751

Closed

jdegoes closed this as completed in #2750 Apr 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable benchmark monitoring with regression CI hook #2265

Enable benchmark monitoring with regression CI hook #2265

jdegoes commented Jun 16, 2023 •

edited

Loading

jdegoes commented Jul 27, 2023

algora-pbc bot commented Jul 27, 2023 •

edited

Loading

kitlangton commented Aug 3, 2023 •

edited

Loading

kitlangton commented Aug 3, 2023 •

edited

Loading

plokhotnyuk commented Aug 3, 2023 •

edited

Loading

kitlangton commented Aug 4, 2023 •

edited

Loading

kitlangton commented Aug 4, 2023 •

edited by algora-pbc bot

Loading

kitlangton commented Aug 5, 2023 •

edited

Loading

Ahmadkhan02 commented Oct 5, 2023

uzmi1 commented Oct 28, 2023

uzmi1 commented Oct 28, 2023 •

edited by algora-pbc bot

Loading

uzmi1 commented Oct 29, 2023

algora-pbc bot commented Nov 4, 2023

uzmi1 commented Nov 5, 2023

algora-pbc bot commented Nov 11, 2023

nermalcat69 commented Dec 6, 2023 •

edited by algora-pbc bot

Loading

algora-pbc bot commented Dec 13, 2023

algora-pbc bot commented Dec 20, 2023

alankritdabral commented Mar 24, 2024 •

edited

Loading

alankritdabral commented Mar 24, 2024 •

edited by algora-pbc bot

Loading

alankritdabral commented Mar 24, 2024 •

edited

Loading

alankritdabral commented Mar 29, 2024

alankritdabral commented Apr 5, 2024 •

edited

Loading

Enable benchmark monitoring with regression CI hook #2265

Enable benchmark monitoring with regression CI hook #2265

Comments

jdegoes commented Jun 16, 2023 • edited Loading

jdegoes commented Jul 27, 2023

algora-pbc bot commented Jul 27, 2023 • edited Loading

💎 $750 bounty • ZIO

Steps to solve:

Additional opportunities:

kitlangton commented Aug 3, 2023 • edited Loading

kitlangton commented Aug 3, 2023 • edited Loading

plokhotnyuk commented Aug 3, 2023 • edited Loading

kitlangton commented Aug 4, 2023 • edited Loading

kitlangton commented Aug 4, 2023 • edited by algora-pbc bot Loading

kitlangton commented Aug 5, 2023 • edited Loading

Ahmadkhan02 commented Oct 5, 2023

uzmi1 commented Oct 28, 2023

uzmi1 commented Oct 28, 2023 • edited by algora-pbc bot Loading

uzmi1 commented Oct 29, 2023

Example CI/CD Configuration (GitLab CI)

algora-pbc bot commented Nov 4, 2023

uzmi1 commented Nov 5, 2023

algora-pbc bot commented Nov 11, 2023

nermalcat69 commented Dec 6, 2023 • edited by algora-pbc bot Loading

algora-pbc bot commented Dec 13, 2023

algora-pbc bot commented Dec 20, 2023

alankritdabral commented Mar 24, 2024 • edited Loading

alankritdabral commented Mar 24, 2024 • edited by algora-pbc bot Loading

alankritdabral commented Mar 24, 2024 • edited Loading

alankritdabral commented Mar 29, 2024

alankritdabral commented Apr 5, 2024 • edited Loading

jdegoes commented Jun 16, 2023 •

edited

Loading

algora-pbc bot commented Jul 27, 2023 •

edited

Loading

kitlangton commented Aug 3, 2023 •

edited

Loading

kitlangton commented Aug 3, 2023 •

edited

Loading

plokhotnyuk commented Aug 3, 2023 •

edited

Loading

kitlangton commented Aug 4, 2023 •

edited

Loading

kitlangton commented Aug 4, 2023 •

edited by algora-pbc bot

Loading

kitlangton commented Aug 5, 2023 •

edited

Loading

uzmi1 commented Oct 28, 2023 •

edited by algora-pbc bot

Loading

nermalcat69 commented Dec 6, 2023 •

edited by algora-pbc bot

Loading

alankritdabral commented Mar 24, 2024 •

edited

Loading

alankritdabral commented Mar 24, 2024 •

edited by algora-pbc bot

Loading

alankritdabral commented Mar 24, 2024 •

edited

Loading

alankritdabral commented Apr 5, 2024 •

edited

Loading