forked from DataDog/datadog-agent
serverless-benchmarks.yml
name: "Serverless Benchmarks"

on:
  pull_request:
    paths:
      - 'cmd/serverless/**'
      - 'pkg/serverless/**'
      - '.github/workflows/serverless-benchmarks.yml'

env:
  DD_API_KEY: must-be-set

concurrency:
  group: ${{ github.workflow }}/PR#${{ github.event.pull_request.number }}
  cancel-in-progress: true

permissions: {}

jobs:
  baseline:
    name: Baseline
    runs-on: ubuntu-latest
    outputs:
      sha: ${{ steps.prepare.outputs.sha }}
    steps:
      - name: Checkout ${{ github.base_ref }}
        uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4.1.4
        with:
          ref: ${{ github.base_ref }}
          persist-credentials: false
      - name: Install Go
        uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
        with:
          go-version: stable
      - name: Prepare working tree
        id: prepare
        run: |
          echo "sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
          go get ./...
      - name: Run benchmark
        env:
          TEMP_RUNNER: ${{runner.temp}}
        run: |
          go test -tags=test -run='^$' -bench=StartEndInvocation -count=10 -benchtime=500ms -timeout=60m \
              ./pkg/serverless/... | tee "$TEMP_RUNNER"/benchmark.log
      - name: Upload result artifact
        uses: actions/upload-artifact@834a144ee995460fba8ed112a2fc961b36a5ec5a # v4.3.6
        with:
          name: baseline.log
          path: ${{runner.temp}}/benchmark.log
          if-no-files-found: error

  current:
    name: Current
    runs-on: ubuntu-latest
    outputs:
      sha: ${{ steps.prepare.outputs.sha }}
    steps:
      - name: Checkout ${{ github.ref }}
        uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4.1.4
        with:
          ref: ${{ github.sha }}
          persist-credentials: false
      - name: Install Go
        uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
        with:
          go-version: stable
      - name: Prepare working tree
        id: prepare
        run: |
          echo "sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
          go get ./...
      - name: Run benchmark
        env:
          TEMP_RUNNER: ${{runner.temp}}
        run: |
          go test -tags=test -run='^$' -bench=StartEndInvocation -count=10 -benchtime=500ms -timeout=60m \
              ./pkg/serverless/... | tee "$TEMP_RUNNER"/benchmark.log
      - name: Upload result artifact
        uses: actions/upload-artifact@834a144ee995460fba8ed112a2fc961b36a5ec5a # v4.3.6
        with:
          name: current.log
          path: ${{runner.temp}}/benchmark.log
          if-no-files-found: error

  summary:
    name: Summary
    runs-on: ubuntu-latest
    needs: [baseline, current]
    permissions:
      pull-requests: write
    steps:
      - name: Install Go
        uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
        with:
          go-version: stable
          cache: false
      - name: Install benchstat
        run: |
          go install golang.org/x/perf/cmd/benchstat@latest
      - name: Download baseline artifact
        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
        with:
          name: baseline.log
          path: baseline
      - name: Download current artifact
        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
        with:
          name: current.log
          path: current
      - name: Analyze results
        id: analyze
        run: |
          benchstat -row /event baseline/benchmark.log current/benchmark.log | tee analyze.txt
          echo "analyze<<EOF" >> "$GITHUB_OUTPUT"
          cat analyze.txt >> "$GITHUB_OUTPUT"
          echo "EOF" >> "$GITHUB_OUTPUT"
      - name: Post comment
        uses: marocchino/sticky-pull-request-comment@331f8f5b4215f0445d3c07b4967662a32a2d3e31 # v2.9.0
        with:
          header: serverless-benchmarks
          recreate: true
          message: |
            ## Serverless Benchmark Results

            `BenchmarkStartEndInvocation` comparison between ${{ needs.baseline.outputs.sha }} and ${{ needs.current.outputs.sha }}.

            <details>
            <summary>tl;dr</summary>

            Use these benchmarks as an insight tool during development.

            1. Skim down the `vs base` column in each chart. If there is a `~`, then there was no statistically significant change to the benchmark. Otherwise, ensure the estimated percent change is either negative or very small.

            2. The last row of each chart is the `geomean`. Ensure this percentage is either negative or very small.

            </details>

            <details>
            <summary>What is this benchmarking?</summary>

            The [`BenchmarkStartEndInvocation`](https://github.com/DataDog/datadog-agent/blob/main/pkg/serverless/daemon/routes_test.go) benchmark measures the time it takes to call the `start-invocation` and `end-invocation` endpoints. For universal instrumentation languages (Dotnet, Golang, Java, Ruby), this represents the majority of the duration overhead added by our tracing layer.

            The benchmark is run using a large variety of Lambda request payloads. In the charts below, there is one row for each event payload type.

            </details>

            <details>
            <summary>How do I interpret these charts?</summary>

            The charts below come from [`benchstat`](https://pkg.go.dev/golang.org/x/perf/cmd/benchstat). They represent the statistical change in _duration (sec/op)_, _memory overhead (B/op)_, and _allocations (allocs/op)_.

            The benchstat docs explain how to interpret these charts.

            > Before the comparison table, we see common file-level configuration. If there are benchmarks with different configuration (for example, from different packages), benchstat will print separate tables for each configuration.
            >
            > The table then compares the two input files for each benchmark. It shows the median and 95% confidence interval summaries for each benchmark before and after the change, and an A/B comparison under "vs base". ... The p-value measures how likely it is that any differences were due to random chance (i.e., noise). The "~" means benchstat did not detect a statistically significant difference between the two inputs. ...
            >
            > Note that "statistically significant" is not the same as "large": with enough low-noise data, even very small changes can be distinguished from noise and considered statistically significant. It is, of course, generally easier to distinguish large changes from noise.
            >
            > Finally, the last row of the table shows the geometric mean of each column, giving an overall picture of how the benchmarks changed. Proportional changes in the geomean reflect proportional changes in the benchmarks. For example, given n benchmarks, if sec/op for one of them increases by a factor of 2, then the sec/op geomean will increase by a factor of ⁿ√2.

            </details>

            <details>
            <summary>I need more help</summary>

            First off, do not worry if the benchmarks are failing. They are not tests. The intention is for them to be a tool for you to use during development.

            If you would like a hand interpreting the results, come chat with us in `#serverless-agent` in the internal DataDog Slack or in `#serverless` in the [public DataDog Slack](https://chat.datadoghq.com/). We're happy to help!

            </details>

            <details>
            <summary>Benchmark stats</summary>

            ```
            ${{ steps.analyze.outputs.analyze }}
            ```

            </details>