safekeeper: add basic WAL ingestion benchmarks #9531

erikgrinaker · 2024-10-26T12:15:39Z

Problem

We don't have any benchmarks for Safekeeper WAL ingestion.

Resolves #9339.
Blocked by #9614.

Summary of changes

Add some basic benchmarks for WAL ingestion, specifically for SafeKeeper::process_msg() (single append) and WalAcceptor (pipelined batch ingestion).
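
To illustrate the difference between the two paths being benchmarked, here is a minimal sketch using only the standard library. `MockWal`, `ingest_single`, and `ingest_batched` are hypothetical names for illustration and are not the Safekeeper API. The sketch assumes the single-append path flushes after every message and the batched path flushes once per batch, which is a simplification of how pipelined ingestion amortizes costs such as fsync.

```rust
/// Hypothetical in-memory stand-in for a WAL writer; not the Safekeeper API.
#[derive(Default)]
struct MockWal {
    buf: Vec<u8>,
    flushes: usize,
}

impl MockWal {
    /// Buffer a record without making it durable.
    fn append(&mut self, record: &[u8]) {
        self.buf.extend_from_slice(record);
    }

    /// Stand-in for an fsync; here it only counts invocations.
    fn flush(&mut self) {
        self.flushes += 1;
    }
}

/// Single-append path (assumed): every message is appended and flushed on its own.
fn ingest_single(wal: &mut MockWal, records: &[Vec<u8>]) {
    for r in records {
        wal.append(r);
        wal.flush();
    }
}

/// Batched path (assumed): append up to `batch_size` records, then flush once.
/// `batch_size` must be non-zero, since `chunks(0)` panics.
fn ingest_batched(wal: &mut MockWal, records: &[Vec<u8>], batch_size: usize) {
    for batch in records.chunks(batch_size) {
        for r in batch {
            wal.append(r);
        }
        wal.flush();
    }
}

fn main() {
    // 1000 tiny records, loosely mirroring the "tiny payload" benchmark case.
    let records: Vec<Vec<u8>> = (0..1000).map(|_| vec![0u8; 16]).collect();

    let mut single = MockWal::default();
    ingest_single(&mut single, &records);

    let mut batched = MockWal::default();
    ingest_batched(&mut batched, &records, 100);

    println!("single:  {} bytes, {} flushes", single.buf.len(), single.flushes);
    println!("batched: {} bytes, {} flushes", batched.buf.len(), batched.flushes);
}
```

Both paths write the same bytes; only the number of flushes differs, which is the cost a batch benchmark would be expected to amortize.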

Checklist before requesting a review

I have performed a self-review of my code.
If it is a core feature, I have added thorough tests.
Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
If this PR requires a public announcement, mark it with the /release-notes label and add several sentences in this section.

Checklist before merging

Do not forget to reformat commit message to not include the above checklist

github-actions · 2024-10-26T13:08:39Z

5328 tests run: 5100 passed, 6 failed, 222 skipped (full report)

Failures on Postgres 17

test_pg_regress[None]: release-x86-64, release-arm64, debug-x86-64
test_pg_regress[4]: release-x86-64, release-arm64, debug-x86-64

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_pg_regress[release-pg17-None] or test_pg_regress[release-pg17-None] or test_pg_regress[debug-pg17-None] or test_pg_regress[release-pg17-4] or test_pg_regress[release-pg17-4] or test_pg_regress[debug-pg17-4]"

Test coverage report is not available

This comment gets automatically updated with the latest test results. Last update: f4b8795 at 2024-11-03T17:25:29.676Z.

VladLazar

LGTM, maybe Arseny could take a look as well for some more SK context?

VladLazar · 2024-11-01T12:19:45Z

safekeeper/benches/receive_wal.rs

+/// each individual message to amortize costs (e.g. fsync), which is more realistic. Records are
+/// XlLogicalMessage with a tiny payload.
+///
+/// TODO: add benchmarks with larger data volume, and measure throughput.


Yeah, I expected this benchmark to measure throughput. Does it not make sense to do that now?

erikgrinaker · 2024-11-01

Sure, I'll add it.

erikgrinaker · 2024-11-03

Added a separate wal_acceptor_throughput benchmark, with the intent that wal_acceptor measures the cost of ingesting trivial messages (i.e. per-message processing latency), while wal_acceptor_throughput measures bulk ingestion (typically IO-bound).

Smaller messages really kill throughput. Lower than 1 KB per message wasn't viable since the benchmark would take 30 minutes to ingest 1 GB. Haven't looked into why, might be some low-hanging fruit here.

wal_acceptor_throughput/fsync=false/size=1024
    time:  [14.427 s 14.487 s 14.550 s]
    thrpt: [70.378 MiB/s 70.683 MiB/s 70.979 MiB/s]
wal_acceptor_throughput/fsync=false/size=4096
    time:  [4.8211 s 4.8395 s 4.8602 s]
    thrpt: [210.69 MiB/s 211.59 MiB/s 212.40 MiB/s]
wal_acceptor_throughput/fsync=false/size=131072
    time:  [1.9312 s 1.9410 s 1.9518 s]
    thrpt: [524.66 MiB/s 527.55 MiB/s 530.25 MiB/s]
wal_acceptor_throughput/fsync=false/size=1048576
    time:  [1.9040 s 1.9095 s 1.9155 s]
    thrpt: [534.60 MiB/s 536.27 MiB/s 537.81 MiB/s]
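
As a cross-check on the numbers above, the following sketch recomputes throughput and the mean time per message from the reported median times. It assumes each iteration ingests 1 GiB in total (inferred from "ingest 1 GB" and from the reported MiB/s figures) and that the `size` parameter is the full per-message size, ignoring any record header overhead. The function names are illustrative only.

```rust
const MIB: f64 = 1024.0 * 1024.0;

/// Throughput in MiB/s for `total_bytes` ingested in `secs` seconds.
fn throughput_mib_s(total_bytes: u64, secs: f64) -> f64 {
    total_bytes as f64 / MIB / secs
}

/// Mean wall-clock time per message in microseconds, assuming the total
/// volume is split into messages of exactly `msg_size` bytes.
fn per_message_us(total_bytes: u64, msg_size: u64, secs: f64) -> f64 {
    let messages = (total_bytes / msg_size) as f64;
    secs * 1e6 / messages
}

fn main() {
    // Assumption: 1 GiB ingested per benchmark iteration.
    let total: u64 = 1 << 30;

    // (message size in bytes, median time in seconds) from the results above.
    let results = [
        (1024u64, 14.487),
        (4096, 4.8395),
        (131072, 1.9410),
        (1048576, 1.9095),
    ];

    for (size, secs) in results {
        println!(
            "size={:>7} thrpt={:.2} MiB/s per_msg={:.1} us",
            size,
            throughput_mib_s(total, secs),
            per_message_us(total, size, secs)
        );
    }
}
```

Under these assumptions the recomputed throughput agrees with the reported figures to within rounding, and the mean time per message comes out at roughly 14 µs for 1 KiB messages versus roughly 1.9 ms for 1 MiB messages. That would suggest a per-message overhead dominates at small sizes, though the cause has not been investigated.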

safekeeper/benches/receive_wal.rs

safekeeper/Cargo.toml

erikgrinaker requested a review from VladLazar October 26, 2024 12:15

erikgrinaker requested a review from a team as a code owner October 26, 2024 12:15

erikgrinaker force-pushed the erik/safekeeper-wal-benchmarks branch from 62d1702 to dfa57a3 Compare October 28, 2024 09:09

Base automatically changed from erik/wal-generator to main October 30, 2024 11:46

arssher requested a review from a team as a code owner October 30, 2024 11:46

arssher requested a review from knizhnik October 30, 2024 11:46

erikgrinaker force-pushed the erik/safekeeper-wal-benchmarks branch from dfa57a3 to c00fdc9 Compare October 30, 2024 11:55

erikgrinaker removed request for a team and knizhnik October 30, 2024 11:57

erikgrinaker force-pushed the erik/safekeeper-wal-benchmarks branch from c00fdc9 to c67e894 Compare October 30, 2024 12:14

VladLazar approved these changes Nov 1, 2024

VladLazar requested a review from arssher November 1, 2024 12:24

safekeeper: add basic WAL ingestion benchmarks

6eaf9b9

erikgrinaker force-pushed the erik/safekeeper-wal-benchmarks branch from c67e894 to 6eaf9b9 Compare November 3, 2024 14:38

erikgrinaker changed the base branch from main to erik/wal-generator-generic November 3, 2024 14:38

erikgrinaker added 2 commits November 3, 2024 15:42

Move itertools to dev-dependencies

45fe123

Add WalAcceptor throughput benchmark

f4b8795

erikgrinaker force-pushed the erik/safekeeper-wal-benchmarks branch from 6133ff1 to f4b8795 Compare November 3, 2024 16:34
