test: Add perf test to measure the "cost" of chain wakeup #2599

@xemul xemul commented Dec 24, 2024

The test creates a chain of future-promise pairs and then wakes up the last one, causing a cascade of resolutions. There are 2x2 tests: resolving with a value or with an exception, on chains built with co_await or with .then().

Results with a chain depth of 32:

test                   iterations      median         mad         min         max      allocs       tasks        inst      cycles
chain.then_value           992498     1.007us     1.715ns     1.001us     1.008us      32.000      33.000      9623.7         0.0
chain.await_value          669230     1.466us     2.124ns     1.457us     1.512us      33.000      34.000     12804.5         0.0
chain.then_exception       955761     1.041us     0.731ns     1.039us     1.044us      34.000      34.000      9906.2         0.0
chain.await_exception        9980    96.767us    57.374ns    96.399us    96.850us      68.000      35.000    747836.2         0.0

Waking up a co_await-ed chain with an exception is extremely expensive.

Worse, comparing this with the results for a depth of 8

test                   iterations      median         mad         min         max      allocs       tasks        inst      cycles
chain.then_value          3516439   269.729ns     0.459ns   269.243ns   273.017ns       8.000       9.000      2582.4         0.0
chain.await_value         2511108   397.325ns     0.919ns   396.407ns   409.861ns       9.000      10.000      3623.0         0.0
chain.then_exception      2909952   344.783ns     0.559ns   341.926ns   345.653ns      10.000      10.000      3081.4         0.0
chain.await_exception       37663    26.700us   153.423ns    26.422us    27.405us      20.000      11.000    204489.5         0.0

it's clear that exception propagation via co_await is expensive at every co_await, since the cost scales linearly with the chain depth (about 26.7us at depth 8 versus 96.8us at depth 32).
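
One plausible reading of the per-co_await cost, not confirmed anywhere in this PR, is that each co_await on a failed future rethrows the stored exception into the awaiting coroutine, which then captures it again, while a .then() link only hands the stored exception on. The sketch below models that assumption in plain standard C++, with no Seastar code involved. The names `wakeup_stats`, `propagate_by_forwarding` and `propagate_by_rethrow` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical model, not Seastar code: counts how often a failure is
// actually thrown while it travels through `depth` links of a chain.
struct wakeup_stats {
    unsigned links = 0;   // links the failure passed through
    unsigned throws = 0;  // times the exception was thrown and unwound
};

// Forwarding model (assumed to resemble a .then() link seeing a failed
// input): the same std::exception_ptr is handed to the next link.
inline std::exception_ptr propagate_by_forwarding(std::exception_ptr ep,
                                                  unsigned depth,
                                                  wakeup_stats& st) {
    assert(ep != nullptr);
    for (unsigned i = 0; i < depth; ++i) {
        ++st.links;  // pointer hand-off only, nothing is thrown
    }
    return ep;
}

// Rethrow model (assumed to resemble co_await on a failed future): every
// link throws the exception and captures it again for the next link.
inline std::exception_ptr propagate_by_rethrow(std::exception_ptr ep,
                                               unsigned depth,
                                               wakeup_stats& st) {
    assert(ep != nullptr);
    for (unsigned i = 0; i < depth; ++i) {
        ++st.links;
        try {
            ++st.throws;
            std::rethrow_exception(ep);     // full throw and unwind
        } catch (...) {
            ep = std::current_exception();  // re-capture for the next link
        }
    }
    return ep;
}
```

In this model a chain of depth 32 throws zero times when forwarding and 32 times when rethrowing, which would be consistent with a cost that grows with the chain depth if the real implementation behaves this way.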

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
@xemul xemul requested a review from avikivity December 24, 2024 11:52
@avikivity

Please normalize by the depth, or at least add the depth to the test name. It's not then_value that takes 1 usec, it's 32 of them.
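
As a rough sketch of what normalizing by depth would show, the snippet below divides the median times copied from the tables in the PR description by the chain depth. The `chain_result` struct and the `per_link_ns` and `print_per_link` helpers are hypothetical names for this illustration, not part of the test in this PR.

```cpp
#include <cassert>
#include <cstdio>

// One row of the perf output: test name, chain depth, and the median
// wake-up time of the whole chain in nanoseconds.
struct chain_result {
    const char* name;
    unsigned depth;
    double median_ns;
};

// Medians copied from the tables in the PR description.
static const chain_result results[] = {
    {"chain.then_value",      32,  1007.0},
    {"chain.await_value",     32,  1466.0},
    {"chain.then_exception",  32,  1041.0},
    {"chain.await_exception", 32, 96767.0},
    {"chain.then_value",       8,   269.729},
    {"chain.await_value",      8,   397.325},
    {"chain.then_exception",   8,   344.783},
    {"chain.await_exception",  8, 26700.0},
};

// Cost of a single link: whole-chain median divided by the chain depth.
inline double per_link_ns(const chain_result& r) {
    assert(r.depth > 0);
    return r.median_ns / r.depth;
}

// Prints one normalized line per measured row.
inline void print_per_link() {
    for (const auto& r : results) {
        std::printf("%-22s depth=%-3u %10.1f ns/link\n",
                    r.name, r.depth, per_link_ns(r));
    }
}
```

With these inputs, await_exception works out to roughly 3.0us per link at depth 32 and 3.3us per link at depth 8, against roughly 31 to 50ns per link for the other three cases.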
