Add a Java benchmark and a Rust benchmark
beef9999 committed Jul 14, 2023
1 parent d334372 commit e693e25
Showing 1 changed file with 35 additions and 32 deletions.
67 changes: 35 additions & 32 deletions README.md
@@ -21,7 +21,7 @@ as the [photon](https://en.wikipedia.org/wiki/Photon) particle, which exactly is
* How to transform `RocksDB` from multi-threads to coroutines by only 200 lines of code?
[En](https://github.com/facebook/rocksdb/issues/11017) / [中文](https://developer.aliyun.com/article/1093864).

<details><summary>More history...</summary><p>
<details><summary>Click to show more history...</summary><p>

* Version 0.5 is released. Except for various performance improvements, including spinlock, context switch,
and new run queue for coroutine scheduling, we have re-implemented the HTTP module so that there is no `boost` dependency anymore.
@@ -58,11 +58,11 @@ and prepared to wrap them into the framework. It is a real killer in the low lev

Compare Photon with fio when reading a 3.5TB NVMe raw device.

| | IO Engine | IO Type | IO Size | IO Depth | DirectIO | QPS | Throughput | CPU util |
|:------:|:---------:|:---------:|:-------:|:--------:|:--------:|:----:|:----------:|:--------:|
| Photon | io_uring | Rand-read | 4KB | 128 | Yes | 433K | 1.73GB | 100% |
| Photon | libaio | Rand-read | 4KB | 128 | Yes | 346K | 1.38GB | 100% |
| fio | libaio | Rand-read | 4KB | 128 | Yes | 279K | 1.11GB | 100% |
| | IO Engine | IO Type | IO Size | IO Depth | DirectIO | QPS | Throughput | CPU util |
| :----: | :-------: | :-------: | :-----: | :------: | :------: | :---: | :--------: | :------: |
| Photon | io_uring | Rand-read | 4KB | 128 | Yes | 433K | 1.73GB | 100% |
| Photon | libaio | Rand-read | 4KB | 128 | Yes | 346K | 1.38GB | 100% |
| fio | libaio | Rand-read | 4KB | 128 | Yes | 279K | 1.11GB | 100% |

Note that fio only enables 1 job (process).

@@ -74,44 +74,47 @@ Conclusion: Photon is faster than fio under this circumstance.

Compare TCP socket echo server performance, in descending order.

Client Mode: Streaming
##### 2.1.1 Client Mode: Streaming

| | Language | Concurrency Model | Buffer Size | Conn Num | QPS | Bandwidth | CPU util |
|:---------------------------------------------------------------------:|:--------:|:-------------------:|:-----------:|:--------:|:-----:|:---------:|:--------:|
| :-------------------------------------------------------------------: | :------: | :-----------------: | :---------: | :------: | :---: | :-------: | :------: |
| Photon | C++ | Stackful Coroutine | 512 Bytes | 4 | 1604K | 6.12Gb | 99% |
| [cocoyaxi](https://github.com/idealvin/cocoyaxi) | C++ | Stackful Coroutine | 512 Bytes | 4 | 1545K | 5.89Gb | 99% |
| [tokio](https://tokio.rs/) | Rust | Stackless Coroutine | 512 Bytes | 4 | 1384K | 5.28Gb | 98% |
| [acl/lib_fiber](https://github.com/acl-dev/acl/tree/master/lib_fiber) | C++ | Stackful Coroutine | 512 Bytes | 4 | 1240K | 4.73Gb | 94% |
| Go | Golang | Stackful Coroutine | 512 Bytes | 4 | 1083K | 4.13Gb | 100% |
| [libgo](https://github.com/yyzybb537/libgo) | C++ | Stackful Coroutine | 512 Bytes | 4 | 770K | 2.94Gb | 99% |
| [boost::asio](https://think-async.com/Asio/) | C++ | Async Callback | 512 Bytes | 4 | 634K | 2.42Gb | 97% |
| [monoio](https://github.com/bytedance/monoio) | Rust | Stackless Coroutine | 512 Bytes | 4 | 610K | 2.32Gb | 100% |
| [libco](https://github.com/Tencent/libco) | C++ | Stackful Coroutine | 512 Bytes | 4 | 432K | 1.65Gb | 96% |
| [zab](https://github.com/Donald-Rupin/zab) | C++20 | Stackless Coroutine | 512 Bytes | 4 | 412K | 1.57Gb | 99% |
| [asyncio](https://github.com/netcan/asyncio) | C++20 | Stackless Coroutine | 512 Bytes | 4 | 186K | 0.71Gb | 98% |

Client Mode: Ping-pong

| | Language | Concurrency Model | Buffer Size | Conn Num | QPS | Bandwidth | CPU util |
|:---------------------------------------------------------------------:|:--------:|:-------------------:|:-----------:|:--------:|:----:|:---------:|:--------:|
| Photon | C++ | Stackful Coroutine | 512 Bytes | 1000 | 412K | 1.57Gb | 100% |
| [boost::asio](https://think-async.com/Asio/) | C++ | Async Callback | 512 Bytes | 1000 | 393K | 1.49Gb | 100% |
| [evpp](https://github.com/Qihoo360/evpp) | C++ | Async Callback | 512 Bytes | 1000 | 378K | 1.44Gb | 100% |
| [tokio](https://tokio.rs/) | Rust | Stackless Coroutine | 512 Bytes | 1000 | 365K | 1.39Gb | 100% |
| Go | Golang | Stackful Coroutine | 512 Bytes | 1000 | 331K | 1.26Gb | 100% |
| [acl/lib_fiber](https://github.com/acl-dev/acl/tree/master/lib_fiber) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 327K | 1.25Gb | 100% |
| [swoole](https://github.com/swoole/swoole-src) | PHP | Stackful Coroutine | 512 Bytes | 1000 | 325K | 1.24Gb | 99% |
| [zab](https://github.com/Donald-Rupin/zab) | C++20 | Stackless Coroutine | 512 Bytes | 1000 | 317K | 1.21Gb | 100% |
| [cocoyaxi](https://github.com/idealvin/cocoyaxi) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 279K | 1.06Gb | 98% |
| [libco](https://github.com/Tencent/libco) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 260K | 0.99Gb | 96% |
| [libgo](https://github.com/yyzybb537/libgo) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 258K | 0.98Gb | 156% |
| [asyncio](https://github.com/netcan/asyncio) | C++20 | Stackless Coroutine | 512 Bytes | 1000 | 241K | 0.92Gb | 99% |
| TypeScript | nodejs | Async Callback | 512 Bytes | 1000 | 192K | 0.75Gb | 100% |

<details><summary>More details...</summary><p>
##### 2.1.2 Client Mode: Ping-pong

| | Language | Concurrency Model | Buffer Size | Conn Num | QPS | Bandwidth | CPU util |
| :-------------------------------------------------------------------: | :------: | :-----------------: | :---------: | :------: | :---: | :-------: | :------: |
| Photon | C++ | Stackful Coroutine | 512 Bytes | 1000 | 412K | 1.57Gb | 100% |
| [monoio](https://github.com/bytedance/monoio) | Rust | Stackless Coroutine | 512 Bytes | 1000 | 400K | 1.52Gb | 100% |
| [boost::asio](https://think-async.com/Asio/) | C++ | Async Callback | 512 Bytes | 1000 | 393K | 1.49Gb | 100% |
| [evpp](https://github.com/Qihoo360/evpp) | C++ | Async Callback | 512 Bytes | 1000 | 378K | 1.44Gb | 100% |
| [tokio](https://tokio.rs/) | Rust | Stackless Coroutine | 512 Bytes | 1000 | 365K | 1.39Gb | 100% |
| [netty](https://github.com/netty/netty) | Java | Async Callback | 512 Bytes | 1000 | 340K | 1.30Gb | 99% |
| Go | Golang | Stackful Coroutine | 512 Bytes | 1000 | 331K | 1.26Gb | 100% |
| [acl/lib_fiber](https://github.com/acl-dev/acl/tree/master/lib_fiber) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 327K | 1.25Gb | 100% |
| [swoole](https://github.com/swoole/swoole-src) | PHP | Stackful Coroutine | 512 Bytes | 1000 | 325K | 1.24Gb | 99% |
| [zab](https://github.com/Donald-Rupin/zab) | C++20 | Stackless Coroutine | 512 Bytes | 1000 | 317K | 1.21Gb | 100% |
| [cocoyaxi](https://github.com/idealvin/cocoyaxi) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 279K | 1.06Gb | 98% |
| [libco](https://github.com/Tencent/libco) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 260K | 0.99Gb | 96% |
| [libgo](https://github.com/yyzybb537/libgo) | C++ | Stackful Coroutine | 512 Bytes | 1000 | 258K | 0.98Gb | 156% |
| [asyncio](https://github.com/netcan/asyncio) | C++20 | Stackless Coroutine | 512 Bytes | 1000 | 241K | 0.92Gb | 99% |
| TypeScript | nodejs | Async Callback | 512 Bytes | 1000 | 192K | 0.75Gb | 100% |

<details><summary>Click to show more details...</summary><p>

- The Streaming client is to measure echo server performance when handling high throughput. A similar scenario in the
real world is the multiplexing technology used by RPC and HTTP 2.0. We will set up 4 client processes,
and each of them will create only one connection. Send coroutine and recv coroutine are running their loops separately.
and each of them will create only one connection. Send coroutine and recv coroutine are running infinite loops separately.
- The Ping-pong client is to measure echo server performance when handling a large number of connections.
We will set up 10 client processes, and each of them will create 100 connections. On each connection, the client has to send first and then receive.
- Server and client are all cloud VMs, 64Core 128GB, Intel Platinum CPU 2.70GHz. Kernel version is 6.0.7. The network bandwidth (unilateral) is 32Gb.
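
The difference between the two client modes described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the benchmark code used for the tables: it uses Python threads and a local `socket.socketpair()` instead of coroutines and a real network, it runs a fixed number of rounds instead of infinite loops, and the helper names (`echo_server`, `recv_exact`, `ping_pong`, `streaming`, `run`) are hypothetical. Only the 512-byte buffer size is taken from the tables.

```python
import socket
import threading

BUF_SIZE = 512  # buffer size used in the benchmark tables


def echo_server(conn):
    """Echo every received chunk back until the peer closes the connection."""
    while True:
        data = conn.recv(BUF_SIZE)
        if not data:
            break
        conn.sendall(data)
    conn.close()


def recv_exact(conn, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def ping_pong(conn, rounds):
    """Ping-pong mode: send one buffer, wait for its echo, then repeat."""
    payload = b"x" * BUF_SIZE
    done = 0
    for _ in range(rounds):
        conn.sendall(payload)
        recv_exact(conn, BUF_SIZE)
        done += 1
    return done


def streaming(conn, rounds):
    """Streaming mode: the sender and the receiver run independent loops."""
    payload = b"x" * BUF_SIZE

    def sender():
        for _ in range(rounds):
            conn.sendall(payload)

    t = threading.Thread(target=sender)
    t.start()
    received = 0
    total = rounds * BUF_SIZE
    while received < total:
        chunk = conn.recv(65536)
        if not chunk:
            raise ConnectionError("peer closed early")
        received += len(chunk)
    t.join()
    return received // BUF_SIZE


def run(mode, rounds=1000):
    """Run one client mode against an in-process echo server."""
    client, server = socket.socketpair()
    t = threading.Thread(target=echo_server, args=(server,), daemon=True)
    t.start()
    try:
        return mode(client, rounds)
    finally:
        client.close()
        t.join(timeout=5)


if __name__ == "__main__":
    print("ping-pong echoes:", run(ping_pong))
    print("streaming echoes:", run(streaming))
```

In the ping-pong loop each connection has at most one request in flight, so throughput depends on round-trip latency and on how many connections the server can juggle. In the streaming loop the sender never waits for an echo, so a single connection can keep the server busy.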
@@ -126,10 +129,10 @@ Conclusion: Photon socket has the best per-core QPS.

Compare Photon and Nginx when serving static files, using Apache Bench (ab) as the client.

| | File Size | QPS | CPU util |
|:------:|:---------:|:----:|:--------:|
| Photon | 4KB | 114K | 100% |
| Nginx | 4KB | 97K | 100% |
| | File Size | QPS | CPU util |
| :----: | :-------: | :---: | :------: |
| Photon | 4KB | 114K | 100% |
| Nginx | 4KB | 97K | 100% |

Note that Nginx only enables 1 worker (process).
