This project aims to replicate the great results that @ebarlas achieved in his Project Loom Comparison with Microhttp, but with the more widespread stack of Spring Boot, Project Reactor, Netty, Jetty, and Tomcat.
It compares different methods for achieving scalable concurrency with minimal Spring Boot apps:
- Platform OS threads used in a Servlet app written with Spring Web MVC.
- Virtual threads used in a Servlet app written with Spring Web MVC.
- Asynchronous programming used in a reactive app written with Spring Webflux.
The model is very similar to the one used in Project Loom Comparison.
A user sends requests to a frontend server, which makes three HTTP requests in succession to a backend server. Each of the latter two backend calls requires context from the previous call. Each backend call introduces 300 ms of latency, so the target latency from user to frontend service is 900 ms. The server latency is therefore overwhelmingly due to wait time and threading overhead.
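To make this model concrete, here is a minimal sketch of what such a frontend handler could look like in the blocking (Spring Web MVC) variant; the backend endpoint path, the context query parameter, and the use of RestTemplate are illustrative assumptions, not the exact code in this repository.

```java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class FrontendController {

    private final RestTemplate backend = new RestTemplate();

    @Value("${backend.url}")
    private String backendUrl;

    @GetMapping("/capabilities")
    public String capabilities() {
        // Three sequential backend calls; each call needs the result of the
        // previous one, so they cannot be issued in parallel. With ~300 ms of
        // latency per call, the handling thread spends roughly 900 ms waiting.
        String first = backend.getForObject(backendUrl + "/step", String.class);
        String second = backend.getForObject(backendUrl + "/step?context={ctx}", String.class, first);
        return backend.getForObject(backendUrl + "/step?context={ctx}", String.class, second);
    }
}
```

Whether that waiting occupies a scarce platform thread, a cheap virtual thread, or no thread at all is exactly where the three approaches below differ.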
As Cay Horstmann very nicely shows in a recent blog post, virtual threads cannot effectively be used with Tomcat at the moment. Broadly speaking, the problem is that with Tomcat there is blocking I/O inside synchronized blocks, which pins the virtual threads to their carrier threads. In the sample application in this repository, this limits the number of concurrent requests to the size of the fork-join pool.
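In a deliberately simplified form (this is not Tomcat's actual code), the problematic pattern looks roughly like this:

```java
import java.io.IOException;
import java.io.InputStream;

class PinningExample {

    private final Object lock = new Object();

    byte[] read(InputStream in) throws IOException {
        synchronized (lock) {
            // Blocking I/O while holding a monitor: the virtual thread cannot
            // unmount and stays pinned to its carrier thread until the read
            // completes, so that carrier from the fork-join pool is unavailable
            // to all other virtual threads.
            return in.readAllBytes();
        }
    }
}
```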
Virtual threads can, however, be used with Jetty if a suitable instance of ThreadPool is provided, as Mark Reinhold showed at Devoxx UK 2019. The same ThreadPool is used in the sample application in this repository and seems to work reasonably well with early access builds of JDK 19.
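Below is a minimal sketch of how such a ThreadPool can be wired into embedded Jetty with Spring Boot; the configuration class name and the loom profile guard are illustrative, and it requires compiling and running on JDK 19 with --enable-preview.

```java
import org.eclipse.jetty.util.thread.ThreadPool;
import org.springframework.boot.web.embedded.jetty.JettyServletWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

@Configuration
@Profile("loom")
class VirtualThreadJettyConfiguration {

    @Bean
    JettyServletWebServerFactory jettyServletWebServerFactory() {
        JettyServletWebServerFactory factory = new JettyServletWebServerFactory();
        factory.setThreadPool(new ThreadPool() {
            @Override
            public void execute(Runnable task) {
                // Run every Jetty task on a fresh virtual thread instead of a
                // pooled platform thread.
                Thread.startVirtualThread(task);
            }

            @Override
            public void join() {
            }

            @Override
            public int getThreads() {
                return 1;
            }

            @Override
            public int getIdleThreads() {
                return 1;
            }

            @Override
            public boolean isLowOnThreads() {
                return false;
            }
        });
        return factory;
    }
}
```

Declaring such a ServletWebServerFactory bean makes Spring Boot use it in place of the auto-configured one, so Jetty's request handling runs on virtual threads whenever the loom profile is active.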
Experiments were conducted on EC2 instances:
- 1 instance of type c5.4xlarge (16 vCPUs / 32 GB RAM)
- 2 instances of type c5.2xlarge (8 vCPUs / 16 GB RAM)
- Amazon Linux 2 with Linux Kernel 5.10, AMI ami-09439f09c55136ecf
- Eclipse Temurin JDK 17.0.3
- OpenJDK 19 Early Access Build 22 from https://jdk.java.net/19/
The applications are written with
- Spring Boot 2.6.8 (including its dependency management, except for Jetty)
- Jetty 10.0.9 (instead of Jetty 9, which Spring Boot 2.x defaults to only because of its Java 8 baseline)
- Gradle 7.4.1
With JDK 17 as the command-line default, the fat jars of the applications can be built with ./gradlew bootJar, optionally adding --no-daemon on remote hosts.
The Gradle wrapper downloads the appropriate Gradle version and auto-detects JDK 19 for its toolchain, provided that JDK has been downloaded and placed in one of the usual paths.
ApacheBench is an HTTP benchmarking tool that is packaged with the Apache HTTP server. It sends the indicated number of requests continuously using a specified number of persistent connections.
The following concurrency levels and workloads were tested from one of the c5.2xlarge instances:
-c 1000 -n 120000
-c 5000 -n 600000
-c 10000 -n 1200000
-c 15000 -n 1800000
-c 20000 -n 2400000
with commands such as
ab -kl -c 20000 -n 2400000 http://172.31.4.14:8080/capabilities
To allow ApacheBench to send a high number of concurrent requests, the ephemeral port range of the host was extended by setting net.ipv4.ip_local_port_range = 9000 65535 in /etc/sysctl.conf.
The frontend web server receives connections and requests from ApacheBench. For each request received, it makes three calls in succession to the backend web server. Each backend call has a configured latency of 300 ms, so the target latency at the frontend web server is 900 ms.
It's implemented in three different flavors and has been run on the larger c5.4xlarge instance:
- Servlet app with Spring Web MVC and platform threads
- Servlet app with Spring Web MVC and Loom's virtual threads
- Reactive app with Spring Webflux and Netty
When running with platform threads, the servlet frontend was compiled and run with the early access JDK 19 build and given an ample number of platform threads:
java -Xmx12g -jar servlet-frontend/build/libs/servlet-frontend.jar --server.jetty.threads.max=25000 --backend.url=http://<backend ip>:8080
When running with virtual threads, the servlet frontend was compiled and run with the early access JDK 19 build, with preview features enabled and the loom profile active:
java --enable-preview -Xmx12g -jar servlet-frontend/build/libs/servlet-frontend.jar --spring.profiles.active=loom --backend.url=http://<backend ip>:8080
The reactive frontend was compiled and run with JDK 17.0.3.
java -Xmx12g -jar reactive-frontend/build/libs/reactive-frontend.jar --backend.url=http://<backend ip>:8080
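For contrast with the blocking sketch above, the reactive frontend chains its backend calls asynchronously. The following WebClient-based sketch is again illustrative; the endpoint path and context parameter are assumptions.

```java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@RestController
public class ReactiveFrontendController {

    private final WebClient backend;

    public ReactiveFrontendController(WebClient.Builder builder,
                                      @Value("${backend.url}") String backendUrl) {
        this.backend = builder.baseUrl(backendUrl).build();
    }

    @GetMapping("/capabilities")
    public Mono<String> capabilities() {
        // Each call starts only after the previous response has arrived, yet no
        // thread is blocked during the ~900 ms of accumulated backend latency.
        return call("")
                .flatMap(this::call)
                .flatMap(this::call);
    }

    private Mono<String> call(String context) {
        return backend.get()
                .uri("/step?context={ctx}", context)
                .retrieve()
                .bodyToMono(String.class);
    }
}
```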
The backend web server receives connections and requests from the frontend web server. It responds to each request after a configured delay of 300 ms.
It's implemented as a Spring Boot application with Webflux and Netty. It was compiled and run with JDK 17.0.3 on one of the two c5.2xlarge instances.
java -Xmx12g -jar reactive-backend/build/libs/reactive-backend.jar
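A delayed response like the backend's can be produced in Webflux without blocking any thread. The following sketch is illustrative; the endpoint path and the delay property name are assumptions.

```java
import java.time.Duration;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
public class DelayedBackendController {

    @Value("${backend.delay-millis:300}")
    private long delayMillis;

    @GetMapping("/step")
    public Mono<String> step() {
        // Emit the response after the configured delay; no thread waits for
        // the 300 ms, Reactor's scheduler emits the element when it is due.
        return Mono.just("ok").delayElement(Duration.ofMillis(delayMillis));
    }
}
```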
The following response metrics are taken from the ApacheBench output of representative runs. Times are in milliseconds unless stated otherwise.
Note that the figures cannot be directly compared with those generated with Microhttp in Project Loom Comparison, as that comparison also used a c5.2xlarge instance for the frontend. Unfortunately, the servlet apps were unstable at that machine size under high load.
Concurrency Level | 20,000 | 15,000 | 10,000 | 5,000 | 1,000 |
---|---|---|---|---|---|
Requests | 2,400,000 | 1,800,000 | 1,200,000 | 600,000 | 120,000 |
Time taken for tests [s] | 157.658 | 115.331 | 111.523 | 109.883 | 109.429 |
Requests per second | 15222.83 | 15607.22 | 10760.11 | 5460.33 | 1096.6 |
Time per request | 1313.817 | 961.094 | 929.358 | 915.695 | 911.907 |
Min. total connection time | 902 | 902 | 902 | 902 | 902 |
Mean total connection time | 1277 | 949 | 917 | 905 | 904 |
Standard deviation | 1347 | 64.8 | 36.3 | 12.8 | 5.6 |
50% percentile | 1133 | 925 | 905 | 903 | 903 |
66% percentile | 1207 | 944 | 909 | 903 | 903 |
75% percentile | 1258 | 961 | 915 | 903 | 904 |
80% percentile | 1292 | 974 | 920 | 904 | 904 |
90% percentile | 1387 | 1017 | 940 | 906 | 904 |
95% percentile | 1475 | 1064 | 970 | 914 | 905 |
98% percentile | 1603 | 1142 | 1039 | 938 | 917 |
99% percentile | 1792 | 1253 | 1119 | 976 | 944 |
100% percentile | 17918 | 1750 | 1433 | 1089 | 1133 |
Concurrency Level | 20,000 | 15,000 | 10,000 | 5,000 | 1,000 |
---|---|---|---|---|---|
Requests | 2,400,000 | 1,800,000 | 1,200,000 | 600,000 | 120,000 |
Time taken for tests [s] | 118.725 | 113.172 | 110.606 | 109.662 | 109.355 |
Requests per second | 20214.73 | 15904.95 | 10849.33 | 5471.34 | 1097.34 |
Time per request | 989.378 | 943.102 | 921.716 | 913.854 | 911.293 |
Min. total connection time | 902 | 902 | 902 | 902 | 902 |
Mean total connection time | 977 | 931 | 910 | 905 | 904 |
Standard deviation | 56 | 37.3 | 20.7 | 9.1 | 2.6 |
50% percentile | 967 | 919 | 903 | 902 | 903 |
66% percentile | 987 | 929 | 904 | 903 | 903 |
75% percentile | 1001 | 940 | 906 | 903 | 904 |
80% percentile | 1011 | 947 | 909 | 904 | 904 |
90% percentile | 1039 | 969 | 923 | 906 | 905 |
95% percentile | 1068 | 989 | 937 | 914 | 906 |
98% percentile | 1117 | 1018 | 958 | 925 | 914 |
99% percentile | 1204 | 1056 | 999 | 942 | 918 |
100% percentile | 1552 | 1415 | 1195 | 1047 | 929 |
Concurrency Level | 20,000 | 15,000 | 10,000 | 5,000 | 1,000 |
---|---|---|---|---|---|
Requests | 2,400,000 | 1,800,000 | 1,200,000 | 600,000 | 120,000 |
Time taken for tests [s] | 136.547 | 115.811 | 110.788 | 109.706 | 109.612 |
Requests per second | 17576.43 | 15542.62 | 10831.49 | 5469.16 | 1094.77 |
Time per request | 1137.888 | 965.088 | 923.234 | 914.217 | 913.437 |
Min. total connection time | 902 | 902 | 902 | 902 | 902 |
Mean total connection time | 1125 | 953 | 912 | 905 | 904 |
Standard deviation | 128.8 | 63.2 | 28.6 | 12.6 | 3 |
50% percentile | 1106 | 934 | 904 | 903 | 903 |
66% percentile | 1151 | 953 | 906 | 903 | 903 |
75% percentile | 1183 | 968 | 909 | 903 | 903 |
80% percentile | 1205 | 978 | 912 | 903 | 904 |
90% percentile | 1274 | 1012 | 927 | 907 | 904 |
95% percentile | 1347 | 1054 | 941 | 914 | 906 |
98% percentile | 1461 | 1148 | 964 | 926 | 912 |
99% percentile | 1565 | 1256 | 1043 | 963 | 916 |
100% percentile | 2675 | 1551 | 1342 | 1111 | 1163 |
TBA