Merge pull request #18 from ohbarye/benchmark

Add benchmark script and diagnosis
ohbarye · Apr 14, 2024 · d121c10
2 parents 8d20ef4 + 1262335
commit d121c10
Showing 11 changed files with 664 additions and 26 deletions.
diff --git a/Gemfile b/Gemfile
@@ -8,3 +8,4 @@ gem "rake"
 gem "rspec"
 gem "standard"
 gem "parallel"
+gem "benchmark-ips"
diff --git a/README.md b/README.md
@@ -4,7 +4,7 @@
 [![Build Status](https://github.com/ohbarye/pbt/actions/workflows/main.yml/badge.svg)](https://github.com/ohbarye/pbt/actions/workflows/main.yml)
 [![RubyDoc](https://img.shields.io/badge/%F0%9F%93%9ARubyDoc-documentation-informational.svg)](https://www.rubydoc.info/gems/pbt)
 
-An experimental property-based testing tool for Ruby that allows you to run test cases in parallel.
+A property-based testing tool for Ruby with experimental features that allow you to run test cases in parallel.
 
 PBT stands for Property-Based Testing.
 
@@ -148,8 +148,8 @@ Pbt.configure do |config|
   # Whether to print verbose output. Default is `false`.
   config.verbose = false
 
-  # The concurrency method to use. `:ractor`, `:thread`, `:process` and `:none` are supported. Default is `:ractor`.
-  config.worker = :ractor
+  # The concurrency method to use. `:ractor`, `:thread`, `:process` and `:none` are supported. Default is `:none`.
+  config.worker = :none
 
   # The number of runs to perform. Default is `100`.
   config.num_runs = 100
@@ -172,16 +172,20 @@ Pbt.assert(num_runs: 100, seed: 42) do
 end
 ```
 
-## Concurrent methods
+## Concurrency methods
 
 One of the key features of `Pbt` is its ability to rapidly execute test cases in parallel or concurrently, using a large number of values (by default, `100`) generated by `Arbitrary`.
 
 For concurrent processing, you can specify any of the three workers—`:ractor`, `:process`, or `:thread`—using the `worker` option. Alternatively, choose `:none` for serial execution.
 
 `Pbt` supports 3 concurrency methods and 1 sequential one. You can choose one of them by setting the `worker` option.
 
+The performance of each method depends on the test subject. If the test subject is CPU-bound, `:ractor` may be the best choice. Otherwise, `:none` is likely the best choice in most cases. See the [benchmarks](benchmark/README.md).
+
 ### Ractor
 
+The `:ractor` worker is useful for CPU-bound test cases. However, it is experimental and has the limitations described below. If you run into issues caused by those limitations, consider the `:process` worker, whose benchmark results are the closest to `:ractor` for CPU-bound work.
+
 ```ruby
 Pbt.assert(worker: :ractor) do
   Pbt.property(Pbt.integer) do |n|
@@ -222,6 +226,8 @@ end
 
 ### Process
 
+If your test cases are CPU-bound and `:ractor` is not available, `:process` is a good choice.
+
 ```ruby
 Pbt.assert(worker: :process) do
   Pbt.property(Pbt.integer) do |n|
@@ -232,6 +238,8 @@ end
 
 ### Thread
 
+You may not need to run test cases in multiple threads. In the [benchmarks](benchmark/README.md), `:thread` was never faster than `:none`.
+
 ```ruby
 Pbt.assert(worker: :thread) do
   Pbt.property(Pbt.integer) do |n|
@@ -242,6 +250,8 @@ end
 
 ### None
 
+In most cases, `:none` is the best choice. It runs tests sequentially (without parallelism), but most test cases finish within a reasonable time.
+
 ```ruby
 Pbt.assert(worker: :none) do
   Pbt.property(Pbt.integer) do |n|
@@ -266,10 +276,10 @@ Once this project finishes the following, we will release v1.0.0.
   - [x] Add better examples
   - [x] Arbitrary usage
   - [x] Configuration
+- [x] Benchmark
 - [ ] Rich report like verbose mode
 - [ ] Allow to use expectations and matchers provided by test framework in Ractor if possible.
   - It'd be so hard to pass assertions like `expect`, `assert` to a Ractor.
-- [ ] Benchmark
 - [ ] More parallelism or faster execution if possible
 
 ## Development

diff --git a/Rakefile b/Rakefile
@@ -8,3 +8,51 @@ RSpec::Core::RakeTask.new(:spec)
 require "standard/rake"
 
 task default: %i[spec standard]
+
+namespace :benchmark do
+  task all: ["success:simple", "success:cpu_bound", "success:io_bound", "failure:simple"]
+
+  namespace :success do
+    task :simple do
+      puts "### Benchmark success:simple"
+      puts
+      puts "This runs a script that does not do any IO or CPU bound work."
+      puts
+      ENV["RUBYOPT"] = "-W:no-experimental"
+      sh "ruby", "benchmark/success_simple.rb"
+      puts
+    end
+
+    task :cpu_bound do
+      puts "### Benchmark success:cpu_bound"
+      puts
+      puts "This runs a script that does CPU bound work."
+      puts
+      ENV["RUBYOPT"] = "-W:no-experimental"
+      sh "ruby", "benchmark/success_cpu_bound.rb"
+      puts
+    end
+
+    task :io_bound do
+      puts "### Benchmark success:io_bound"
+      puts
+      puts "This runs a script that does IO bound work."
+      puts
+      ENV["RUBYOPT"] = "-W:no-experimental"
+      sh "ruby", "benchmark/success_io_bound.rb"
+      puts
+    end
+  end
+
+  namespace :failure do
+    task :simple do
+      puts "### Benchmark failure:simple"
+      puts
+      puts "This runs a script that fails and shrink happens."
+      puts
+      ENV["RUBYOPT"] = "-W:no-experimental"
+      sh "ruby", "benchmark/failure_simple.rb"
+      puts
+    end
+  end
+end
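
Review note on the `ENV["RUBYOPT"] = "-W:no-experimental"` lines in the Rakefile hunk above: the flag presumably exists to silence the "Ractor is experimental" warning in the child `ruby` process. The sketch below is not part of this commit. It shows the same effect with the standard library only, passing the variable to the child process instead of mutating the parent's `ENV`. `run_with_rubyopt` and `SNIPPET` are hypothetical names introduced for illustration.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper (not part of this PR): run a Ruby snippet in a child
# process with RUBYOPT set only for that child. A nil value unsets RUBYOPT.
# Returns the child's stderr and whether it exited successfully.
def run_with_rubyopt(rubyopt, code)
  env = {"RUBYOPT" => rubyopt}
  _stdout, stderr, status = Open3.capture3(env, RbConfig.ruby, "-e", code)
  [stderr, status.success?]
end

# Creating a Ractor is what triggers the experimental-feature warning.
SNIPPET = "Ractor.new { 1 + 1 }"

quiet_stderr, quiet_ok = run_with_rubyopt("-W:no-experimental", SNIPPET)
default_stderr, _default_ok = run_with_rubyopt(nil, SNIPPET)

puts "with flag:    ok=#{quiet_ok}, mentions 'experimental'=#{quiet_stderr.include?("experimental")}"
puts "without flag: mentions 'experimental'=#{default_stderr.include?("experimental")}"
```

Rake's `sh` should accept the same leading environment hash as `Kernel#system`, so a call such as `sh({"RUBYOPT" => "-W:no-experimental"}, "ruby", "benchmark/success_simple.rb")` would likely scope the setting to that one command. Treat that as an assumption to verify against the Rake version in use.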
diff --git a/benchmark/README.md b/benchmark/README.md
@@ -0,0 +1,97 @@
+# Benchmark of concurrency methods
+
+This benchmark compares the performance of different concurrency methods that `Pbt` provides.
+
+## Usage
+
+```shell
+bundle exec rake benchmark:all
+```
+
+## Diagnosis
+
+The benchmark results suggest that concurrency is unnecessary for most test cases, which are neither I/O-bound nor CPU-bound: the overhead of multithreading or multiprocessing outweighs the benefits. For the majority of users, the best strategy is `worker: :none`.
+
+However, when the test subject is CPU-bound, `worker: :ractor` emerges as the champion, because threads in different Ractors can run in parallel. It outperforms multithreading, which because of the GVL performs only on par with serial execution, and it does so without the overhead associated with multiprocessing.
+
+Interestingly, neither multi-process (`worker: :process`) nor multi-thread (`worker: :thread`) came out on top in any benchmark.
+
+## Benchmarks
+
+The following benchmarks are the results of running the benchmark suite.
+
+- macOS 13.3.1, Apple M1 Pro 10 cores (8 performance and 2 efficiency)
+- ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
+- pbt commit hash 6582b27105ef5e92197b3f52f9c7cf78d731e1e2
+
+---
+
+### Benchmark success:simple
+
+This runs a script that does not do any IO or CPU bound work.
+
+ruby benchmark/success_simple.rb
+ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
+Warming up --------------------------------------
+ractor    20.000 i/100ms
+process     3.000 i/100ms
+thread   126.000 i/100ms
+none   668.000 i/100ms
+Calculating -------------------------------------
+ractor    173.918 (±11.5%) i/s -    880.000 in   5.129007s
+process     28.861 (± 3.5%) i/s -    147.000 in   5.100393s
+thread      1.130k (± 5.5%) i/s -      5.670k in   5.031552s
+none      6.534k (± 2.3%) i/s -     32.732k in   5.011885s
+
+### Benchmark success:cpu_bound
+
+This runs a script that does CPU bound work.
+
+ruby benchmark/success_cpu_bound.rb
+Call tarai function with(9, 4, 0)
+
+ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
+Warming up --------------------------------------
+ractor     3.000 i/100ms
+process     2.000 i/100ms
+thread     1.000 i/100ms
+none     1.000 i/100ms
+Calculating -------------------------------------
+ractor     32.788 (± 6.1%) i/s -    165.000 in   5.057492s
+process     22.098 (± 4.5%) i/s -    112.000 in   5.080410s
+thread      7.439 (± 0.0%) i/s -     38.000 in   5.108195s
+none      7.494 (± 0.0%) i/s -     38.000 in   5.070547s
+
+### Benchmark success:io_bound
+
+This runs a script that does IO bound work.
+
+ruby benchmark/success_io_bound.rb
+ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
+Warming up --------------------------------------
+ractor    11.000 i/100ms
+process     3.000 i/100ms
+thread    17.000 i/100ms
+none    22.000 i/100ms
+Calculating -------------------------------------
+ractor     82.488 (±14.5%) i/s -    407.000 in   5.054559s
+process     35.403 (± 5.6%) i/s -    177.000 in   5.013818s
+thread    143.022 (± 7.7%) i/s -    714.000 in   5.021129s
+none    223.252 (± 9.0%) i/s -      1.122k in   5.071176s
+
+### Benchmark failure:simple
+
+This runs a script that fails and shrink happens.
+
+ruby benchmark/failure_simple.rb
+ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
+Warming up --------------------------------------
+ractor     6.000 i/100ms
+process     1.000 i/100ms
+thread     9.000 i/100ms
+none   815.000 i/100ms
+Calculating -------------------------------------
+ractor     62.770 (±15.9%) i/s -    306.000 in   5.009858s
+process      1.783 (± 0.0%) i/s -      9.000 in   5.049606s
+thread     85.218 (± 9.4%) i/s -    423.000 in   5.007178s
+none      5.387k (± 3.3%) i/s -     27.710k in   5.149867s
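
Review note: to make the Diagnosis easier to check against the numbers above, here is a minimal sketch that is not part of this commit. It expresses each worker's throughput relative to `:none`. The i/s figures are copied from the benchmark output in this diff, with `k` expanded to thousands. `RESULTS`, `relative_to_none`, and `fastest_worker` are hypothetical names introduced for illustration.

```ruby
# Iterations per second, copied from the benchmark output above
# (e.g. "6.534k" is written as 6534.0).
RESULTS = {
  "success:simple" => {ractor: 173.918, process: 28.861, thread: 1130.0, none: 6534.0},
  "success:cpu_bound" => {ractor: 32.788, process: 22.098, thread: 7.439, none: 7.494},
  "success:io_bound" => {ractor: 82.488, process: 35.403, thread: 143.022, none: 223.252},
  "failure:simple" => {ractor: 62.770, process: 1.783, thread: 85.218, none: 5387.0}
}.freeze

# Throughput of every worker divided by the serial (:none) baseline.
def relative_to_none(ips_by_worker)
  baseline = ips_by_worker.fetch(:none)
  ips_by_worker.transform_values { |ips| ips / baseline }
end

# The worker with the highest iterations per second.
def fastest_worker(ips_by_worker)
  ips_by_worker.max_by { |_worker, ips| ips }.first
end

RESULTS.each do |name, ips_by_worker|
  ratios = relative_to_none(ips_by_worker).map { |worker, ratio| format("%s=%.3fx", worker, ratio) }
  puts "#{name}: fastest=#{fastest_worker(ips_by_worker)} (#{ratios.join(", ")})"
end
```

On these figures, `:ractor` reaches roughly 4.4 times the serial throughput in the CPU-bound case, while `:none` is fastest in the other three. The numbers come from a single run on one machine with reported deviations of up to ±15.9%, so small differences such as `:thread` versus `:none` in the CPU-bound case should not be over-interpreted.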