Running | Macrobenchmarks

Vlad Ureche edited this page Jun 13, 2014 · 7 revisions

As part of implementing the miniboxing project, we prepared a set of macrobenchmarks based on a mockup of the Scala collections library. The results are encouraging, with speedups ranging from 1.2x up to 10x for very large collections, where the garbage collector needs to kick in.

We picked this benchmarking target for two reasons. First, the mockup uses many of the important patterns found in the Scala collections library, so being able to transform it correctly and speed it up shows that the miniboxing plugin interacts correctly with language features such as higher-kinded types, closures, implicits and type classes. Second, we have received constant complaints from the Scala community about collection performance.

We will not describe the patterns used in collections on this page, nor how miniboxing handles them. Instead, we refer the reader to the "Improving the Performance of Scala Collections with Miniboxing" paper, which explains all the details.

Rather than pasting the benchmark and its results into the wiki, we direct the reader to the scala-miniboxing.org website, where the results for the miniboxing transformation are stored.

Reproducing the macrobenchmarks

Reproducing the macrobenchmark numbers in the DRT virtual machine requires a series of tweaks to the testing infrastructure, to account for the very noisy environment:

In the file tests/lib-bench/test/miniboxing/benchmarks/launch/SimpleLinkedList.scala, in class TestConfiguration, please make the following changes:

  • replace the testSettings value by:
  def testSettings =
    Seq[KeyValue](
      exec.benchRuns -> 10,
      exec.minWarmupRuns -> 10,
      exec.maxWarmupRuns -> 20,
      exec.independentSamples -> 10,     // more samples => less overall noise
      exec.outliers.suspectPercent -> 0, // don't eliminate anything, as we expect 
                                         // important noise in the virtual machine
      exec.jvmflags -> "-Xmx2g -Xms2g -Xss4m" // make sure we don't count GC cycles
    )
  • replace the sizes by:
  val sizes = {
    Gen.range("size")(from = 100000, upto = 500000, hop = 100000)
  }
  • and change the executor value to:
  @transient lazy val executor = SeparateJvmsExecutor(
    Executor.Warmer.Default(),
    Aggregator.average,
    new Executor.Measurer.Default
  )
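The comment "more samples => less overall noise" in the settings above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of the benchmarking harness: the `measure` function, its true cost of 50.0 and its noise level of 5.0 are hypothetical values chosen for illustration. It only shows that averaging more noisy measurements narrows the spread of the reported value.

```scala
import scala.util.Random

object NoiseAveraging {
  // Hypothetical noisy measurement: a true cost of 50.0 plus Gaussian noise.
  // Both constants are assumptions made for this illustration.
  def measure(rng: Random): Double = 50.0 + rng.nextGaussian() * 5.0

  // Average of `n` noisy measurements, as an averaging aggregator would report.
  def averaged(rng: Random, n: Int): Double =
    Seq.fill(n)(measure(rng)).sum / n

  // Standard deviation of a sequence of reported values.
  def stdDev(xs: Seq[Double]): Double = {
    val mean = xs.sum / xs.size
    math.sqrt(xs.map(x => (x - mean) * (x - mean)).sum / xs.size)
  }

  // Spread of the reported value over `trials` repetitions of the experiment.
  def spread(samples: Int, trials: Int = 1000, seed: Long = 42L): Double = {
    val rng = new Random(seed)
    stdDev(Seq.fill(trials)(averaged(rng, samples)))
  }

  def main(args: Array[String]): Unit = {
    println(f"spread with  1 sample : ${spread(1)}%.3f")
    println(f"spread with 10 samples: ${spread(10)}%.3f")
  }
}
```

In a noisy environment such as a virtual machine the noise is unlikely to be Gaussian, so the sketch only conveys the general trend.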

When running benchmarks, please make sure to exit the Eclipse IDE and all other programs. Normally, we run such benchmarks on servers that are only used for benchmarking and where all unnecessary services have been stopped. Still, to run in the virtual machine:

$ cd ~/Workspace/miniboxing-plugin/
$ sbt miniboxing-lib-bench/test
[info] Loading project definition from /home/acc/Workspace/miniboxing-plugin/project
[info] Set current project to miniboxing (in build file:/home/acc/Workspace/miniboxing-plugin/)
Starting miniboxed benchmark. Lay back, it might take a few minutes to stabilize...
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 100000)    :   10.61267
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 200000)    :   22.01880
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 300000)    :   32.56241
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 400000)    :   44.57040
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 500000)    :   56.37448
Starting generic benchmark. Lay back, it might take a few minutes to stabilize...
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 100000)    :   13.76631
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 200000)    :   27.24536
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 300000)    :   41.51236
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 400000)    :   54.68337
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 500000)    :   68.09219
Starting specialized benchmark. Lay back, it might take a few minutes to stabilize...
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 100000)    :   14.32197
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 200000)    :   28.99728
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 300000)    :   43.44664
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 400000)    :   57.78390
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 500000)    :   72.84730
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0

We ran the exact same benchmark on the host laptop and obtained:

$ sbt miniboxing-lib-bench/test
[info] Loading project definition from /mnt/data1/Work/Workspace/dev/miniboxing-plugin/project
[info] Set current project to miniboxing (in build file:/mnt/data1/Work/Workspace/dev/miniboxing-plugin/)
Starting miniboxed benchmark. Lay back, it might take a few minutes to stabilize...
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 100000)    :   13.08903
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 200000)    :   22.85883
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 300000)    :   32.35949
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 400000)    :   43.04268
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 500000)    :   53.54579
Starting generic benchmark. Lay back, it might take a few minutes to stabilize...
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 100000)    :   19.69286
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 200000)    :   34.28938
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 300000)    :   52.53034
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 400000)    :   68.12464
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 500000)    :   84.00230
Starting specialized benchmark. Lay back, it might take a few minutes to stabilize...
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 100000)    :   19.71082
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 200000)    :   35.25171
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 300000)    :   52.22855
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 400000)    :   68.55251
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 500000)    :   85.10621
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0

On the server we use for benchmarking (4-core i7-4770 CPU @ 3.40GHz, 32GB RAM, JVM version 1.8.0_05) we obtained the following results:

$ sbt miniboxing-lib-bench/test
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=384m; support was removed in 8.0
[info] Loading project definition from /localhome/agenet/workspace/dev/miniboxing-plugin/project
[info] Set current project to miniboxing (in build file:/localhome/agenet/workspace/dev/miniboxing-plugin/)
Starting miniboxed benchmark. Lay back, it might take a few minutes to stabilize...
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 100000)    :   15.73296
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 200000)    :   32.84600
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 300000)    :   50.41975
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 400000)    :   67.17152
  MiniboxedBenchmark$.Least Squares Method with List[Double]  : Parameters(size -> 500000)    :   84.56246
Starting generic benchmark. Lay back, it might take a few minutes to stabilize...
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 100000)    :   27.99638
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 200000)    :   55.52710
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 300000)    :   83.30517
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 400000)    :  111.65736
  GenericBenchmark$.Least Squares Method with List[Double]    : Parameters(size -> 500000)    :  137.92301
Starting specialized benchmark. Lay back, it might take a few minutes to stabilize...
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 100000)    :   27.37729
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 200000)    :   54.72231
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 300000)    :   80.81345
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 400000)    :  108.65019
  SpecializedBenchmark$.Least Squares Method with List[Double]: Parameters(size -> 500000)    :  135.81562
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0
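To compare the three runs above, it can help to turn the raw running times into speedup ratios. The following is a minimal sketch that copies the reported numbers verbatim and divides the generic and specialized times by the miniboxed times. The names `Run`, `Speedups` and `speedups` are introduced here for illustration and are not part of the miniboxing code base.

```scala
// Running times copied from the three runs above, in the units printed by the
// harness. Each sequence lists sizes 100000, 200000, 300000, 400000, 500000.
final case class Run(
  name: String,
  miniboxed: Seq[Double],
  generic: Seq[Double],
  specialized: Seq[Double]
)

object Speedups {
  val runs = Seq(
    Run("virtual machine",
      Seq(10.61267, 22.01880, 32.56241, 44.57040, 56.37448),
      Seq(13.76631, 27.24536, 41.51236, 54.68337, 68.09219),
      Seq(14.32197, 28.99728, 43.44664, 57.78390, 72.84730)),
    Run("host laptop",
      Seq(13.08903, 22.85883, 32.35949, 43.04268, 53.54579),
      Seq(19.69286, 34.28938, 52.53034, 68.12464, 84.00230),
      Seq(19.71082, 35.25171, 52.22855, 68.55251, 85.10621)),
    Run("benchmarking server",
      Seq(15.73296, 32.84600, 50.41975, 67.17152, 84.56246),
      Seq(27.99638, 55.52710, 83.30517, 111.65736, 137.92301),
      Seq(27.37729, 54.72231, 80.81345, 108.65019, 135.81562))
  )

  // Speedup of miniboxed code over a baseline: baseline time / miniboxed time.
  def speedups(baseline: Seq[Double], miniboxed: Seq[Double]): Seq[Double] =
    baseline.zip(miniboxed).map { case (b, m) => b / m }

  def main(args: Array[String]): Unit =
    for (r <- runs) {
      val vsGeneric = speedups(r.generic, r.miniboxed)
      val vsSpecialized = speedups(r.specialized, r.miniboxed)
      println(f"${r.name}%-20s vs generic: ${vsGeneric.min}%.2fx to ${vsGeneric.max}%.2fx, " +
        f"vs specialized: ${vsSpecialized.min}%.2fx to ${vsSpecialized.max}%.2fx")
    }
}
```

A ratio above 1 means the miniboxed version ran faster than the baseline for that collection size.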

Conclusion

We have presented the macrobenchmarks we ran on the miniboxing plugin, which show speedups of 10 to 60% for different setups. We also obtain better speedups than the specialization transformation, which loses optimality when dealing with complex language features.

In the early stages of the project, we also ran microbenchmarks for miniboxing that allowed us to converge on the code patterns that the miniboxing plugin currently generates.

Next Steps

You can continue with the following resources: