sparkioref

Benchmark the I/O performance of Apache Spark (Scala/Python). Currently supported formats: CSV, JSON, Parquet, and FITS.
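To give a rough idea of what an I/O benchmark of this kind measures, here is a minimal Python sketch of a timing harness. It is not the code used by this repository: `time_load` and `fake_load` are hypothetical names introduced for illustration, and the stand-in loader would be replaced by a real Spark read followed by an action (for example `spark.read.format("parquet").load(path).count()`), since Spark reads are lazy until an action runs.

```python
import time
from statistics import mean
from typing import Callable, List


def time_load(load_fn: Callable[[], int], repeats: int = 3) -> List[float]:
    """Call load_fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # With Spark, load_fn could be something like:
        #   lambda: spark.read.format("parquet").load(path).count()
        load_fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_load() -> int:
    # Hypothetical stand-in for a Spark read plus an action such as count().
    return sum(range(100_000))


if __name__ == "__main__":
    runs = time_load(fake_load, repeats=3)
    print(f"mean load time: {mean(runs):.4f} s over {len(runs)} runs")
```

Repeating the load several times helps separate the first (cold) read from later reads that may benefit from caching.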

Run the benchmark

Edit run_benchmark.sh to match your data and cluster configuration, then launch it with:

./run_benchmark.sh

Configuration: