
Organize benchmarks into suites #202

Closed
abrown opened this issue Oct 5, 2022 · 6 comments · Fixed by #204

Comments

@abrown
Collaborator

abrown commented Oct 5, 2022

I am proposing in this issue that we add a way for sightglass-cli to benchmark a specific set of files together.

Why

  • I have noticed that certain users only run the larger benchmarks (spidermonkey, bz2, pulldown-cmark); in the past I have wanted to only run the shootout-* micro-benchmarks or "all the benchmarks that use SIMD."
  • New benchmarks, like the wasi-nn one submitted in Adding a benchmark for wasi-nn #201, would require special setup and would not run successfully with sightglass-cli benchmark ... benchmarks/*/benchmark.wasm (as the README currently advises the user). These special benchmarks (e.g., due to environment setup, compile-time or runtime flags, etc.) could be separated into their own suite.
  • As described in the RFCs on benchmarking, candidate benchmarks should be "real, widely used programs, or at least extracted kernels of such programs." Though these are good criteria, they can be subjective (e.g., though some shootout benchmarks are "extracted kernels," some may consider them too small to be representative); organizing the benchmarks into suites would recognize and document these differences.
  • sightglass next: if no wasm files are passed to the benchmark subcommand, default to all benchmark programs #71 requests a default set of benchmarks to run automatically if no benchmark files are specified — one could imagine a default suite for that.

How

The core idea would be to create a set of *.suite files. Each *.suite file would contain a newline-delimited list of benchmark paths relative to the Sightglass project directory and would accept # line comments. E.g., foo.suite could look like:

# This suite contains benchmarks that call the `foo` function.
benchmarks/foo/benchmark.wasm
benchmarks/bar/benchmark.wasm
benchmarks/baz/specially-crafted-benchmark.wasm
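
A minimal sketch of how such a file might be parsed (the function name and exact rules here are hypothetical; the eventual implementation may differ):

```rust
use std::path::PathBuf;

// Hypothetical parser for the proposed `*.suite` format: one benchmark
// path per line; blank lines and `#` line comments are skipped.
fn parse_suite(contents: &str) -> Vec<PathBuf> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(PathBuf::from)
        .collect()
}

fn main() {
    let suite = "# This suite contains benchmarks that call the `foo` function.\n\
                 benchmarks/foo/benchmark.wasm\n\
                 benchmarks/bar/benchmark.wasm\n";
    let paths = parse_suite(suite);
    assert_eq!(paths.len(), 2);
    assert_eq!(paths[0], PathBuf::from("benchmarks/foo/benchmark.wasm"));
}
```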

sightglass-cli would accept a new flag — --suite — with the path to the *.suite file to use. It would be an error to specify both a suite and a benchmark file; other than that, running benchmarks by path (the current way) would remain unchanged. Users could create their own *.suite files and manage these outside the repository but the Sightglass repository would contain a few *.suite files. There are many ways to slice this, but I could see the following initial set:

  • default.suite
  • shootout.suite
  • simd.suite

If no benchmark paths or suites are specified, sightglass-cli would run the benchmarks contained in the default.suite.


I would appreciate feedback on this proposal before implementing it. Any thoughts one way or another are appreciated!


@jameysharp
Contributor

I really like this idea overall!

I'd be tempted to say that .suite files should be accepted as positional parameters just like .wasm files, rather than requiring a --suite option. I don't know how hard that would be to implement, but it seems more convenient to me. It's only a nice-to-have idea, anyway; I'd certainly take the --suite option.

Especially for supporting the use case of out-of-tree suites, I think it's important to specify that paths in a .suite file are relative to the location of that .suite file, not to the current working directory or the Sightglass source tree.

Unrelated to this, it's been bothering me that Sightglass doesn't produce any output until the end. So I wrap a shell loop around it and run one benchmark per cargo run invocation, so at least I can see how each individual benchmark did ASAP. I'd have trouble using this suites feature as a result. There's probably stuff we can do to give useful progress reports.

@abrown
Collaborator Author

abrown commented Oct 6, 2022

Thanks for the feedback!

I'd be tempted to say that .suite files should be accepted as positional parameters just like .wasm files

Do you mean all positional arguments to the benchmark command are put in the same list and we detect if they are a *.suite or *.wasm file by file extension or first few bytes?

I think it's important to specify that paths in a .suite file are relative to the location of that .suite file

Yup, good point.

Unrelated to this, it's been bothering me that Sightglass doesn't produce any output until the end. [...] There's probably stuff we can do to give useful progress reports.

Yeah, that's a separate problem that I probably introduced when I added the measurement serialization code. IIRC, I had to wait until all measurements were collected to serialize them with serde; e.g., in JSON, each measurement is part of an array. But that is completely fixable, I think, with some refactoring of that code. (A bit unrelated to suites, though... I opened a separate issue to track that: #203)
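
For reference, one common way around the buffer-everything constraint is to emit one JSON object per measurement as it arrives (newline-delimited JSON) instead of a single array at the end. A sketch, with hypothetical field names rather than Sightglass's actual schema:

```rust
use std::io::Write;

// Write each measurement as one JSON object per line so results
// appear as soon as they are collected, instead of serializing one
// big array after all benchmarks finish. Field names are illustrative.
fn write_measurement<W: Write>(
    out: &mut W,
    benchmark: &str,
    phase: &str,
    cycles: u64,
) -> std::io::Result<()> {
    writeln!(
        out,
        "{{\"benchmark\":\"{}\",\"phase\":\"{}\",\"cycles\":{}}}",
        benchmark, phase, cycles
    )
}

fn main() -> std::io::Result<()> {
    let mut buf = Vec::new();
    write_measurement(&mut buf, "pulldown-cmark", "execution", 12345)?;
    let line = String::from_utf8(buf).unwrap();
    assert_eq!(
        line,
        "{\"benchmark\":\"pulldown-cmark\",\"phase\":\"execution\",\"cycles\":12345}\n"
    );
    Ok(())
}
```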

@jameysharp
Contributor

I'd be tempted to say that .suite files should be accepted as positional parameters just like .wasm files

Do you mean all positional arguments to the benchmark command are put in the same list and we detect if they are a *.suite or *.wasm file by file extension or first few bytes?

Yeah, something along those lines. If you specify a few suites and some individual benchmarks, it'd run all of them... but de-duplicate, perhaps?
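
Detecting by extension, one of the two options floated above, could be as simple as (sketch only; sniffing the first bytes, e.g. the `\0asm` magic for Wasm, would be the alternative):

```rust
use std::path::Path;

// Classify a positional argument by file extension: `.suite` files
// are suites, everything else is treated as a benchmark module.
fn is_suite_file(path: &Path) -> bool {
    path.extension().map_or(false, |ext| ext == "suite")
}

fn main() {
    assert!(is_suite_file(Path::new("benchmarks/shootout.suite")));
    assert!(!is_suite_file(Path::new("benchmarks/foo/benchmark.wasm")));
}
```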

Like I said, the --suite option is fine too, and it's easier to specify how that would work. I was just throwing out ideas.

Yeah, that's a separate problem that I probably introduced when I added the measurement serialization code. IIRC, I had to wait until all measurements were collected to serialize them with serde; e.g., in JSON, each measurement is a part of an array.

Ah, I wondered why it was that way, but that makes sense!

(A bit unrelated to suites, though... I opened a separate issue to track that: #203)

Absolutely unrelated to suites, this just reminded me of it. Thanks for opening that issue.

abrown added a commit to abrown/sightglass that referenced this issue Oct 10, 2022
This change closes bytecodealliance#202 by adding a way to group benchmarks into
`*.suite` files. This incorporates @jameysharp's comments, e.g., to
calculate the benchmark path relative to the suite path and to detect
suite files based on their `.suite` extension.

Though more suites could be added in the future, the result of this is
that suite files can be built (see `benchmarks/shootout.suite`, e.g.)
and then run like:

```console
$ sightglass-cli benchmark benchmarks/shootout.suite -e ...
```
@fitzgen
Member

fitzgen commented Oct 11, 2022

An alternative to .suite files could be just having different directories of benchmarks, i.e., benchmarks/micro/* and benchmarks/macro/*.

@abrown
Collaborator Author

abrown commented Oct 11, 2022

An alternative to .suite files could be just having different directories of benchmarks, i.e., benchmarks/micro/* and benchmarks/macro/*.

That's limited in that each benchmark would live in exactly one directory. (I guess we could create a bunch of symlinks, but that doesn't work everywhere.) One thing I like about the suite idea is the ability to create cross-cutting suites like simd.suite or micro.suite.

abrown added a commit that referenced this issue Oct 12, 2022
* Organize benchmarks into suites

This change closes #202 by adding a way to group benchmarks into
`*.suite` files. This incorporates @jameysharp's comments, e.g., to
calculate the benchmark path relative to the suite path and to detect
suite files based on their `.suite` extension.

Though more suites could be added in the future, the result of this is
that suite files can be built (see `benchmarks/shootout.suite`, e.g.)
and then run like:

```console
$ sightglass-cli benchmark benchmarks/shootout.suite -e ...
```

* review: avoid ignoring buffered read errors

* review: return `Vec<PathBuf>` from `BenchmarkOrSuite::paths`

* review: ensure `execute_in_multiple_processes` assigns a single benchmark to each subprocess

* review: document Windows canonicalization concerns

* review: suites are files