Unless I missed something, divan currently only supports running over a set of configurations that is known at compile time. While this results in a very nice API for simple use cases, having only this option can be a problem in more complex benchmarks for several reasons:
1. It will lead to severe code bloat when benchmarking over a wide range of problem sizes (problem sizes can currently only be specified via `#[bench(consts = ...)]`, which generates a separate copy of the function for each problem size), unless one remembers to dispatch to a utility function that takes a run-time problem size (see the sketch right after this list).
2. It can lead to undesirable bias when one wants to benchmark how the code behaves in the face of a problem size that is only known at run time (which is the common case), unless one remembers to use the above dynamic dispatch trick or another optimization barrier to hide the problem size from the compiler; and in that case, suffering the aforementioned code bloat is pointless.
3. It precludes running over a set of benchmark configurations that cannot be known until runtime.
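For concreteness, here is a minimal sketch of the dynamic-dispatch workaround from the first two points, written as a standalone divan bench file. It assumes divan's `consts = ...` option accepts an array literal as in its documentation; `sum_up_to` is just a placeholder workload:

```rust
use std::hint::black_box;

// Shared, non-generic implementation: only one copy gets compiled, and it
// only ever sees a run-time problem size.
fn sum_up_to(len: usize) -> u64 {
    (0..len as u64).sum()
}

#[divan::bench(consts = [1_000, 10_000, 100_000])]
fn sum<const LEN: usize>() -> u64 {
    // Hide the compile-time constant behind an optimization barrier before
    // forwarding it, so the shared helper cannot be specialized on it.
    sum_up_to(black_box(LEN))
}

fn main() {
    divan::main();
}
```

This keeps the code bloat down to a trivial forwarding shim per problem size, but only if one remembers to write the benchmark this way.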
To give you an example of the third point, I have a set of criterion benchmarks that exercise parallel code over a range of thread pinning configurations. Given N, the host's CPU thread count, I want to test with N threads, N/2 threads, N/4 threads..., all the way down to 1 thread; and for each of these thread counts, I want to test two thread pinning configurations: a "dense" one where threads are packed into as few NUMA/NUCA domains as possible (which minimizes synchronization costs), and a "sparse" one where threads are spread over as many NUMA/NUCA domains as possible (which maximizes shared resource usage).
This sort of benchmarking cannot be done unless, at some point during benchmark initialization, I get a chance to probe the host using something like hwloc, generate the benchmark configurations, and register them.
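As a rough illustration, the configuration set described above could be generated at startup along these lines; `available_parallelism()` stands in for a real hwloc topology probe, and the `Pinning` enum and naming scheme are made up for the sketch:

```rust
use std::thread::available_parallelism;

/// Thread pinning policy (illustrative only; real pinning needs hwloc).
#[derive(Clone, Copy, Debug)]
enum Pinning {
    /// Pack threads into as few NUMA/NUCA domains as possible.
    Dense,
    /// Spread threads over as many NUMA/NUCA domains as possible.
    Sparse,
}

/// Build the (name, thread count, pinning) configurations at run time.
fn benchmark_configs() -> Vec<(String, usize, Pinning)> {
    // Stand-in for a real hwloc topology probe.
    let max_threads = available_parallelism().map(|n| n.get()).unwrap_or(1);

    let mut configs = Vec::new();
    let mut threads = max_threads;
    loop {
        for pinning in [Pinning::Dense, Pinning::Sparse] {
            configs.push((format!("{threads} threads, {pinning:?}"), threads, pinning));
        }
        if threads == 1 {
            break;
        }
        threads /= 2;
    }
    configs
}

fn main() {
    for (name, ..) in benchmark_configs() {
        println!("{name}");
    }
}
```

The point is that this list only exists at run time, so there is currently no way to turn each entry into a separately reported divan benchmark.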
Given the significant heterogeneity of GPU hardware, I suspect that people who try to use divan for benchmarking GPU code will face similar issues in much less exotic use cases.
While a fully general run-time benchmark registration mechanism would require some kind of `Divan::add_benchmark()` API, I suspect the most common use cases could be covered by just having some sort of `Bencher::with_input_configurations(configs: impl IntoIterator<Item = (String, InputGenerator)>)` API that lets you run a single benchmark function over N different input configurations and report each configuration separately, under a different name, in the final divan output.
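To make the intended semantics concrete, here is a purely hypothetical, standalone mock (not divan code, and not a proposed implementation) of what such an API could do: run the same benchmark body once per named input configuration and report each run under its configuration's name:

```rust
use std::time::Instant;

struct MockBencher;

impl MockBencher {
    /// Run `body` once per (name, input generator) pair and report each
    /// configuration separately. Divan would instead take many timed samples
    /// per configuration; this mock only illustrates the shape of the API.
    fn with_input_configurations<I, G, T>(&self, configs: I, mut body: impl FnMut(&T))
    where
        I: IntoIterator<Item = (String, G)>,
        G: FnMut() -> T,
    {
        for (name, mut generate_input) in configs {
            let input = generate_input();
            let start = Instant::now();
            body(&input);
            println!("{name}: {:?}", start.elapsed());
        }
    }
}

fn main() {
    // Hypothetical run-time configuration set: 1_000, 2_000, 4_000, 8_000 elements.
    let configs = (0..4u32).map(|i| {
        let len = 1_000usize << i;
        (
            format!("{len} elements"),
            move || (0..len as u64).collect::<Vec<u64>>(),
        )
    });

    MockBencher.with_input_configurations(configs, |input: &Vec<u64>| {
        std::hint::black_box(input.iter().sum::<u64>());
    });
}
```

Real integration would of course reuse divan's existing timing, sample-count, and reporting machinery rather than this naive single-shot timing.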