Entropy-based stopping criterion #151

Merged
merged 21 commits into from
Jan 12, 2024
63 changes: 63 additions & 0 deletions .clangd
@@ -0,0 +1,63 @@
# https://clangd.llvm.org/config

# Apply a config conditionally to all C files
If:
  PathMatch: .*\.(c|h)$

---

# Apply a config conditionally to all C++ files
If:
  PathMatch: .*\.(c|h)pp

---

# Apply a config conditionally to all CUDA files
If:
  PathMatch: .*\.cuh?
CompileFlags:
  Add:
    # Allow variadic CUDA functions
    - "-Xclang=-fcuda-allow-variadic-functions"

---

# Tweak the clangd parse settings for all files
CompileFlags:
  Compiler: clang++
  CompilationDatabase: .
  Add:
    - -x
    - cuda
    # report all errors
    - "-ferror-limit=0"
    - "-ftemplate-backtrace-limit=0"
    - "-stdlib=libc++"
    - "-std=c++17"
  Remove:
    # strip CUDA fatbin args
    - "-Xfatbin*"
    - "-Xcompiler*"
    - "-Xcudafe*"
    - "-rdc=*"
    - "-gpu=*"
    - "--diag_suppress*"
    # strip CUDA arch flags
    - "-gencode*"
    - "--generate-code*"
    # strip gcc's -fcoroutines
    - -fcoroutines
    # strip CUDA flags unknown to clang
    - "-ccbin*"
    - "--compiler-options*"
    - "--expt-extended-lambda"
    - "--expt-relaxed-constexpr"
    - "-forward-unknown-to-host-compiler"
    - "-Werror=cross-execution-space-call"
Diagnostics:
  Suppress:
    - "variadic_device_fn"
    - "attributes_not_allowed"
    # The NVHPC version of _NVCXX_EXPAND_PACK macro triggers this clang error.
    # Temporarily suppressing it, but should probably fix
    - "template_param_shadow"
8 changes: 8 additions & 0 deletions docs/cli_help.md
@@ -89,8 +89,15 @@
  * Applies to the most recent `--benchmark`, or all benchmarks if specified
    before any `--benchmark` arguments.

* `--stopping-criterion <criterion>`
  * After `--min-samples` is satisfied, use `<criterion>` to detect whether
    enough samples have been collected.
  * Only applies to Cold measurements.
  * Default is `stdrel` (`--stopping-criterion stdrel`).

* `--min-time <seconds>`
  * Accumulate at least `<seconds>` of execution time per measurement.
  * Only applies to the `stdrel` stopping criterion.
  * Default is 0.5 seconds.
  * If both GPU and CPU times are gathered, this applies to GPU time only.
  * Applies to the most recent `--benchmark`, or all benchmarks if specified
@@ -100,6 +107,7 @@
  * Gather samples until the error in the measurement drops below `<value>`.
  * Noise is specified as the percent relative standard deviation.
  * Default is 0.5% (`--max-noise 0.5`)
  * Only applies to the `stdrel` stopping criterion.
  * Only applies to Cold measurements.
  * If both GPU and CPU times are gathered, this applies to GPU noise only.
  * Applies to the most recent `--benchmark`, or all benchmarks if specified
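
The `--min-time` and `--max-noise` entries describe the `stdrel` criterion only in prose. As a rough illustration, the standalone C++ sketch below computes noise as the relative standard deviation and reports completion once both limits are met. `stdrel_sketch` and its members are hypothetical names. The use of the sample (n - 1) standard deviation and the rule that both conditions must hold are assumptions. NVBench's own `stdrel_criterion` is not shown in this part of the diff and may behave differently.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a stdrel-style check; not NVBench's implementation.
struct stdrel_sketch
{
  double min_time{0.5};    // seconds; mirrors the documented --min-time default
  double max_noise{0.005}; // fraction; 0.5% as documented for --max-noise

  std::vector<double> samples; // per-sample durations in seconds
  double total_time{0.0};      // accumulated execution time in seconds

  void add_measurement(double seconds)
  {
    samples.push_back(seconds);
    total_time += seconds;
  }

  // noise = stdev / mean_time, using the sample standard deviation (n - 1).
  double noise() const
  {
    const std::size_t n = samples.size();
    const double mean   = n > 0 ? total_time / static_cast<double>(n) : 0.0;
    if (n < 2 || mean <= 0.0)
    {
      return std::numeric_limits<double>::infinity(); // not enough data yet
    }
    double sum_sq = 0.0;
    for (const double s : samples)
    {
      sum_sq += (s - mean) * (s - mean);
    }
    return std::sqrt(sum_sq / static_cast<double>(n - 1)) / mean;
  }

  // Assumption: both the time budget and the noise target must be satisfied.
  bool is_finished() const
  {
    return total_time >= min_time && noise() <= max_noise;
  }
};
```

Note that the command line takes noise as a percentage (`--max-noise 0.5`), while the sketch stores it as a fraction (`0.005`), matching the default that `benchmark_base` previously held in `m_max_noise`.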
1 change: 1 addition & 0 deletions examples/CMakeLists.txt
@@ -7,6 +7,7 @@ set(example_srcs
  stream.cu
  throughput.cu
  auto_throughput.cu
  custom_criterion.cu
)

# Metatarget for all examples:
89 changes: 89 additions & 0 deletions examples/custom_criterion.cu
@@ -0,0 +1,89 @@
/*
* Copyright 2023 NVIDIA Corporation
*
* Licensed under the Apache License, Version 2.0 with the LLVM exception
* (the "License"); you may not use this file except in compliance with
* the License.
*
* You may obtain a copy of the License at
*
* http://llvm.org/foundation/relicensing/LICENSE.txt
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <nvbench/nvbench.cuh>

// Grab some testing kernels from NVBench:
#include <nvbench/test_kernels.cuh>

// Thrust vectors simplify memory management:
#include <thrust/device_vector.h>

// Inherit from the stopping_criterion class:
class fixed_criterion final : public nvbench::stopping_criterion
{
  nvbench::int64_t m_max_samples{};
  nvbench::int64_t m_num_samples{};

public:
  // Setup the criterion in the `initialize()` method:
  virtual void initialize(const nvbench::criterion_params &params) override
  {
    m_num_samples = 0;
    m_max_samples = params.has_value("max-samples") ? params.get_int64("max-samples") : 42;
  }

  // Process new measurements in the `add_measurement()` method:
  virtual void add_measurement(nvbench::float64_t /* measurement */) override
  {
    m_num_samples++;
  }

  // Check if the stopping criterion is met in the `is_finished()` method:
  virtual bool is_finished() override
  {
    return m_num_samples >= m_max_samples;
  }

  // Describe criterion parameters in the `get_params()` method:
  virtual const params_description &get_params() const override
  {
    static const params_description desc{
      {"max-samples", nvbench::named_values::type::int64}
    };
    return desc;
  }
};

// Register the criterion with NVBench:
static bool registered = //
  nvbench::criterion_registry::register_criterion("fixed",
                                                  std::make_unique<fixed_criterion>());

void throughput_bench(nvbench::state &state)
{
  // Allocate input data:
  const std::size_t num_values = 64 * 1024 * 1024 / sizeof(nvbench::int32_t);
  thrust::device_vector<nvbench::int32_t> input(num_values);
  thrust::device_vector<nvbench::int32_t> output(num_values);

  // Provide throughput information:
  state.add_element_count(num_values, "NumElements");
  state.add_global_memory_reads<nvbench::int32_t>(num_values, "DataSize");
  state.add_global_memory_writes<nvbench::int32_t>(num_values);

  state.set_stopping_criterion("fixed");

  state.exec(nvbench::exec_tag::no_batch, [&input, &output, num_values](nvbench::launch &launch) {
    nvbench::copy_kernel<<<256, 256, 0, launch.get_stream()>>>(
      thrust::raw_pointer_cast(input.data()),
      thrust::raw_pointer_cast(output.data()),
      num_values);
  });
}
NVBENCH_BENCH(throughput_bench);
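
The example above implements the criterion's callbacks, but the loop that calls them lives inside NVBench and is not part of this file. The standalone sketch below shows one plausible shape for that loop, based only on the `cli_help.md` statement that the criterion is consulted after `--min-samples` is satisfied. `criterion_sketch`, `fixed_sketch`, and `run_until_finished` are hypothetical names. The interface is deliberately reduced (no `criterion_params`, no `get_params()`), and timeouts are ignored.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical, simplified interface; the real nvbench::stopping_criterion
// also receives criterion_params in initialize() and exposes get_params().
class criterion_sketch
{
public:
  virtual ~criterion_sketch() = default;
  virtual void initialize() = 0;
  virtual void add_measurement(double seconds) = 0;
  virtual bool is_finished() = 0;
};

// Counterpart of fixed_criterion: stop after a fixed number of samples.
class fixed_sketch final : public criterion_sketch
{
  std::int64_t m_max_samples;
  std::int64_t m_num_samples{};

public:
  explicit fixed_sketch(std::int64_t max_samples = 42)
      : m_max_samples{max_samples}
  {}

  void initialize() override { m_num_samples = 0; }
  void add_measurement(double /* seconds */) override { ++m_num_samples; }
  bool is_finished() override { return m_num_samples >= m_max_samples; }
};

// Assumed driver: keep sampling until the minimum sample count is reached
// and the criterion reports completion. Returns the number of samples taken.
inline std::int64_t run_until_finished(criterion_sketch &criterion,
                                       std::int64_t min_samples,
                                       const std::function<double()> &take_sample)
{
  criterion.initialize();
  std::int64_t collected = 0;
  while (collected < min_samples || !criterion.is_finished())
  {
    criterion.add_measurement(take_sample());
    ++collected;
  }
  return collected;
}
```

With the default of 42 samples and a minimum of 10, this loop takes 42 samples. With a criterion limit of 3 and a minimum of 10, it still takes 10, because the minimum is checked first.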
4 changes: 4 additions & 0 deletions nvbench/CMakeLists.txt
@@ -20,10 +20,14 @@ set(srcs
  string_axis.cxx
  type_axis.cxx
  type_strings.cxx
  criterion_registry.cxx
  stopping_criterion.cxx

  detail/measure_cold.cu
  detail/measure_hot.cu
  detail/state_generator.cxx
  detail/stdrel_criterion.cxx
  detail/entropy_criterion.cxx

  internal/nvml.cxx
)
33 changes: 25 additions & 8 deletions nvbench/benchmark_base.cuh
@@ -22,6 +22,7 @@
 #include <nvbench/device_info.cuh>
 #include <nvbench/device_manager.cuh>
 #include <nvbench/state.cuh>
+#include <nvbench/stopping_criterion.cuh>

 #include <functional> // reference_wrapper, ref
 #include <memory>
@@ -181,22 +182,24 @@ struct benchmark_base
   }
   /// @}

-  /// Accumulate at least this many seconds of timing data per measurement. @{
-  [[nodiscard]] nvbench::float64_t get_min_time() const { return m_min_time; }
+  /// Accumulate at least this many seconds of timing data per measurement.
+  /// Only applies to `stdrel` stopping criterion. @{
+  [[nodiscard]] nvbench::float64_t get_min_time() const { return m_criterion_params.get_float64("min-time"); }
   benchmark_base &set_min_time(nvbench::float64_t min_time)
   {
-    m_min_time = min_time;
+    m_criterion_params.set_float64("min-time", min_time);
     return *this;
   }
   /// @}

   /// Specify the maximum amount of noise if a measurement supports noise.
   /// Noise is the relative standard deviation:
-  /// `noise = stdev / mean_time`. @{
-  [[nodiscard]] nvbench::float64_t get_max_noise() const { return m_max_noise; }
+  /// `noise = stdev / mean_time`.
+  /// Only applies to `stdrel` stopping criterion. @{
+  [[nodiscard]] nvbench::float64_t get_max_noise() const { return m_criterion_params.get_float64("max-noise"); }
   benchmark_base &set_max_noise(nvbench::float64_t max_noise)
   {
-    m_max_noise = max_noise;
+    m_criterion_params.set_float64("max-noise", max_noise);
     return *this;
   }
   /// @}
@@ -230,6 +233,19 @@ struct benchmark_base
   }
   /// @}

+  [[nodiscard]] nvbench::criterion_params& get_criterion_params() { return m_criterion_params; }
+  [[nodiscard]] const nvbench::criterion_params& get_criterion_params() const { return m_criterion_params; }
+
+  /// Control the stopping criterion for the measurement loop.
+  /// @{
+  [[nodiscard]] const std::string& get_stopping_criterion() const { return m_stopping_criterion; }
+  benchmark_base &set_stopping_criterion(std::string criterion)
+  {
+    m_stopping_criterion = std::move(criterion);
+    return *this;
+  }
+  /// @}
+
 protected:
   friend struct nvbench::runner_base;

@@ -247,12 +263,13 @@ protected:
   bool m_disable_blocking_kernel{false};

   nvbench::int64_t m_min_samples{10};
-  nvbench::float64_t m_min_time{0.5};
-  nvbench::float64_t m_max_noise{0.005}; // 0.5% relative standard deviation

   nvbench::float64_t m_skip_time{-1.};
   nvbench::float64_t m_timeout{15.};

+  nvbench::criterion_params m_criterion_params;
+  std::string m_stopping_criterion{"stdrel"};
+
 private:
   // route these through virtuals so the templated subclass can inject type info
   virtual std::unique_ptr<benchmark_base> do_clone() const = 0;
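
The hunks above replace the dedicated `m_min_time` and `m_max_noise` members with lookups in a name-keyed parameter store. The `criterion_params` class itself is not part of this chunk, so the sketch below is only a guess at the general idea: a map from parameter name to a typed value, with the benchmark's getters and setters delegating to it. `params_sketch` and `benchmark_sketch` are hypothetical. The method names `has_value`, `get_int64`, `get_float64`, and `set_float64` are borrowed from calls visible in the diff, while `set_int64`, the `std::variant` storage, the throwing behavior on missing keys, and where the defaults are installed are all assumptions.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <variant>

// Hypothetical name-keyed store of typed values; not nvbench::criterion_params.
class params_sketch
{
  std::unordered_map<std::string, std::variant<std::int64_t, double>> m_values;

public:
  bool has_value(const std::string &name) const { return m_values.count(name) != 0; }

  void set_int64(const std::string &name, std::int64_t value) { m_values[name] = value; }
  void set_float64(const std::string &name, double value) { m_values[name] = value; }

  // at() throws std::out_of_range for unknown names; std::get throws
  // std::bad_variant_access when the stored type does not match.
  std::int64_t get_int64(const std::string &name) const
  {
    return std::get<std::int64_t>(m_values.at(name));
  }
  double get_float64(const std::string &name) const
  {
    return std::get<double>(m_values.at(name));
  }
};

// Hypothetical benchmark wrapper whose accessors delegate to the store.
class benchmark_sketch
{
  params_sketch m_criterion_params;

public:
  benchmark_sketch()
  {
    // Assumed defaults, taken from the removed m_min_time / m_max_noise members.
    m_criterion_params.set_float64("min-time", 0.5);
    m_criterion_params.set_float64("max-noise", 0.005);
  }

  double get_min_time() const { return m_criterion_params.get_float64("min-time"); }
  benchmark_sketch &set_min_time(double min_time)
  {
    m_criterion_params.set_float64("min-time", min_time);
    return *this;
  }

  double get_max_noise() const { return m_criterion_params.get_float64("max-noise"); }
  benchmark_sketch &set_max_noise(double max_noise)
  {
    m_criterion_params.set_float64("max-noise", max_noise);
    return *this;
  }

  const params_sketch &params() const { return m_criterion_params; }
};
```

One consequence of this design is that a custom criterion can read parameters the benchmark class has never heard of (such as `max-samples` in `custom_criterion.cu`) without any new members being added to `benchmark_base`.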
7 changes: 4 additions & 3 deletions nvbench/benchmark_base.cxx
@@ -34,13 +34,14 @@ std::unique_ptr<benchmark_base> benchmark_base::clone() const
   result->m_axes = m_axes;
   result->m_devices = m_devices;

-  result->m_min_samples = m_min_samples;
-  result->m_min_time    = m_min_time;
-  result->m_max_noise   = m_max_noise;
+  result->m_min_samples      = m_min_samples;
+  result->m_criterion_params = m_criterion_params;

   result->m_skip_time = m_skip_time;
   result->m_timeout = m_timeout;

+  result->m_stopping_criterion = m_stopping_criterion;
+
   return result;
 }
49 changes: 49 additions & 0 deletions nvbench/criterion_registry.cuh
@@ -0,0 +1,49 @@
/*
* Copyright 2023 NVIDIA Corporation
*
* Licensed under the Apache License, Version 2.0 with the LLVM exception
* (the "License"); you may not use this file except in compliance with
* the License.
*
* You may obtain a copy of the License at
*
* http://llvm.org/foundation/relicensing/LICENSE.txt
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include <nvbench/types.cuh>
#include <nvbench/stopping_criterion.cuh>
#include <nvbench/detail/stdrel_criterion.cuh>
#include <nvbench/detail/entropy_criterion.cuh>

#include <unordered_map>
#include <memory>

namespace nvbench
{

class criterion_registry
{
  std::unordered_map<std::string, std::unique_ptr<nvbench::stopping_criterion>> m_map;

  criterion_registry();

public:
  static criterion_registry &instance();

  static nvbench::stopping_criterion* get(const std::string& name);

  static bool register_criterion(std::string name,
                                 std::unique_ptr<nvbench::stopping_criterion> criterion);

  static nvbench::stopping_criterion::params_description get_params_description();
};

} // namespace nvbench
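
The header declares the registry's interface, but its `.cxx` implementation is outside this chunk. The following standalone sketch shows one common way to implement such a singleton registry with a function-local static, which is a plausible reason the `static bool registered = ...` pattern in `custom_criterion.cu` can run safely during static initialization. `registry_sketch`, `criterion_stub`, and `always_done` are hypothetical names. Returning `nullptr` for unknown names and `false` for duplicate registrations are assumptions, since the real registry may throw or behave differently in those cases.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal criterion interface, standing in for
// nvbench::stopping_criterion.
struct criterion_stub
{
  virtual ~criterion_stub() = default;
  virtual bool is_finished() = 0;
};

struct always_done final : criterion_stub
{
  bool is_finished() override { return true; }
};

// Hypothetical registry mirroring the declared static interface.
class registry_sketch
{
  std::unordered_map<std::string, std::unique_ptr<criterion_stub>> m_map;

  registry_sketch() = default;

public:
  static registry_sketch &instance()
  {
    // Constructed on first use, so registrations made from other
    // translation units' static initializers find a live object.
    static registry_sketch registry;
    return registry;
  }

  // Assumption: unknown names yield nullptr rather than an exception.
  static criterion_stub *get(const std::string &name)
  {
    auto &map     = instance().m_map;
    const auto it = map.find(name);
    return it == map.end() ? nullptr : it->second.get();
  }

  // Assumption: returns false if the name is already registered.
  static bool register_criterion(std::string name, std::unique_ptr<criterion_stub> criterion)
  {
    return instance().m_map.emplace(std::move(name), std::move(criterion)).second;
  }
};
```

The registry keeps ownership of each criterion through `std::unique_ptr` and hands out non-owning raw pointers, which matches the `get()` return type declared in the header.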