Add CPU masking/pinning option
Add an option to specify which CPU cores the replayer is able to use.

Change-Id: I1ff25355dbcc1da34830b8926dc5eb7078889ec5
marius-pelegrin-arm committed Dec 19, 2024
1 parent 8e66328 commit 12e7421
Showing 8 changed files with 147 additions and 6 deletions.
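For orientation, the mask semantics described in this commit can be sketched in Python. This is an illustrative sketch, not code from the change: `mask_to_cores` is a hypothetical helper name, and it assumes, as the usage text states, that the leftmost character of the mask describes the first core.

```python
def mask_to_cores(mask: str) -> set:
    """Return the zero-based indices of the cores marked '1' in the mask."""
    if not mask or any(ch not in "01" for ch in mask):
        raise ValueError(f"invalid CPU mask: {mask!r}")
    # The leftmost character describes core 0, the next one core 1, and so on.
    # Cores beyond the end of the string are simply not selected.
    return {index for index, ch in enumerate(mask) if ch == "1"}


print(sorted(mask_to_cores("1010")))  # [0, 2]
```

With the mask `1010`, the first and third cores (indices 0 and 2) are selected, which matches the example given in the option's help text.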
11 changes: 9 additions & 2 deletions USAGE_android.md
@@ -700,7 +700,7 @@ queryable permission to apply.
The `gfxrecon.py replay` command has the following usage:

```text
usage: gfxrecon.py replay [-h] [--push-file LOCAL_FILE] [--version] [--pause-frame N]
usage: gfxrecon.py replay [-h] [--push-file LOCAL_FILE] [--version] [--cpu-mask <binary-mask>] [--pause-frame N]
[--paused] [--screenshot-all] [--screenshots RANGES]
[--screenshot-format FORMAT] [--screenshot-dir DIR]
[--screenshot-prefix PREFIX] [--screenshot-scale SCALE]
@@ -749,6 +749,13 @@ optional arguments:
-p LOCAL_FILE, --push-file LOCAL_FILE
Local file to push to the location on device specified
by <file>
--cpu-mask <binary-mask>
Set of CPU cores used by the replayer.
`binary-mask` is a succession of '0' and '1' that specifies
used/unused cores. For example '1010' activates the first and
third cores and deactivates all other cores.
If the option is not set, all cores can be used. If the option
is set only for some cores, the other cores are not used.
--screenshot-all Generate screenshots for all frames. When this option
is specified, --screenshots is ignored (forwarded to
replay tool)
@@ -792,7 +799,7 @@ optional arguments:
See gfxrecon-extract.
--opcd, --omit-pipeline-cache-data
Omit pipeline cache data from calls to
vkCreatePipelineCache and skip calls to
vkGetPipelineCacheData (forwarded to replay tool)
--surface-index N Restrict rendering to the Nth surface object created.
Used with captures that include multiple surfaces.
9 changes: 8 additions & 1 deletion USAGE_desktop_D3D12.md
@@ -197,7 +197,7 @@ The `gfxrecon-replay` tool accepts the following command line arguments:
gfxrecon-replay.exe - A tool to replay GFXReconstruct capture files.
Usage:
gfxrecon-replay.exe [-h | --help] [--version] [--gpu <index>]
gfxrecon-replay.exe [-h | --help] [--version] [--cpu-mask <binary-mask>] [--gpu <index>]
[--pause-frame <N>] [--paused] [--sync] [--screenshot-all]
[--screenshots <N1(-N2),...>] [--screenshot-format <format>]
[--screenshot-dir <dir>] [--screenshot-prefix <file-prefix>]
@@ -268,6 +268,13 @@ Optional arguments:
--validate Enables the Khronos Vulkan validation layer when replaying a
Vulkan capture or the Direct3D debug layer when replaying a
Direct3D 12 capture.
--cpu-mask <binary-mask>
Set of CPU cores used by the replayer.
`binary-mask` is a succession of '0' and '1' that specifies
used/unused cores. For example '1010' activates the first and
third cores and deactivates all other cores.
If the option is not set, all cores can be used. If the option
is set only for some cores, the other cores are not used.
--gpu <index> Use the specified device for replay, where index
is the zero-based index to the array of physical devices
returned by vkEnumeratePhysicalDevices or IDXGIFactory1::EnumAdapters1.
9 changes: 8 additions & 1 deletion USAGE_desktop_Vulkan.md
@@ -546,7 +546,7 @@ The `gfxrecon-replay` tool for desktop accepts the following command line
arguments:

```text
gfxrecon-replay [-h | --help] [--version] [--gpu <index>]
gfxrecon-replay [-h | --help] [--version] [--cpu-mask <binary-mask>] [--gpu <index>]
[--pause-frame <N>] [--paused] [--sync] [--screenshot-all]
[--screenshots <N1(-N2),...>] [--screenshot-format <format>]
[--screenshot-dir <dir>] [--screenshot-prefix <file-prefix>]
@@ -587,6 +587,13 @@ Optional arguments:
--log-file <file> Write log messages to a file at the specified path.
Default is: Empty string (file logging disabled).
--log-debugview Log messages with OutputDebugStringA. Windows only.
--cpu-mask <binary-mask>
Set of CPU cores used by the replayer.
`binary-mask` is a succession of '0' and '1' that specifies
used/unused cores. For example '1010' activates the first and
third cores and deactivates all other cores.
If the option is not set, all cores can be used. If the option
is set only for some cores, the other cores are not used.
--gpu <index> Use the specified device for replay, where index
is the zero-based index to the array of physical devices
returned by vkEnumeratePhysicalDevices. Replay may fail
5 changes: 5 additions & 0 deletions android/scripts/gfxrecon.py
@@ -71,6 +71,7 @@ def CreateReplayParser():
parser.add_argument('--log-file', metavar='DEVICE_FILE', help='Write log messages to a file at the specified path instead of logcat (forwarded to replay tool)')
parser.add_argument('--pause-frame', metavar='N', help='Pause after replaying frame number N (forwarded to replay tool)')
parser.add_argument('--paused', action='store_true', default=False, help='Pause after replaying the first frame (same as "--pause-frame 1"; forwarded to replay tool)')
parser.add_argument('--cpu-mask', metavar='binary_mask', help='Set of CPU cores used by the replayer. `binary-mask` is a succession of "0" and "1" that specifies used/unused cores. For example "1010" activates the first and third cores and deactivates all other cores. If the option is not set, all cores can be used. If the option is set only for some cores, the other cores are not used. (forwarded to replay tool)')
parser.add_argument('--screenshot-all', action='store_true', default=False, help='Generate screenshots for all frames. When this option is specified, --screenshots is ignored (forwarded to replay tool)')
parser.add_argument('--screenshots', metavar='RANGES', help='Generate screenshots for the specified frames. Target frames are specified as a comma separated list of frame ranges. A frame range can be specified as a single value, to specify a single frame, or as two hyphenated values, to specify the first and last frames to process. Frame ranges should be specified in ascending order and cannot overlap. Note that frame numbering is 1-based (i.e. the first frame is frame 1). Example: 200,301-305 will generate six screenshots (forwarded to replay tool)')
parser.add_argument('--screenshot-format', metavar='FORMAT', choices=['bmp', 'png'], help='Image file format to use for screenshot generation. Available formats are: bmp, png (forwarded to replay tool)')
@@ -142,6 +143,10 @@ def MakeExtrasString(args):
    if args.paused:
        arg_list.append('--paused')

    if args.cpu_mask:
        arg_list.append('--cpu-mask')
        arg_list.append('{}'.format(args.cpu_mask))

    if args.screenshot_all:
        arg_list.append('--screenshot-all')
    elif args.screenshots:
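The script change above declares the option and forwards it unchanged to the replay tool. A minimal, self-contained sketch of that pattern is shown below. The early validation through an argparse `type` callback is an assumption added for illustration (the commit itself leaves validation to the replayer), and `binary_mask`, `build_replay_args`, and the `replay-sketch` program name are hypothetical.

```python
import argparse


def binary_mask(value: str) -> str:
    """argparse type callback: accept only non-empty strings of '0' and '1'."""
    if not value or set(value) - {"0", "1"}:
        raise argparse.ArgumentTypeError(f"invalid binary mask: {value!r}")
    return value


def build_replay_args(argv):
    """Parse argv and return the list of arguments to forward to the replayer."""
    parser = argparse.ArgumentParser(prog="replay-sketch")
    parser.add_argument("--cpu-mask", metavar="binary-mask", type=binary_mask)
    args = parser.parse_args(argv)

    arg_list = []
    if args.cpu_mask:
        # Forward the option name and its value as two separate tokens.
        arg_list.extend(["--cpu-mask", args.cpu_mask])
    return arg_list


print(build_replay_args(["--cpu-mask", "1010"]))  # ['--cpu-mask', '1010']
```

Rejecting a malformed mask on the host side would surface the error before anything is sent to the device; whether that is desirable for `gfxrecon.py` is a design choice this sketch does not settle.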
1 change: 1 addition & 0 deletions framework/decode/replay_options.h
@@ -55,6 +55,7 @@ struct ReplayOptions
bool force_windowed_origin{ false };
int32_t window_topleft_x{ 0 };
int32_t window_topleft_y{ 0 };
std::string cpu_mask;
int32_t override_gpu_index{ -1 };
std::string capture_filename;
bool enable_print_block_info{ false };
91 changes: 91 additions & 0 deletions framework/util/platform.h
@@ -57,6 +57,10 @@
#include <sys/system_properties.h>
#endif

#if !defined(WIN32)
#include <sched.h>
#endif

GFXRECON_BEGIN_NAMESPACE(gfxrecon)
GFXRECON_BEGIN_NAMESPACE(util)
GFXRECON_BEGIN_NAMESPACE(platform)
@@ -260,6 +264,52 @@ inline int GetSystemLastErrorCode()
return GetLastError();
}

inline std::string GetCpuAffinity()
{
    DWORD_PTR process_mask;
    DWORD_PTR system_mask;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &process_mask, &system_mask))
    {
        return "";
    }

    DWORD_PTR mask = (process_mask & system_mask);

    // Bit i of the mask becomes character i of the string, so core 0 comes first.
    std::string affinity;
    while (mask)
    {
        affinity += (mask & 1) ? "1" : "0";
        mask >>= 1;
    }

    // Guard against an empty string: calling back() on it is undefined behavior.
    while (!affinity.empty() && affinity.back() == '0')
    {
        affinity.pop_back();
    }

    return affinity;
}

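inline bool SetCpuAffinity(const std::string& affinity)
{
    DWORD_PTR mask = 0;
    for (size_t i = 0; i < affinity.size(); i++)
    {
        if (affinity[i] == '1')
        {
            // Character i selects core i, which is bit i of the mask.
            // Reject cores that do not fit in the mask width.
            if (i >= sizeof(mask) * 8)
            {
                return false;
            }
            mask |= (static_cast<DWORD_PTR>(1) << i);
        }
        else if (affinity[i] != '0')
        {
            return false;
        }
    }

    return (SetProcessAffinityMask(GetCurrentProcess(), mask) != 0);
}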

#else // !defined(WIN32)

// Error value indicating string was truncated
@@ -568,6 +618,47 @@ inline int GetSystemLastErrorCode()
return errno;
}

inline std::string GetCpuAffinity()
{
    cpu_set_t mask;
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
    {
        return "";
    }

    // A cpu_set_t holds CPU_SETSIZE cores; character i of the string describes core i.
    std::string affinity;
    for (int i = 0; i < CPU_SETSIZE; i++)
    {
        affinity += CPU_ISSET(i, &mask) ? "1" : "0";
    }

    // Drop trailing zeros so the string ends at the highest usable core.
    while (!affinity.empty() && affinity.back() == '0')
    {
        affinity.pop_back();
    }

    return affinity;
}

inline bool SetCpuAffinity(const std::string& affinity)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (size_t i = 0; i < affinity.size(); i++)
    {
        if (affinity[i] == '1')
        {
            // Character i selects core i. Reject cores that do not fit in a cpu_set_t.
            if (i >= static_cast<size_t>(CPU_SETSIZE))
            {
                return false;
            }
            CPU_SET(i, &mask);
        }
        else if (affinity[i] != '0')
        {
            return false;
        }
    }

    return (sched_setaffinity(0, sizeof(mask), &mask) == 0);
}

#endif // WIN32

inline size_t GetAlignedSize(size_t size, size_t align_to)
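The Windows path needs an integer bitmask, while the option is a string. A minimal sketch of that conversion and its inverse follows, assuming character i of the string corresponds to bit i of the mask (core i). `mask_to_bits` and `bits_to_mask` are hypothetical names used only for illustration, and Python integers stand in for `DWORD_PTR`.

```python
def mask_to_bits(mask: str) -> int:
    """Convert a '0'/'1' string into an integer with bit i set for core i."""
    bits = 0
    for index, ch in enumerate(mask):
        if ch == "1":
            bits |= 1 << index
        elif ch != "0":
            raise ValueError(f"invalid CPU mask: {mask!r}")
    return bits


def bits_to_mask(bits: int) -> str:
    """Convert an integer bitmask back into a string, core 0 first."""
    if bits < 0:
        raise ValueError("bitmask must be non-negative")
    chars = []
    while bits:
        chars.append("1" if bits & 1 else "0")
        bits >>= 1
    # The loop stops at the highest set bit, so there are no trailing zeros.
    return "".join(chars)


print(bin(mask_to_bits("1010")))  # 0b101
print(bits_to_mask(0b101))        # 101
```

Note that the round trip drops trailing zeros: `1010` comes back as `101`, which mirrors how `GetCpuAffinity` trims its result.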
12 changes: 10 additions & 2 deletions tools/replay/replay_settings.h
@@ -38,7 +38,7 @@ const char kOptions[] =
"resources,--dump-resources-dump-all-image-subresources,--dump-resources-dump-raw-images,--dump-resources-dump-"
"separate-alpha,--pbi-all,--preload-measurement-range, --add-new-pipeline-caches";
const char kArguments[] =
"--log-level,--log-file,--gpu,--gpu-group,--pause-frame,--wsi,--surface-index,-m|--memory-translation,"
"--log-level,--log-file,--cpu-mask,--gpu,--gpu-group,--pause-frame,--wsi,--surface-index,-m|--memory-translation,"
"--replace-shaders,--screenshots,--denied-messages,--allowed-messages,--screenshot-format,--"
"screenshot-dir,--screenshot-prefix,--screenshot-size,--screenshot-scale,--mfr|--measurement-frame-range,--fw|--"
"force-windowed,--fwo|--force-windowed-origin,--batching-memory-usage,--measurement-file,--swapchain,--sgfs|--skip-"
@@ -59,7 +59,8 @@ static void PrintUsage(const char* exe_name)

GFXRECON_WRITE_CONSOLE("\n%s - A tool to replay GFXReconstruct capture files.\n", app_name.c_str());
GFXRECON_WRITE_CONSOLE("Usage:");
GFXRECON_WRITE_CONSOLE(" %s\t[-h | --help] [--version] [--gpu <index>] [--gpu-group <index>]", app_name.c_str());
GFXRECON_WRITE_CONSOLE(" %s\t[-h | --help] [--version]", app_name.c_str());
GFXRECON_WRITE_CONSOLE("\t\t\t[--cpu-mask <binary-mask>] [--gpu <index>] [--gpu-group <index>]");
GFXRECON_WRITE_CONSOLE("\t\t\t[--pause-frame <N>] [--paused] [--sync] [--screenshot-all]");
GFXRECON_WRITE_CONSOLE("\t\t\t[--screenshots <N1(-N2),...>] [--screenshot-format <format>]");
GFXRECON_WRITE_CONSOLE("\t\t\t[--screenshot-dir <dir>] [--screenshot-prefix <file-prefix>]");
@@ -159,6 +160,13 @@ static void PrintUsage(const char* exe_name)
GFXRECON_WRITE_CONSOLE(" --validate\t\tEnable the Khronos Vulkan validation layer when replaying a");
GFXRECON_WRITE_CONSOLE(" \t\tVulkan capture or the Direct3D debug layer when replaying a");
GFXRECON_WRITE_CONSOLE(" \t\tDirect3D 12 capture.");
GFXRECON_WRITE_CONSOLE(" --cpu-mask <binary-mask>");
GFXRECON_WRITE_CONSOLE(" \t\tSet of CPU cores used by the replayer.");
GFXRECON_WRITE_CONSOLE(" \t\t`binary-mask` is a succession of '0' and '1' that specifies");
GFXRECON_WRITE_CONSOLE(" \t\tused/unused cores. For example '1010' activates the first and");
GFXRECON_WRITE_CONSOLE(" \t\tthird cores and deactivates all other cores.");
GFXRECON_WRITE_CONSOLE(" \t\tIf the option is not set, all cores can be used. If the option");
GFXRECON_WRITE_CONSOLE(" \t\tis set only for some cores, the other cores are not used.");
GFXRECON_WRITE_CONSOLE(" --gpu <index>\t\tUse the specified device for replay, where index");
GFXRECON_WRITE_CONSOLE(" \t\tis the zero-based index to the array of physical devices");
GFXRECON_WRITE_CONSOLE(" \t\treturned by vkEnumeratePhysicalDevices or IDXGIFactory1::EnumAdapters1.");
15 changes: 15 additions & 0 deletions tools/tool_settings.h
@@ -64,6 +64,7 @@ const char kLogLevelArgument[] = "--log-level";
const char kLogFileArgument[] = "--log-file";
const char kLogDebugView[] = "--log-debugview";
const char kNoDebugPopup[] = "--no-debug-popup";
const char kCpuMaskArgument[] = "--cpu-mask";
const char kOverrideGpuArgument[] = "--gpu";
const char kOverrideGpuGroupArgument[] = "--gpu-group";
const char kPausedOption[] = "--paused";
@@ -934,6 +935,20 @@ static void GetReplayOptions(gfxrecon::decode::ReplayOptions& options,
options.num_pipeline_creation_jobs = std::stoi(arg_parser.GetArgumentValue(kNumPipelineCreationJobs));
}

    options.cpu_mask = arg_parser.GetArgumentValue(kCpuMaskArgument);
    if (!options.cpu_mask.empty())
    {
        if (gfxrecon::util::platform::SetCpuAffinity(options.cpu_mask))
        {
            GFXRECON_LOG_INFO("CPU mask successfully set: %s", gfxrecon::util::platform::GetCpuAffinity().c_str());
        }
        else
        {
            GFXRECON_LOG_ERROR("Failed to set CPU mask: %s", options.cpu_mask.c_str());
            GFXRECON_LOG_ERROR("Resuming with CPU mask: %s", gfxrecon::util::platform::GetCpuAffinity().c_str());
        }
    }

const auto& override_gpu = arg_parser.GetArgumentValue(kOverrideGpuArgument);
if (!override_gpu.empty())
{
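The set-then-report flow added to `GetReplayOptions` can be approximated in Python on Linux with `os.sched_setaffinity` and `os.sched_getaffinity`. This is a sketch under stated assumptions, not a port of the C++ code: `apply_cpu_mask`, `current_cpu_mask`, and `configure_cpu_mask` are hypothetical names, and on platforms without those `os` functions the helpers simply report failure.

```python
import logging
import os


def apply_cpu_mask(mask: str) -> bool:
    """Pin the current process to the cores marked '1'; return True on success."""
    if not mask or set(mask) - {"0", "1"}:
        return False
    cores = {index for index, ch in enumerate(mask) if ch == "1"}
    # An empty core set is rejected, as is a platform without the Linux-only call.
    if not cores or not hasattr(os, "sched_setaffinity"):
        return False
    try:
        os.sched_setaffinity(0, cores)  # pid 0 means the calling process
    except (OSError, ValueError, OverflowError):
        return False
    return True


def current_cpu_mask() -> str:
    """Return the current affinity as a '0'/'1' string, core 0 first."""
    if not hasattr(os, "sched_getaffinity"):
        return ""
    cores = os.sched_getaffinity(0)
    return "".join("1" if i in cores else "0" for i in range(max(cores) + 1))


def configure_cpu_mask(mask: str) -> None:
    """Try to apply the mask, then log the affinity that is actually in effect."""
    if not mask:
        return  # Option not set: leave the affinity untouched.
    if apply_cpu_mask(mask):
        logging.info("CPU mask successfully set: %s", current_cpu_mask())
    else:
        logging.error("Failed to set CPU mask: %s", mask)
        logging.error("Resuming with CPU mask: %s", current_cpu_mask())
```

As in the commit, a failure is not fatal: the process keeps whatever affinity it already had, and the log reports that mask so the run can still be interpreted.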
