CUDA rules for Bazel

This repository contains a Starlark implementation of CUDA rules for Bazel.

It provides macros and rules that make it easier to build CUDA code with Bazel.

Getting Started

Add the following to your WORKSPACE file and replace the placeholders with actual values.

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_cuda",
    sha256 = "{sha256_to_replace}",
    strip_prefix = "rules_cuda-{git_commit_hash}",
    urls = ["https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz"],
)
load("@rules_cuda//cuda:repositories.bzl", "register_detected_cuda_toolchains", "rules_cuda_dependencies")
rules_cuda_dependencies()
register_detected_cuda_toolchains()

NOTE: register_detected_cuda_toolchains depends on the environment variable CUDA_PATH. You must also ensure the host compiler is available. On Windows, this means you also need to set the environment variable BAZEL_VC properly.

detect_cuda_toolkit and detect_clang determine how the toolchains are detected.

Rules

  • cuda_library: Compiles CUDA kernel code and creates a static library from it. The resulting targets can be consumed by C/C++ rules.
  • cuda_objects: If you don't understand what device link means, you must never use it. This rule produces incomplete object files that can only be consumed by cuda_library. It is intended for relocatable device code and device link-time optimization source files.
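
As a rough sketch of how cuda_library fits together with the C/C++ rules, a BUILD.bazel file might look like the following. The load path @rules_cuda//cuda:defs.bzl, the target names, and the file names kernel.cu, kernel.h and main.cpp are assumptions made for illustration, not taken from this README.

```starlark
# Assumed load path for the rule; check the repository if it differs.
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

# Compiles the CUDA sources and packages them as a static library.
cuda_library(
    name = "kernel",          # hypothetical target name
    srcs = ["kernel.cu"],     # hypothetical source file
    hdrs = ["kernel.h"],      # hypothetical header declaring the host-side API
)

# An ordinary C++ target that consumes the CUDA library through deps.
cc_binary(
    name = "main",
    srcs = ["main.cpp"],      # hypothetical host-only source
    deps = [":kernel"],
)
```

With such a file in place, bazel build //:main would build the kernel library first and then link it into the binary.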

Flags

Some flags are defined in cuda/BUILD.bazel. For example, to set one on the command line:

bazel build --@rules_cuda//cuda:archs=compute_61:compute_61,sm_61

In your .bazelrc file, you can define a shortcut alias for the flag, for example:

# Convenient flag shortcuts.
build --flag_alias=cuda_archs=@rules_cuda//cuda:archs

and then use it as follows:

bazel build --cuda_archs=compute_61:compute_61,sm_61

Available flags

  • @rules_cuda//cuda:enable

    Enable or disable all rules_cuda related rules. When disabled, the detected cuda toolchains will also be disabled to avoid potential human error. By default, rules_cuda rules are enabled. See examples/if_cuda for how to support both cuda-enabled and cuda-free builds.

  • @rules_cuda//cuda:archs

    Select the cuda archs to support. See cuda_archs specification DSL grammar.

  • @rules_cuda//cuda:compiler

    Select the CUDA compiler. Available options are nvcc and clang.

  • @rules_cuda//cuda:copts

    Add the copts to all cuda compile actions.

  • @rules_cuda//cuda:host_copts

    Add the copts to the host compiler.

  • @rules_cuda//cuda:runtime

    Set the default cudart to link. For example, --@rules_cuda//cuda:runtime=@local_cuda//:cuda_runtime_static links the static CUDA runtime.

  • --features=cuda_device_debug

    Sets nvcc flags to enable debug information in device code. Currently ignored for clang, where --compilation_mode=dbg applies to both host and device code.
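
To illustrate how a build can react to the @rules_cuda//cuda:enable flag described above, here is a minimal sketch that uses a standard Bazel config_setting keyed on that flag. It assumes the flag is a boolean build setting; the config_setting name, target names, file names, and the CUDA_ENABLED define are hypothetical. examples/if_cuda remains the authoritative reference and may use different helpers.

```starlark
# Assumed load path for the rule; check the repository if it differs.
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

# Matches when CUDA rules are enabled (assumes a boolean build setting).
config_setting(
    name = "cuda_enabled",    # hypothetical name
    flag_values = {"@rules_cuda//cuda:enable": "True"},
)

cuda_library(
    name = "kernel",          # hypothetical target
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
)

cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    # Tell the host code whether the CUDA path is compiled in.
    defines = select({
        ":cuda_enabled": ["CUDA_ENABLED"],
        "//conditions:default": [],
    }),
    # Depend on the CUDA library only when CUDA is enabled.
    deps = select({
        ":cuda_enabled": [":kernel"],
        "//conditions:default": [],
    }),
)
```

In this sketch, main.cpp would guard its CUDA calls with the CUDA_ENABLED macro so that the same source builds in both configurations.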

Examples

Check out the examples to see if they fit your needs.

See examples for basic usage.

See rules_cuda_examples for extended real-world projects.

Known issue

Sometimes the following error occurs:

cc1plus: fatal error: /tmp/tmpxft_00000002_00000019-2.cpp: No such file or directory

The problem is caused by nvcc using its PID to determine temporary file names. With --spawn_strategy linux-sandbox, which is the default strategy on Linux, the PIDs nvcc sees are all very small numbers, say 2~4, due to sandboxing. linux-sandbox is not hermetic because it mounts root into the sandbox, so /tmp is shared between sandboxes, which causes name conflicts under high parallelism. A similar problem has been reported on the NVIDIA forums.

To avoid it:

  • Using --spawn_strategy local should eliminate the problem because it lets nvcc see the true PIDs.
  • Using --experimental_use_hermetic_linux_sandbox should eliminate the problem because it avoids sharing /tmp.
  • Adding the -objtemp option to the command should make the problem less likely.
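
For the last option, one possible way to pass -objtemp is per target. The sketch below assumes that cuda_library accepts a copts attribute, that nvcc is the selected compiler, and that the load path is @rules_cuda//cuda:defs.bzl; the target and file names are hypothetical.

```starlark
# Assumed load path for the rule; check the repository if it differs.
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

cuda_library(
    name = "kernel",          # hypothetical target
    srcs = ["kernel.cu"],     # hypothetical source file
    # Ask nvcc to keep its intermediate files next to the object file
    # instead of in the shared /tmp directory.
    copts = ["-objtemp"],
)
```

To apply the option to every CUDA compile action instead, the @rules_cuda//cuda:copts flag listed under Available flags could be set to the same value.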