From 91ca50b55553d0907776d0ed260540f150da3957 Mon Sep 17 00:00:00 2001 From: Aleksei Fedotov Date: Thu, 23 Jan 2025 15:06:33 +0100 Subject: [PATCH] Add sub-RFC for increased availability of NUMA API (#1545) --- .../tbbbind-link-static-hwloc.org | 119 ++++++++++++++++++ 1 file changed, 119 insertions(+) create mode 100755 rfcs/proposed/numa_support/tbbbind-link-static-hwloc.org diff --git a/rfcs/proposed/numa_support/tbbbind-link-static-hwloc.org b/rfcs/proposed/numa_support/tbbbind-link-static-hwloc.org new file mode 100755 index 0000000000..d108ac1283 --- /dev/null +++ b/rfcs/proposed/numa_support/tbbbind-link-static-hwloc.org @@ -0,0 +1,119 @@ +# -*- fill-column: 80; -*- + +#+title: Link ~tbbbind~ with Static HWLOC for NUMA API predictability + +*Note:* This document is a sub-RFC of the [[file:README.md][umbrella RFC about improving NUMA +support]]. Specifically, the "Increased availability of NUMA support" section. + +* Introduction +oneTBB has a soft dependency on several variants of ~tbbbind~, which the library +loads during the initialization stage. Each ~tbbbind~, in turn, has a hard +dependency on a specific version of the HWLOC library [1, 2]. The soft +dependency means that the library continues the execution even if the system +loader fails to resolve the hard dependency on HWLOC for ~tbbbind~. In this +case, oneTBB does not discover the hardware topology. Instead, it defaults to +viewing all CPU cores as uniform, consistent with TBB behavior when NUMA +constraints are not used. As a result, the following code returns the irrelevant +values that do not reflect the actual topology: + +#+begin_src C++ +std::vector numa_nodes = oneapi::tbb::info::numa_nodes(); +std::vector core_types = oneapi::tbb::info::core_types(); +#+end_src + +This lack of valid HW topology, caused by the absence of a third-party library, +is the major problem with the current oneTBB behavior. The problem lies in the +lack of diagnostics making it difficult for developers to detect. As a result, +the code continues to run but fails to use NUMA as intended. + +Dependency on a shared HWLOC library has the following benefits: +1. Code reuse with all of the positive consequences out of this, including + relying on the same code that has been tested and debugged, allowing the OS + to share it among different processes, which consequently improves on cache + locality and memory footprint. That's the primary purpose of shared + libraries. +2. A drop-in replacement. Users are able to use their own version of HWLOC + without recompilation of oneTBB. This specific version of HWLOC could include + a hotfix to support a particular and/or new hardware that a customer has, but + whose support is not yet upstreamed to HWLOC project. It is also possible + that such support won't be upstreamed at all if that hardware is not going to + be available for massive users. It could also be a development version of + HWLOC that someone wants to test on their systems first. Of course, they can + do it with the static version as well, but that's more cumbersome as it + requires recompilation of every dependent component. + +The only disadvantage from depending on HWLOC library dynamically is that the +developers that use oneTBB's NUMA support API need to make sure the library is +available and can be found by oneTBB. Depending on the distribution model of a +developer's code, this is achieved either by: +1. Asking the end user to have necessary version of a dependency pre-installed. +2. Bundling necessary HWLOC version together with other pieces of a product + release. + +However, the requirement to fulfill one of the above steps for the NUMA API to +start paying off may be considered as an incovenience and, what is more +important, it is not always obvious that one of these steps is needed. +Especially, due to silent behavior in case HWLOC library cannot be found in the +environment. + +The proposal is to reduce the effect of the disadvantage of relying on a dynamic +HWLOC library. The improvements involve statically linking HWLOC with one of the +~tbbbind~ libraries distributed together with oneTBB. At the same time, you +retain the flexibility to specify different version of HWLOC library if needed. + +Since HWLOC 1.x is an older version and modern operating systems install HWLOC +2.x by default, the probability of users being restricted to HWLOC 1.x is +relatively small. Thus, we can reuse the filename of the ~tbbbind~ library +linked to HWLOC 1.x for the library linked against a static HWLOC 2.x. + +* Proposal +1. Replace the dynamic link of ~tbbbind~ library currently linked + against HWLOC 1.x with a link to a static HWLOC library version 2.x. +2. Add loading of that ~tbbbind~ variant as the last attempt to resolve the + dependency on functionality provided by the ~tbbbind~ layer. +3. Update the oneTBB documentation, including [[https://oneapi-src.github.io/oneTBB/search.html?q=tbb%3A%3Ainfo][these pages]], to + detail the steps for identifying which ~tbbbind~ is being used. + +** Advantages +1. The proposed behavior introduces a fallback mechanism for resolving the HWLOC + library dependency when it is not in the environment, while still preferring + user-provided versions. As a result, the problematic oneTBB API usage works + as expected, returning an enumerated list of actual NUMA nodes and core types + on the system the code is running on, provided that the loaded HWLOC library + works on that system and that an application properly distributes all + binaries of oneTBB, sets the environment so that the necessary variant of + ~tbbbind~ library can be found and loaded. +2. Dropping support for HWLOC 1.x, does not introduce an additional ~tbbbind~ + variant while maintaining support for widely used versions of HWLOC. + +** Disadvantages +By default, there is still no diagnostics if you fail to correctly setup an +environment with your version of HWLOC. Although, specifying the ~TBB_VERSION=1~ +environment variable helps identify configuration issues quickly. + +* Alternative Handling for Missing System Topology +The other behavior in case HWLOC library cannot be found is to be more explicit +about the problem of a missing component and to either issue a warning or to +refuse working requiring one of the ~tbbbind~ variant to be loaded (e.g., throw +an exception). + +Comparing these alternative approaches to the one proposed. +** Common Advantages +- Explicitly indicates that the functionality being used does not work, instead + of failing silently. +- Avoids the need to distribute an additional variant of ~tbbbind~ library. + +** Common Disadvantages +- Requires additional step from the user side to resolve the problem. In other + words, it does not provide complete solution to the problem. + +*** Disadvantages of Issuing a Warning +- The warning may be unnoticed, especially if standard streams are closed. + +*** Disadvantages of Throwing an Exception +- May break existing code that does not expect an exception to be thrown. +- Requires introduction of an additional exception hierarchy. + +* References +1. [[https://www.open-mpi.org/projects/hwloc/][HWLOC project main page]] +2. [[https://github.com/open-mpi/hwloc][HWLOC project repository on GitHub]]