Add sub-RFC for increased availability of NUMA API #1545
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,125 @@ | ||||||
| # -*- fill-column: 80; -*- | ||||||
|
|
||||||
| #+title: Link ~tbbbind~ with static HWLOC to improve predictability of NUMA support API | ||||||
|
|
||||||
| *Note:* This is a sub-RFC of https://github.com/oneapi-src/oneTBB/pull/1535. | ||||||
| Specifically, its section about "Increased availability of NUMA support". | ||||||
|
|
||||||
| * Introduction | ||||||
| oneTBB has a soft dependency on several variants of ~tbbbind~, which are loaded | ||||||
| by the library as part of its initialization stage. In turn, each ~tbbbind~ has | ||||||
| a hard dependency on a concrete version of the HWLOC library [1, 2]. The soft | ||||||
| dependency of oneTBB on ~tbbbind~ allows the library to continue its execution | ||||||
| even if the system loader is unable to resolve the hard dependency on HWLOC for | ||||||
| ~tbbbind~. In this case, the HW topology is not discovered and the machine is | ||||||
| seen as if all CPU cores were uniform, which is the default TBB behavior when | ||||||
| NUMA constraints are not used. Thus, the following code returns values that | ||||||
| do not reflect the real topology and are therefore not meaningful: | ||||||
|
|
||||||
| #+begin_src C++ | ||||||
| std::vector<oneapi::tbb::numa_node_id> numa_nodes = oneapi::tbb::info::numa_nodes(); | ||||||
| std::vector<oneapi::tbb::core_type_id> core_types = oneapi::tbb::info::core_types(); | ||||||
| #+end_src | ||||||
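An application could check for this situation itself by inspecting the returned vectors. The sketch below is a minimal illustration, not oneTBB code. It assumes that, when the topology is not discovered, each call returns a single element equal to `oneapi::tbb::task_arena::automatic` (assumed here to be -1). The helper name `topology_undiscovered` is hypothetical, and plain `int` stands in for `numa_node_id` and `core_type_id` so that the snippet compiles without oneTBB.

```cpp
#include <vector>

// Stand-in for oneapi::tbb::task_arena::automatic (assumed to be -1).
constexpr int automatic_id = -1;

// Hypothetical helper: returns true when the ID list looks like the
// placeholder produced when no hardware topology was discovered.
// Intended input: the result of oneapi::tbb::info::numa_nodes() or
// oneapi::tbb::info::core_types().
bool topology_undiscovered(const std::vector<int>& ids) {
    return ids.size() == 1 && ids.front() == automatic_id;
}
```

With oneTBB available, a call such as `topology_undiscovered(oneapi::tbb::info::numa_nodes())` could guard NUMA-specific code paths or trigger a warning in the application log.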
|
|
||||||
| This lack of valid HW topology data, caused by the absence of a third-party | ||||||
| library, is the major problem with the current oneTBB behavior. There are no | ||||||
| diagnostics for the issue, so it likely goes unnoticed by developers: the code | ||||||
| that uses oneTBB NUMA support facilities keeps running but does not use NUMA | ||||||
| as intended. | ||||||
|
|
||||||
| Having a dependency on a shared HWLOC library has advantages: | ||||||
| 1. Code reuse with all of its positive consequences, including | ||||||
| relying on the same code that has been tested and debugged, allowing the OS | ||||||
Contributor
But in fact, most Linux OSes have obsolete HWLOC versions, and relying on them does not provide benefits.
Contributor, Author
Thanks, I did not know that. I will consider this in future changes.
Contributor
I'd rewrite it to something like this:
| to share it among different processes, which consequently improves cache | ||||||
| locality and memory footprint. That's the primary purpose of shared | ||||||
| libraries. | ||||||
| 2. A drop-in replacement. Users are able to use their own version of HWLOC | ||||||
| without recompilation of oneTBB. This specific version of HWLOC could include | ||||||
| a hotfix to support particular and/or new hardware that a customer has, but | ||||||
| whose support is not yet upstreamed to the HWLOC project. It is also possible | ||||||
| that such support won't be upstreamed at all if that hardware is not going to | ||||||
| be widely available. It could also be a development version of | ||||||
| HWLOC that someone wants to test on their systems first. Of course, they can | ||||||
| do it with the static version as well, but that's more cumbersome as it | ||||||
| requires recompilation of every dependent component. | ||||||
|
|
||||||
| The only disadvantage of depending on the HWLOC library dynamically is that the | ||||||
| developers that use oneTBB's NUMA support API need to make sure the library is | ||||||
| available and can be found by oneTBB. Depending on the distribution model of a | ||||||
| developer's code, this is achieved either by: | ||||||
| 1. Asking the end user to have the necessary version of the dependency pre-installed. | ||||||
| 2. Bundling the necessary HWLOC version together with other pieces of a product | ||||||
| release. | ||||||
|
|
||||||
| However, the requirement to fulfill one of the above steps for the NUMA API to | ||||||
Contributor
However, the need to complete one of the above steps for the NUMA API to function effectively may be seen as inconvenient. More importantly, it is not always immediately clear that these steps are required, especially due to the silent fallback behavior when the HWLOC library is not found in the environment.
| start paying off may be considered an inconvenience and, more importantly, | ||||||
| it is not always obvious that one of these steps is needed, especially given | ||||||
| the silent fallback when the HWLOC library cannot be found in the | ||||||
| environment. | ||||||
|
|
||||||
| This proposal reduces the effect of the disadvantage of depending on a dynamic | ||||||
| version of the HWLOC library by linking HWLOC statically into one of the | ||||||
| ~tbbbind~ libraries that are distributed together with oneTBB, while leaving | ||||||
| the possibility to specify another version of the HWLOC library if users see | ||||||
| the need. | ||||||
|
|
||||||
| Since HWLOC 1.x is an old version of HWLOC and modern versions of operating | ||||||
| systems install HWLOC 2.x by default, the probability that someone is | ||||||
| constrained to using only HWLOC 1.x on their system is relatively small. Thus, | ||||||
| the filename of the ~tbbbind~ library that is linked against HWLOC 1.x can be | ||||||
| re-used for the library that is linked statically against HWLOC 2.x. | ||||||
|
|
||||||
| * Proposal | ||||||
| 1. Replace the dynamic link of the ~tbbbind~ library that is currently linked | ||||||
| against HWLOC 1.x with a link to a static HWLOC library version 2.x. | ||||||
| 2. Add loading of that ~tbbbind~ variant as the last attempt to resolve the | ||||||
| dependency on functionality provided by the ~tbbbind~ layer. | ||||||
| 3. Update the oneTBB documentation, in particular [[https://oneapi-src.github.io/oneTBB/search.html?q=tbb%3A%3Ainfo][these documentation pages]], to | ||||||
| include steps for determining the variant of ~tbbbind~ being used. | ||||||
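The loading order implied by steps 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the oneTBB loader. The callback type `try_load_fn`, the function `load_first_available`, and the library names in `tbbbind_candidates` are all hypothetical. The only property taken from the proposal is that the variant with statically linked HWLOC is tried last.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical loader callback: returns true if the named library loaded.
using try_load_fn = std::function<bool(const std::string&)>;

// Tries each candidate in order and returns the name of the first one that
// loads, or an empty string if none does (topology stays undiscovered).
std::string load_first_available(const std::vector<std::string>& candidates,
                                 const try_load_fn& try_load) {
    for (const auto& name : candidates) {
        if (try_load(name)) return name;
    }
    return {};
}

// Assumed ordering: variants that need a shared HWLOC 2.x come first, and the
// variant with HWLOC 2.x linked in statically is the last resort.
// The names are illustrative only.
const std::vector<std::string> tbbbind_candidates = {
    "tbbbind_2_5",  // shared HWLOC, newer 2.x (assumed name)
    "tbbbind_2_0",  // shared HWLOC 2.x (assumed name)
    "tbbbind",      // static HWLOC 2.x, re-using the former HWLOC 1.x filename
};
```

Because the static variant comes last, a user-provided shared HWLOC still takes precedence whenever the system loader can resolve it.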
|
|
||||||
| ** Advantages | ||||||
| 1. The proposed behavior provides a mechanism for resolving the dependency on | ||||||
| the HWLOC library in case it cannot be found in the environment, while still | ||||||
| preferring a user-provided version of HWLOC. As a result, the problematic use of | ||||||
| preferring user-provided version of HWLOC. As a result, the problematic use of | |
| preferring user-provided versions. As a result, the problematic oneTBB API usage |
| oneTBB API mentioned above should work as expected, returning enumerated list | |
| works as expected, returning an enumerated list |
of actual NUMA nodes and core types, provided that:
- The loaded HWLOC library is compatible with the system.
- The application properly distributes all oneTBB binaries and configures the environment to locate and load the required tbbbind library variant.
| own version of HWLOC library correctly. Although, specifying ~TBB_VERSION=1~ | |
| version of HWLOC. Although, specifying the ~TBB_VERSION=1~ |
| envar will help identifying an issue with setup of environment pretty quickly. | |
| environment variable helps identify configuration issues quickly. |
| * Alternative handling of inability to parse system topology | |
| * Alternative Handling for Missing System Topology |
An alternative approach to handle the absence of the HWLOC library is to adopt a more explicit response:
- Issue a warning about the missing component.
- Require one of the tbbbind variants to be loaded by refusing to work or throwing an exception.
should this be a heading?
Not necessarily.
| - Explicitly tells that the functionality being used is not going to work | |
| - Explicitly indicates that the functionality being used does not work, |
| instead of just being silent. | |
| instead of failing silently. |
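The "refuse to work" variant of this alternative can be approximated on the application side. The sketch below is a hypothetical guard, not oneTBB behavior. The function name `require_numa_topology` is invented for illustration, and it assumes that a missing topology appears as a single element equal to -1 (the assumed value of `oneapi::tbb::task_arena::automatic`).

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical application-side guard, not part of oneTBB: throws when the
// NUMA node list looks like the placeholder returned without a topology.
// Assumes the placeholder is a single element equal to -1.
void require_numa_topology(const std::vector<int>& numa_nodes) {
    const bool placeholder =
        numa_nodes.size() == 1 && numa_nodes.front() == -1;
    if (numa_nodes.empty() || placeholder) {
        throw std::runtime_error(
            "NUMA topology unavailable: no tbbbind variant could be loaded");
    }
}
```

Calling the guard once at startup with the result of `oneapi::tbb::info::numa_nodes()` turns the silent fallback into an explicit failure, at the cost of stopping applications that could otherwise run without NUMA awareness.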