Commit 4b12312

Move task_arena_waiting/task_group_interop to supported RFCs (#1864)
1 parent 3df32aa commit 4b12312

3 files changed: +97 -25 lines changed

rfcs/proposed/task_arena_waiting/readme.md

Lines changed: 2 additions & 18 deletions
@@ -113,23 +113,7 @@ ta.enqueue([]{ foo(); }, tg); // corresponds to: ta.enqueue(tg.defer([]{ foo();
 ta.wait_for(tg); // corresponds to: ta.execute([&tg]{ tg.wait(); });
 ```
 
-The example code to split work across NUMA-bound task arenas could then look like this (assuming also
-a special function that creates and initializes a vector of arenas):
-```cpp
-std::vector<tbb::task_arena> numa_arenas =
-    initialize_constrained_arenas(/*some arguments*/);
-std::vector<tbb::task_group> task_groups(numa_arenas.size());
-
-for(unsigned j = 0; j < numa_arenas.size(); j++) {
-    numa_arenas[j].enqueue( [](){/*some parallel stuff*/}, task_groups[j] );
-}
-
-for(unsigned j = 0; j < numa_arenas.size(); j++) {
-    numa_arenas[j].wait_for( task_groups[j] );
-}
-```
-
-See [Improve interoperability with task groups](task_group_interop.md) for more details.
+This part of the proposal [is supported](../../supported/task_arena_waiting/readme.md) since oneTBB 2022.3.
 
 ### 2. Reconsider waiting for all tasks
 
@@ -182,6 +166,6 @@ ta.block_with_progress_delegation([]{ std::this_thread::sleep_for(100ms); });
 
 ## Open Questions
 
+- Implementation feasibility for (2) and (3) needs to be explored
 - API names and semantic details need to be further elaborated
 - Whether/how work isolation is supported needs to be decided
-- Implementation feasibility for (2) and (3) needs to be explored

rfcs/supported/task_arena_waiting/readme.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
# Waiting in a task_arena

For more details on waiting for work in a task arena, see
[the corresponding RFC proposal](../../proposed/task_arena_waiting/readme.md).
This document covers parts that have been implemented in oneTBB.

## Motivation

Task arenas in oneTBB are the places for threads to share and execute tasks.
A `task_arena` instance represents a configurable execution context for parallel work.

There are two primary ways to submit work to an arena: the `execute` and `enqueue` functions.
Both take a callable object and run it in the context of the arena. The callable object
might start more parallel work in the arena by invoking a oneTBB algorithm, running a flow graph,
or submitting work into a task group.
`execute` is a blocking call: the calling thread does not return until the callable object
completes. `enqueue` is a “fire-and-forget” call: the calling thread submits the callable
object as a task and returns immediately, providing no way to synchronize with the completion
of the task.

Therefore, there was no convenient way to submit work for asynchronous execution **and** later wait
for completion of that work.

### Earlier solution: combining with a task group

In oneTBB, asynchronous execution is supported by `task_group` and the flow graph API; both allow
submitting a job and waiting for its completion later.
Notably, both require calling `wait`/`wait_for_all` to ensure that
the work will be done. The `task_arena::enqueue`, on the other hand, being "fire-and-forget", enforces
availability of another thread in the arena to execute the task (so-called *mandatory concurrency*).

So, a reasonable solution for the described use cases seems to be combining a `task_arena` with a `task_group`.
However, it was notoriously non-trivial to do right. For example, the following "naive" attempt is subtly
incorrect:
```cpp
tbb::task_arena ta{/*args*/};
tbb::task_group tg;
ta.enqueue([&tg]{ tg.run([]{ foo(); }); });
bar();
ta.execute([&tg]{ tg.wait(); });
```
The problem is that `enqueue` submits a task that calls `tg.run` to add `[]{ foo(); }` to the task group,
but it is unknown if that task was actually executed prior to `tg.wait`. Simply put,
the task group might still be empty, in which case `tg.wait` exits prematurely.

To avoid that, `execute` can be used instead of `enqueue`, but then the mentioned
thread availability guarantee is lost. The approach with `execute` is shown in the
[oneTBB Developer Guide](https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Guiding_Task_Scheduler_Execution.html)
as an example to split the work across several NUMA domains. The example utilizes the fork-join
synchronization pattern to ensure that the work is complete
in all the arenas. It also illustrates that the problem stated in this proposal is relevant.

A better way of using these classes together, however, is the following:
```cpp
tbb::task_arena ta{/*args*/};
tbb::task_group tg;
ta.enqueue(tg.defer([]{ foo(); }));
bar();
ta.execute([&tg]{ tg.wait(); });
```
In this case, the task group "registers" a deferred task to run `foo()`, which is then enqueued
to the task arena. The task is added by the calling thread, so we can be sure that `tg.wait` will not
return until the task completes.

## Implemented improvements

To address the extra complexity and verbosity of using `task_arena` and `task_group` together, the `enqueue` method
of `task_arena` is overloaded to take `task_group` as the second argument, and a new method is added to wait
for a task group:
```cpp
ta.enqueue([]{ foo(); }, tg); // corresponds to: ta.enqueue(tg.defer([]{ foo(); }));
ta.wait_for(tg);              // corresponds to: ta.execute([&tg]{ tg.wait(); });
```

This API has been implemented since oneTBB 2022.3.
See [Improve interoperability with task groups](task_group_interop.md) for more details.

The example code to split work across NUMA-bound task arenas can now look like this (assuming also
a special function that creates and initializes a vector of arenas):
```cpp
std::vector<tbb::task_arena> numa_arenas =
    initialize_constrained_arenas(/*some arguments*/);
std::vector<tbb::task_group> task_groups(numa_arenas.size());

for(unsigned j = 0; j < numa_arenas.size(); j++) {
    numa_arenas[j].enqueue( [](){/*some parallel stuff*/}, task_groups[j] );
}

for(unsigned j = 0; j < numa_arenas.size(); j++) {
    numa_arenas[j].wait_for( task_groups[j] );
}
```

rfcs/proposed/task_arena_waiting/task_group_interop.md renamed to rfcs/supported/task_arena_waiting/task_group_interop.md

Lines changed: 3 additions & 7 deletions
@@ -30,7 +30,7 @@ namespace oneapi::tbb {
 } // namespace oneapi::tbb
 ```
 
-## Design discussion
+## Design
 
 ### Enqueue a function as a part of a task group
 
@@ -71,6 +71,8 @@ class (see `oneapi/tbb/task_group.h`) and directly call the `execute` library en
 There is no need to have a similar function in the `this_task_arena` namespace, as it would be
 no different from calling `tg.wait()`.
 
+## Future Questions
+
 ### Should `execute` be extended as well?
 
 Another method, `task_arena::execute`, appears similar to `enqueue` in the sense that it also takes a callable
@@ -115,9 +117,3 @@ We can consider the following options for providing isolation in `task_arena::wa
 - keep the `isolated_task_group` class and support it in the proposed `task_arena` extensions;
 - somehow extend the `task_group` class to optionally support work isolation (might require incompatible changes);
 - add an isolation tag (automatically or on demand) only when a `task_group` is used with `task_arena`.
-
-## Open Questions
-
-- Is there any value in implementing this proposal first as experimental/preview API?
-- Should a new overload for `execute` be added, that takes a task group argument?
-- Whether/how work isolation is supported needs to be decided
