Difference between section and task openmp
【Posted】: 2012-11-27 03:21:02
【Question】: What is the difference in OpenMP between:
#pragma omp parallel sections
{
    #pragma omp section
    fct1();
    #pragma omp section
    fct2();
}
and:
#pragma omp parallel
{
    #pragma omp single
    {
        #pragma omp task
        fct1();
        #pragma omp task
        fct2();
    }
}
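I am not sure that the second code is correct...
【Comments on the question】:
Apart from the missing ; at the end of both statements, the second code is correct.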
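【Answer 1】:
The difference between tasks and sections is in the time frame in which the code executes. Sections are enclosed within the sections construct and (unless the nowait clause was specified) threads will not leave it until all sections have been executed: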
                 [    sections    ]
Thread 0: -------< section 1 >---->*------
Thread 1: -------< section 2      >*------
Thread 2: ------------------------>*------
...                                *
Thread N-1: ---------------------->*------
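Here N threads encounter a sections construct with two sections, the second taking more time than the first. The first two threads execute one section each. The other N-2 threads simply wait at the implicit barrier at the end of the sections construct (shown here as *).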
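To make the barrier behaviour concrete, here is a minimal sketch in C. The names section_one, section_two and run_sections are hypothetical stand-ins, not part of the original question. It only illustrates that both results are safe to read once the sections construct has been left.

```c
/* Hypothetical stand-ins for two independent pieces of work. */
static int section_one(void) { return 1; }
static int section_two(void) { return 2; }

/* Runs both sections and returns the sum of their results. */
int run_sections(void)
{
    int a = 0, b = 0;   /* declared outside the region, hence shared */

    #pragma omp parallel sections
    {
        #pragma omp section
        a = section_one();

        #pragma omp section
        b = section_two();
    }   /* implicit barrier: no thread leaves before both sections are done */

    return a + b;       /* safe: both sections have finished here */
}
```

Built with an OpenMP-enabled compiler (for example gcc -fopenmp), the two sections may run on different threads. Without OpenMP the pragmas are ignored and the code simply runs serially with the same result.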
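Tasks are queued and executed whenever possible at the so-called task scheduling points. Under some conditions the runtime may be allowed to move tasks between threads, even in the middle of their lifetime. Such tasks are called untied: an untied task might start executing in one thread and then, at some scheduling point, be migrated by the runtime to another thread.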
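As a rough illustration of an untied task, consider the sketch below. The function name run_untied_task and the summation are assumptions made up for this example. The taskyield directive is used only to place an explicit scheduling point inside the task. Whether the runtime actually migrates the task at that point is entirely up to the implementation.

```c
/* Sums 1..100 inside an untied task and returns the result. */
int run_untied_task(void)
{
    int result = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* untied: the task may resume on another thread after a
           scheduling point; its private data travels with it. */
        #pragma omp task untied shared(result)
        {
            int partial = 0;
            for (int i = 1; i <= 100; ++i)
                partial += i;

            #pragma omp taskyield   /* explicit task scheduling point */

            result = partial;
        }
    }   /* the barrier here guarantees that the task has completed */

    return result;
}
```

The value returned does not depend on which thread finishes the task, which is the point: code inside an untied task should not rely on thread identity.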
Still, tasks and sections are similar in many ways. For example, the following two code fragments achieve essentially the same result:
// sections
...
#pragma omp sections
{
    #pragma omp section
    foo();
    #pragma omp section
    bar();
}
...
// tasks
...
#pragma omp single nowait
{
    #pragma omp task
    foo();
    #pragma omp task
    bar();
}
#pragma omp taskwait
...
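taskwait works much like barrier, but for tasks: it makes the current execution flow pause until the queued tasks have been executed (strictly speaking, the OpenMP specification only requires it to wait for the child tasks of the current task). It is a scheduling point, i.e. it allows threads to process tasks. The single construct is needed so that the tasks are created by one thread only. Without the single construct, each task would be created num_threads times, which is probably not what one wants. The nowait clause on the single construct tells the other threads not to wait until the single construct has been executed (i.e. it removes the implicit barrier at the end of the single construct). So they hit the taskwait immediately and start processing tasks.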
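The effect of omitting single can be checked with a small counting experiment. The sketch below is only an illustration. The helper team_size and both function names are hypothetical, and the omp.h include is guarded so that the file still builds when OpenMP is switched off.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Size of the current team; 1 when built without OpenMP. */
static int team_size(void)
{
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

/* No single: every thread of the team encounters the task directive. */
int tasks_created_without_single(int *threads)
{
    int created = 0;

    #pragma omp parallel shared(created)
    {
        #pragma omp master
        *threads = team_size();     /* record the team size once */

        #pragma omp task shared(created)
        {
            #pragma omp atomic
            created++;
        }
    }   /* all tasks have completed after the closing barrier */

    return created;
}

/* With single: only one thread encounters the task directive. */
int tasks_created_with_single(void)
{
    int created = 0;

    #pragma omp parallel shared(created)
    {
        #pragma omp single
        {
            #pragma omp task shared(created)
            {
                #pragma omp atomic
                created++;
            }
        }
    }

    return created;
}
```

With a team of, say, four threads the first function should report four executed tasks and the second exactly one, which matches the num_threads remark above.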
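taskwait is an explicit scheduling point, used here for clarity. There are also implicit scheduling points, most notably inside barrier synchronisation, whether explicit or implicit. Therefore the code above could also be written simply as: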
// tasks
...
#pragma omp single
{
    #pragma omp task
    foo();
    #pragma omp task
    bar();
}
...
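For readers who want something they can compile, a self-contained version of this fragment might look as follows. The bodies of foo and bar and the wrapper run_tasks are assumptions made up for the example. The original only names foo() and bar() without saying what they do.

```c
/* Hypothetical task bodies; the original leaves foo and bar undefined. */
static int foo(void) { return 10; }
static int bar(void) { return 32; }

/* Creates two tasks from a single thread and returns the combined result. */
int run_tasks(void)
{
    int a = 0, b = 0;

    #pragma omp parallel shared(a, b)
    {
        #pragma omp single
        {
            #pragma omp task shared(a)
            a = foo();

            #pragma omp task shared(b)
            b = bar();
        }   /* implicit barrier of single: tasks get processed here */
    }

    return a + b;   /* both tasks are guaranteed to be finished */
}
```

No explicit taskwait is needed in this sketch, because the implicit barriers at the end of single and of the parallel region already ensure that both tasks have completed before a and b are read.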
Here is one possible scenario of what might happen if there are three threads:
               +--+-->[ task queue ]--+
               |  |                   |
               |  |       +-----------+
               |  |       |
Thread 0: --< single >-|  v  |-----
Thread 1: -------->|< foo() >|-----
Thread 2: -------->|< bar() >|-----
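Shown within the | ... | is the action of the scheduling point (either the taskwait directive or the implicit barrier). Basically, threads 1 and 2 suspend what they are doing at that point and start processing tasks from the queue. Once all tasks have been processed, the threads resume their normal execution flow. Note that threads 1 and 2 might reach the scheduling point before thread 0 has exited the single construct, so the left |s need not be aligned (this is represented in the diagram above).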
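It is also possible that thread 1 finishes processing the foo() task and requests another one before the other threads are able to request tasks. So both foo() and bar() might get executed by the same thread: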
               +--+-->[ task queue ]--+
               |  |                   |
               |  |      +------------+
               |  |      |
Thread 0: --< single >-| v             |---
Thread 1: --------->|< foo() >< bar() >|---
Thread 2: --------------------->|      |---
It is also possible that the singled-out thread executes the second task itself if thread 2 arrives too late:
               +--+-->[ task queue ]--+
               |  |                   |
               |  |      +------------+
               |  |      |
Thread 0: --< single >-| v   < bar() >|---
Thread 1: --------->|< foo() >        |---
Thread 2: ----------------->|         |---
In some cases the compiler or the OpenMP runtime might even bypass the task queue completely and execute the tasks serially:
Thread 0: --< single: foo(); bar() >*---
Thread 1: ------------------------->*---
Thread 2: ------------------------->*---
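If no task scheduling points are present inside the region's code, the OpenMP runtime might start the tasks whenever it deems appropriate. For example, all tasks might be deferred until the barrier at the end of the parallel region is reached.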
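【Comments】:
+1. @Arkerone yes, it is a good explanation, you should also give an upvote :)
Is there much difference between using 3 consecutive singles and using sections?
@HristoIliev Do you have a source for a task being created num_threads times when the task pragma is not inside a single pragma? I don't see anything that suggests this in IBM's OpenMP documentation.
@Chris, OpenMP 3.1 specification §2.7.1: "When a thread encounters a task construct, a task is generated from the code for the associated structured block." Unless there is a single/master or a worksharing construct, or suitable conditionals are in place, each thread executes exactly the same code, and hence all threads encounter the task directive.
@JoeC, sections is a worksharing construct, which means that all threads in the team associated with the given parallel region must encounter it in order for the construct to succeed. If it is undesirable for idle threads to wait at the implicit barrier, apply the nowait clause, which removes the implicit barrier.
【Answer 2】: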
I am no expert in OpenMP, but I tried to test the fib sequence on my machine using both task and sections.
sections
#include <stdio.h>
#include <omp.h>

int fib(int n)
{
    int i, j;
    if (n < 2)
        return n;
    else
    {
        /* A new (nested) parallel region is opened on every recursive call. */
        #pragma omp parallel sections
        {
            #pragma omp section
            i = fib(n - 1);
            #pragma omp section
            j = fib(n - 2);
        }
        printf("Current int %d is on thread %d \n", i + j, omp_get_thread_num());
        return i + j;
    }
}

int main()
{
    int n = 10;
    #pragma omp parallel shared(n)
    {
        #pragma omp single
        {
            printf("%d\n", omp_get_num_threads());
            printf("fib(%d) = %d\n", n, fib(n));
        }
    }
}
tasks
#include <stdio.h>
#include <omp.h>

int fib(int n)
{
    int i, j;
    if (n < 2)
        return n;
    else
    {
        #pragma omp task shared(i) firstprivate(n)
        i = fib(n - 1);
        #pragma omp task shared(j) firstprivate(n)
        j = fib(n - 2);
        /* Wait for both child tasks before reading i and j. */
        #pragma omp taskwait
        printf("Current int %d is on thread %d \n", i + j, omp_get_thread_num());
        return i + j;
    }
}

int main()
{
    int n = 10;
    #pragma omp parallel shared(n)
    {
        #pragma omp single
        {
            printf("%d\n", omp_get_num_threads());
            printf("fib(%d) = %d\n", n, fib(n));
        }
    }
}
Result for sections:
12
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 8 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 13 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 8 is on thread 0
Current int 21 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 8 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 13 is on thread 0
Current int 34 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 8 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 13 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 5 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 1 is on thread 0
Current int 3 is on thread 0
Current int 8 is on thread 0
Current int 21 is on thread 0
Current int 55 is on thread 4
fib(10) = 55
Result for tasks:
12
Current int 1 is on thread 3
Current int 2 is on thread 3
Current int 1 is on thread 8
Current int 2 is on thread 8
Current int 1 is on thread 8
Current int 1 is on thread 4
Current int 1 is on thread 11
Current int 1 is on thread 11
Current int 2 is on thread 11
Current int 3 is on thread 11
Current int 1 is on thread 11
Current int 2 is on thread 11
Current int 1 is on thread 11
Current int 1 is on thread 11
Current int 2 is on thread 11
Current int 3 is on thread 11
Current int 1 is on thread 11
Current int 2 is on thread 11
Current int 1 is on thread 11
Current int 1 is on thread 11
Current int 2 is on thread 11
Current int 3 is on thread 11
Current int 5 is on thread 11
Current int 8 is on thread 11
Current int 1 is on thread 8
Current int 2 is on thread 8
Current int 3 is on thread 8
Current int 5 is on thread 8
Current int 13 is on thread 8
Current int 1 is on thread 7
Current int 2 is on thread 7
Current int 1 is on thread 7
Current int 1 is on thread 7
Current int 1 is on thread 0
Current int 1 is on thread 0
Current int 2 is on thread 0
Current int 3 is on thread 0
Current int 1 is on thread 1
Current int 1 is on thread 6
Current int 2 is on thread 6
Current int 1 is on thread 9
Current int 2 is on thread 9
Current int 1 is on thread 2
Current int 2 is on thread 7
Current int 3 is on thread 7
Current int 5 is on thread 7
Current int 2 is on thread 5
Current int 5 is on thread 5
Current int 1 is on thread 5
Current int 2 is on thread 5
Current int 1 is on thread 5
Current int 1 is on thread 5
Current int 2 is on thread 5
Current int 3 is on thread 5
Current int 1 is on thread 5
Current int 2 is on thread 5
Current int 1 is on thread 5
Current int 1 is on thread 5
Current int 2 is on thread 5
Current int 3 is on thread 5
Current int 5 is on thread 5
Current int 1 is on thread 5
Current int 2 is on thread 5
Current int 1 is on thread 11
Current int 2 is on thread 11
Current int 1 is on thread 8
Current int 2 is on thread 8
Current int 5 is on thread 8
Current int 3 is on thread 1
Current int 8 is on thread 1
Current int 21 is on thread 1
Current int 1 is on thread 10
Current int 3 is on thread 10
Current int 8 is on thread 0
Current int 1 is on thread 4
Current int 3 is on thread 4
Current int 1 is on thread 9
Current int 3 is on thread 9
Current int 8 is on thread 9
Current int 3 is on thread 2
Current int 5 is on thread 3
Current int 13 is on thread 3
Current int 5 is on thread 6
Current int 13 is on thread 7
Current int 8 is on thread 10
Current int 21 is on thread 10
Current int 34 is on thread 3
Current int 55 is on thread 1
fib(10) = 55
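It seems that tasks are much wiser than sections when it comes to distributing computation resources.
------------------ EDIT ------------------
For people looking for answers to this question, please see the comments under this post.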
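【Comments】:
The two code examples are not equivalent. The one with sections uses nested parallelism, i.e. it creates a new parallel region on each recursive call. Nested parallelism is disabled by default, so everything except the top recursion level runs with teams of one thread, which is why you see so many thread IDs equal to 0. Even if nested parallelism were enabled, you might end up with thousands of threads, which would be really inefficient.
@Hristo Iliev So can we compute Fibonacci using sections? I mean, enable parallelism while using sections.
Only to a very limited extent. Sections are not meant for solving recursive problems. They are intended for the case of independent blocks in the linear execution of your program.
@Hristo Iliev Got it.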