OpenACC: Computing Pi (Simple Version)

Posted by cuancuancuanhao


● A simple program from the book that computes pi; its main point is calling a user-defined function.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <openacc.h>

#define N   100

#pragma acc routine seq
float ff(const float x)
{
    return 4.0f / (1.0f + x * x);       // integrand: its integral over [0, 1] equals pi
}

int main()
{
    const float h = 1.0f / N;           // width of one subinterval
    float sumf = 0, result;

#pragma acc parallel loop reduction(+:sumf)
    for (int i = 0; i < N; i++)
        sumf += ff(h * (i - 0.5f));     // samples x = h * (i - 0.5); the midpoint of interval i would be h * (i + 0.5)

    result = h * sumf;
    printf("\nN = %d, myPi = %f, diff = %e\n", N, result, result / 3.141592653589793238 - 1);
    //getchar();
    return 0;
}
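The loop above approximates pi as the integral of 4 / (1 + x^2) over [0, 1], using a Riemann sum with step h = 1/N. The following is a minimal host-side sketch in plain C (double precision, no OpenACC directives) that compares the sample points used above, x = h * (i - 0.5), with the conventional midpoints x = h * (i + 0.5). The names `f` and `quad_pi` are made up for this illustration and do not come from the book.

```c
#include <assert.h>

/* Integrand: the integral of 4 / (1 + x^2) over [0, 1] equals pi. */
double f(double x)
{
    return 4.0 / (1.0 + x * x);
}

/* Hypothetical helper: returns h * sum of f(h * (i + offset)) for i = 0 .. n-1,
   with h = 1/n. offset = 0.5 gives the midpoint rule; offset = -0.5 reproduces
   the sample points of the program above. */
double quad_pi(int n, double offset)
{
    assert(n > 0);                      /* guard against division by zero */
    const double h = 1.0 / n;
    double sum = 0.0;

    for (int i = 0; i < n; i++)
        sum += f(h * (i + offset));

    return h * sum;
}
```

Calling `quad_pi(100, -0.5)` gives roughly 3.1615, in line with the myPi value in the output, while `quad_pi(100, 0.5)` lands within about 1e-5 of pi. The shifted version adds a sample at x = -h/2 (where the integrand is close to 4) and drops the one at x = 1 - h/2 (where it is close to 2), so the result grows by about 2h = 0.02.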

● Output (the numbers in the compiler and profiler messages are line numbers in main.c)

D:\Code\OpenACC\OpenACCProject\OpenACCProject>pgcc main.c -acc -Minfo -o main_acc.exe
ff:
     10, Generating acc routine seq
         Generating Tesla code
     11, FMA (fused multiply-add) instruction(s) generated
main:
     19, Accelerator kernel generated
         Generating Tesla code
         20, #pragma acc loop gang, vector(100) /* blockIdx.x threadIdx.x */
             Generating reduction(+:sumf)
     19, Generating implicit copy(sumf)

D:\Code\OpenACC\OpenACCProject\OpenACCProject>main_acc.exe
launch CUDA kernel  file=D:\Code\OpenACC\OpenACCProject\OpenACCProject\main.c function=main line=19 device=0 threadid=1 num_gangs=1 num_workers=1 vector_length=100 grid=1 block=100 shared memory=1024
launch CUDA kernel  file=D:\Code\OpenACC\OpenACCProject\OpenACCProject\main.c function=main line=19 device=0 threadid=1 num_gangs=1 num_workers=1 vector_length=256 grid=1 block=256 shared memory=1024

N = 100, myPi = 3.161500, diff = 6.336546e-03
PGI: "acc_shutdown" not detected, performance results might be incomplete.
 Please add the call "acc_shutdown(acc_device_nvidia)" to the end of your application to ensure that the performance results are complete.

Accelerator Kernel Timing data
D:\Code\OpenACC\OpenACCProject\OpenACCProject\main.c
  main  NVIDIA  devicenum=0
    time(us): 11
    19: compute region reached 1 time
        19: kernel launched 1 time
            grid: [1]  block: [100]
            elapsed time(us): total=1000 max=1000 min=1000 avg=1000
        19: reduction kernel launched 1 time
            grid: [1]  block: [256]
             device time(us): total=0 max=0 min=0 avg=0
    19: data region reached 2 times
        19: data copyin transfers: 1
             device time(us): total=4 max=4 min=4 avg=4
        23: data copyout transfers: 1
             device time(us): total=7 max=7 min=7 avg=7
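The PGI runtime message in the output asks for a call to `acc_shutdown(acc_device_nvidia)` at the end of the program so that the profiling data is complete. A minimal sketch of one way to do that follows. The wrapper name `shutdown_device` is hypothetical, and the `_OPENACC` guard is an addition so that the file still builds when OpenACC is not enabled.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: shuts down the NVIDIA device so the runtime can
   flush its profiling data. Returns 1 if acc_shutdown was called and 0 if
   the file was compiled without OpenACC support. */
int shutdown_device(void)
{
#ifdef _OPENACC
    acc_shutdown(acc_device_nvidia);
    return 1;
#else
    return 0;
#endif
}
```

In the program above, the call would go just before `return 0;` in `main`, after the last use of the accelerator.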

 
