MPI_GATHERV (Fortran) to create a new 2D matrix from 2D sub-matrices

Posted: 2021-10-04 11:11:17

Question:

I am trying to gather 2D sub-arrays, which have different numbers of rows but the same number of columns, into one global 2D array. For example, suppose that with 2 MPI processes the first process (i.e. rank == 0) has:

local = [11,12,13,14]

and the second process (i.e. rank == 1) has:

local = [21,22,23,24
         31,32,33,34]

I would then like to concatenate these two arrays into a single 2D array:

global = [11,12,13,14
          21,22,23,24
          31,32,33,34]

Since each "local" array has a different number of rows, I (probably) want to use mpi_gatherv (or mpi_allgatherv). I found the same kind of question here: Using Gatherv for 2d Arrays in Fortran and Using MPI_gatherv to create a new matrix from other smaller matrices, but I still don't quite understand. So, please teach me. Here is my sample code:

program main
use mpi
implicit none
   
integer :: i, j
integer :: rank, npro, ierr
integer, allocatable :: local(:,:)
integer, allocatable :: global(:,:), displs(:), counts(:)
integer :: loc_size(2), glob_size(2), starts(2) 
integer :: newtype, int_size, resizedtype
integer(kind=MPI_ADDRESS_KIND) :: extent, begin
! End of local variables ==================================================!

call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, npro, ierr)

! I will set local 2D arrays as: [1,4] for rank #0, and [2,4] for rank #1
! then, the global 2D array will be [3,4] (assuming I use 2 processes)
loc_size  = [rank+1,4]      ! [1,4], [2,4]
glob_size = [3,4]           ! I will use npro = 2

! allocate local and global arrays
allocate(local(loc_size(1),   loc_size(2))) ! [1,4], [2,4]
allocate(global(glob_size(1), glob_size(2)))! [3,4] ! if npro = 2

! set "local" array
!       rank = 0: [11, 12, 13, 14]
!       rank = 1: [21, 22, 23, 24
!                  31, 32, 33, 34]
if(rank == 0) then
   do j=1,4
      local(1,j) = 10 + j ! [11,12,13,14]
   end do
else if(rank == 1) then
   do i=1,2
      do j=1,4
         local(i,j) = (i+1)*10 + j ! [21,22,23,24; 31,32,33,34]
      end do
   end do
end if


! create a 2D subarray and set as "newtype" 
starts    = [0,0]              ! array start location
call MPI_Type_create_subarray(2, glob_size, loc_size, starts, &
&                             MPI_ORDER_FORTRAN, MPI_INTEGER, &
&                             newtype, ierr)

! get MPI_INTEGER type size in byte
! I don't quite understand the following processes...
! So, please comment on each step if possible...
call MPI_Type_size(MPI_INTEGER, int_size, ierr)
begin  =  0
extent =  (rank+1) * int_size ! rank 0 = 4 bytes; rank 1 = 8 bytes (am I doing this correctly?)
call MPI_Type_create_resized(newtype, begin, extent, resizedtype, ierr) ! I don't quite understand this step
call MPI_Type_commit(resizedtype, ierr)

! allocate index for mpi_gatherv
allocate(displs(npro)) ! [2], index for mpi_gatherv
allocate(counts(npro)) ! [2], index for mpi_gatherv

counts = [1,1]
do i =  1,npro
   displs(i) = (i-1) ! [0,1]
end do

call MPI_Gatherv(local, 1, MPI_INTEGER,               &
&                global, counts, displs, resizedtype, &
&                0, MPI_COMM_WORLD, ierr)

if(rank == 0) then
   do i=1,3
      write(*,*) (global(i,j), j=1,4)
   end do
end if

call MPI_Finalize(ierr)
end program main

Thank you in advance.

Comments:

If I understand your picture of global correctly, and you are using column-major storage, then you do indeed need Type_create_resized, because the contributions from the processes will be interleaved. But you do not need the subarray type. Why not try this: create a contiguous type and shrink its extent down to the column size.

Your code is really hard to debug because your matrix is square, so it is hard to tell whether the row and column sizes have been swapped somewhere. It would be much easier to debug if you made the second (fixed) dimension 4 rather than 3, and initialised the data to distinct values - at the moment it is constant on each rank, so there is no way to see which data flows where.

@David Henty - Thank you for the comments. I have changed the code as you suggested.

@Victor Eijkhout - Could you elaborate on that with some code?

@Jungwoo Sorry, I don't have time to write the program for you. 1. You are sending a contiguous buffer, so your send buffer is fine. 2. If you write out where the send buffers end up in the receive buffer, you will see a bunch of interleaved strided blocks. So I think you need Type_vector, where you resize the extent down to the size of just one block. Your subarray type is also correct - since the start point is all zeros it is equivalent to a vector type, just harder to read. Just curious: what is the context that makes you do this?

Answer 1:

I think this would be a lot easier if you changed the storage order (i.e. had rank "i" initialise "i+1" columns of fixed length), but the following code seems to work for what you currently have. I have turned on debug output, changed the number of columns to 4, run on 3 processes (so the number of global rows = 1+2+3 = 6), and made sure the local arrays are initialised with unique data.

The important point is that you need a different send type on each rank because the stride is different (the local arrays have different leading dimensions). On the receive side a single type works for every rank: one row of the global array, with its extent resized to a single integer so that the displacements can be given in units of global rows. There may be a simpler way to do this (without changing the storage order), but at least this seems to work.

I have also updated the comments in the code below so that they describe what it actually does.

program main
use mpi
implicit none
   
integer :: i, j
integer :: rank, npro, ierr
integer, allocatable :: local(:,:)
integer, allocatable :: global(:,:), displs(:), counts(:)
integer :: loc_size(2), glob_size(2), starts(2) 
integer :: int_size, stype, resizedstype, rtype, resizedrtype
integer(kind=MPI_ADDRESS_KIND) :: extent, begin
! End of local variables ==================================================!

call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, npro, ierr)

! Local 2D arrays are [rank+1,4]: [1,4] on rank #0, [2,4] on rank #1, [3,4] on rank #2,
! so the global 2D array is [6,4] (assuming 3 processes: 1+2+3 = 6 rows)
loc_size  = [rank+1,4]      ! [1,4], [2,4], [3,4]
glob_size = [6,4]           ! for npro = 3

! allocate local and global arrays
allocate(local(loc_size(1),   loc_size(2))) ! [1,4], [2,4], [3,4]
allocate(global(glob_size(1), glob_size(2)))! [6,4] for npro = 3

! set "local" array to unique values, row by row:
!       rank 0 gets 1..4, rank 1 gets 11..18, rank 2 gets 21..32
do i=1,rank+1
   do j=1,4
      local(i,j) = 10*rank+4*(i-1)+j
   end do
end do

! check the local array 
 do i=1,rank+1
    write(*,*) 'rank = ', rank, 'local = ', (local(i,j), j=1,4)
 end do

! create a subarray type stype selecting one row of the local array:
! full size [rank+1,4], subarray size [1,4], starting at [0,0]
loc_size  = [1,4]
starts    = [0,0]
glob_size = [rank+1,4]   ! note: glob_size is reused here for the *local* array shape
call MPI_Type_create_subarray(2, glob_size, loc_size, starts, &
&                             MPI_ORDER_FORTRAN, MPI_INTEGER, &
&                             stype, ierr)

! get the MPI_INTEGER type size in bytes
call MPI_Type_size(MPI_INTEGER, int_size, ierr)
begin  =  0
extent =  int_size

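! Shrink the extent of stype from the whole [rank+1,4] local array down to a
! single integer: with sendcount = rank+1 below, successive elements of the
! send buffer then start one integer (one local row, in column-major order) apart.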
call MPI_Type_create_resized(stype, begin, extent, resizedstype, ierr)
call MPI_Type_commit(resizedstype, ierr)

! create a subarray type rtype selecting one row of the global array:
! full size [6,4], subarray size [1,4], starting at [0,0]
loc_size  = [1,4]
starts    = [0,0]
glob_size = [6,4]
call MPI_Type_create_subarray(2, glob_size, loc_size, starts, &
&                             MPI_ORDER_FORTRAN, MPI_INTEGER, &
&                             rtype, ierr)

! get the MPI_INTEGER type size in bytes
call MPI_Type_size(MPI_INTEGER, int_size, ierr)
begin  =  0
extent =  int_size

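! Shrink the extent of rtype from the whole [6,4] global array down to a
! single integer: successive elements of the resized type then start one
! global row apart, so displacements can be given in units of rows.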
call MPI_Type_create_resized(rtype, begin, extent, resizedrtype, ierr)
call MPI_Type_commit(resizedrtype, ierr)

! allocate count and displacement arrays for mpi_gatherv
allocate(displs(npro)) ! [3] for npro = 3
allocate(counts(npro)) ! [3] for npro = 3

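! counts(i) = number of rows received from rank i-1; displs(i) = starting row
! offset in units of the resized extent (one integer = one global row):
! rank 0 -> global row 1, rank 1 -> rows 2-3, rank 2 -> rows 4-6 (0-based offsets)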
counts = [1,2,3]
displs = [0,1,3]
call MPI_Gatherv(local, rank+1, resizedstype,               &
&                global, counts, displs, resizedrtype, &
&                0, MPI_COMM_WORLD, ierr)

if(rank == 0) then
   do i=1,6
      write(*,*) (global(i,j), j=1,4)
   end do
end if

call MPI_Finalize(ierr)
end program main

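For reference, a typical way to build and run this would be something like the following (the wrapper compiler and launcher names, mpif90 and mpirun here, depend on your MPI installation):

mpif90 main.f90 -o gatherv_test
mpirun -np 3 ./gatherv_test
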
If I run on 3 processes I get sensible results:

 rank =            0 local =            1           2           3           4
 rank =            1 local =           11          12          13          14
 rank =            1 local =           15          16          17          18
 rank =            2 local =           21          22          23          24
 rank =            2 local =           25          26          27          28
 rank =            2 local =           29          30          31          32
           1           2           3           4
          11          12          13          14
          15          16          17          18
          21          22          23          24
          25          26          27          28
          29          30          31          32

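As Victor points out in the comments on the question, a subarray type whose start is all zeros is equivalent to a vector type. For completeness, a minimal sketch of that variant (a hypothetical drop-in replacement for the two MPI_Type_create_subarray calls above, keeping begin = 0 and extent = int_size) might look like:

! send type: one row of the [rank+1,4] local array = 4 integers with
! stride rank+1; shrink the extent to one integer exactly as before
call MPI_Type_vector(4, 1, rank+1, MPI_INTEGER, stype, ierr)

! receive type: one row of the [6,4] global array = 4 integers with stride 6
call MPI_Type_vector(4, 1, 6, MPI_INTEGER, rtype, ierr)

The resize and commit steps, the counts and displacements, and the MPI_Gatherv call itself would stay exactly as in the code above.
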
Comments:

Thank you very much for your answer. Now I see what my problem was. I will also try the approach Victor suggested in the comments.
