如何从 Fortran 模式匹配的行开始读取数据?
Posted
技术标签:
【中文标题】如何从 Fortran 模式匹配的行开始读取数据?【英文标题】:How to READ data starting from a pattern-matched line with Fortran? 【发布时间】:2020-11-02 10:00:30 【问题描述】:我有一个文件pos.xyz
,格式如下,其中i = 6
等代表帧索引。 (这里,第一帧有i = 6
。一般情况下,第一帧的索引可以是i = 0
,i = 1
,或者i = 2
,...)
我想实现一个函数:对于 任意两个给定整数 a
和 b
,(a<b
,例如 7 和 9),请阅读
将帧索引 7 到 9 的数据放入一个数组中。你能给我一个关于如何实现这个想法的建议吗?
4
i = 6, time = 3.000, E = -205.1846561900
O 2.6028572470 4.1666579520 12.7865910725
O 6.5415232423 8.8963227363 17.7533721708
O 15.6020396800 11.9022808314 15.2930838049
O 11.2843786793 13.2653367176 13.8186352548
4
i = 7, time = 3.500, E = -205.1845561905
O 5.1072569275 11.9945026418 4.1254340934
O 2.5299942732 11.4124710424 9.5495912455
O 14.8837181647 12.6571252157 7.8905997802
O 15.1684493877 10.7315923081 2.6631494700
4
i = 8, time = 4.000, E = -205.1846261900
O 2.6028572470 4.1666579520 12.7865910725
O 6.5415232423 8.8963227363 17.7533721708
O 15.6020396800 11.9922808314 15.2930838049
O 11.2843786793 13.2653367176 13.8186352548
4
i = 9, time = 4.500, E = -205.1846561805
O 5.1072569375 11.9945026418 4.1258340934
O 2.5299942732 11.4124710424 9.5495912455
O 14.8837181647 12.6570252157 7.8905997802
O 15.1684493877 10.7310923081 2.6630494700
4
i = 10, time = 5.000, E = -205.1846551805
O 5.1072569275 11.9945026418 4.1254340934
O 2.5299932732 11.4129710424 9.5495912455
O 14.8837181647 12.6571252157 7.8905997802
O 15.1684473877 10.7313923081 2.6631494700
我做了什么:对于以i = 0
作为第一帧的特殊案例。例如,如果我想从第 3 帧开始读取,我可以先跳过(m+2)*(3-1)
行,然后读取数据,m=4
。函数如下。
SUBROUTINE skip_lines(indx, i_input)
! Purpose:
! To skip lines when read data from the input
IMPLICIT NONE
INTEGER :: i
INTEGER,INTENT(IN) :: i_input,indx
do i=1,i_input
read(indx,*) !Neglect (nat+2)*(ns-1) lines
enddo
END SUBROUTINE skip_lines
但是对于一般情况,如果第一帧有一个非零帧数,这个想法是没有效率的。我希望找到更好的方法来实现它。
【问题讨论】:
有几种方法可以做到这一点(包括读入line buffer and testing),但像read(unit, '(A3,I6)') x, y; if (x=='i ='.and.(y>a.and.y<b) ...)
这样简单的方法可能就足够了。请研究这些方法,看看您是否需要我们的进一步帮助。
文件的格式是否像样本所暗示的那样固定和规则?在文件的第二行中找到i
的值是唯一(或主要)问题吗?
@HighPerformanceMark 是的,如示例所示,格式是常规的;是的,如果我能找到i
的位置,那么我认为可以使用BACKSPACE
两次读取i = a
所在的块。
所以您阅读并忽略第一行,然后按照@francescalus 的评论建议阅读第二行,正如他们所说,鲍勃是您母亲的兄弟。
它读取一行然后测试它是否看起来像一行" i = a"
(如果我们认为条件为y>=a
)。如果您的行确实看起来像那样,那么将接下来的几行作为匹配框架读取;如果没有,请转到下一行并重新测试。就其本身而言,一条语句并不能解决您的问题,但没有一条语句可以解决您的问题。
【参考方案1】:
gfortran -Wall -fcheck=all parameter_shared.f95 atom_module.f95 traj.f95 sample.f95 test.f95 -o test.x
! test.f95
PROGRAM test
! Purpose: To read data starting from any block.
USE atom_module
IMPLICIT NONE
!==========
!parameters
!==========
INTEGER :: ns ! Get one sample from the trajectory every ns step.
INTEGER :: nmo_start
INTEGER :: nmo_end
INTEGER :: nat ! number of atoms
REAL(kind=4) :: delta_t0 ! For reading data
character(LEN=200) :: pos_filename
!===============
! Initialization
delta_t0 = 0.0005; ns = 2
nmo_start = 7; nmo_end = 10
nat = 4; pos_filename="pos.xyz"
!========================
! Sampling the trajectory
CALL sample(pos_filename,nmo_start,nmo_end,nat,ns)
END PROGRAM test
! sample.f95
SUBROUTINE sample(pos_filename,nmo_start,nmo_end,nat,ns)
USE parameter_shared
USE atom_module, ONLY: atom_info
USE traj
IMPLICIT NONE
!==========
!Parameters
!==========
character(LEN=*), INTENT(IN) :: pos_filename
INTEGER, INTENT(IN) :: nmo_start
INTEGER, INTENT(IN) :: nmo_end
INTEGER, INTENT(IN) :: nat ! number of atoms
INTEGER, INTENT(IN) :: ns ! Get one sample from the trajectory every ns step.
!Local varables
INTEGER :: n_samples !n_samples = INT(nmo/ns)
INTEGER :: iatom,imovie,i
!Initialization
iatom = 0; imovie =0; i =0
! Obatin n_samples
n_samples = sampling_number(nmo_start,nmo_end,ns)
allocate(sampled_movie(n_samples))
allocate(sampled_time(n_samples))
allocate(sampled_energy(n_samples))
!=======================
!read in trajectory file
!=======================
open(10,file=trim(pos_filename))
CALL read_traj(10,nmo_start,nmo_end,ns,nat,n_samples)
close(10)
write(6,*) 'End of trajectory reading.'
!=============
!write in file
!=============
sampled_pos_filename = 'pos_sampled.xyz'
open(10,file=sampled_pos_filename)
do i =1,n_samples
write (10,'(I8)') nat
WRITE(10,100) 'i =',i-1,', time =',sampled_time(i),', E =',sampled_energy(i)
100 FORMAT (1X,A3,I10,A8,F10.3,A5,F20.10)
DO iatom = 1, nat
WRITE(10,*) TRIM(atom_info(iatom, i)%atom_name), &
atom_info(iatom,i)%coord(1), &
atom_info(iatom,i)%coord(2), &
atom_info(iatom,i)%coord(3)
ENDDO
enddo
write(6,*)'Sampled trajectory is written in: ', sampled_pos_filename
close(10)
deallocate(sampled_movie, sampled_time,sampled_energy)
END SUBROUTINE sample
MODULE traj
IMPLICIT NONE
CONTAINS
INTEGER FUNCTION sampling_number(nmo_start,nmo_end,ns)
!To calculate the total numbers of samples one want to include
INTEGER,INTENT(IN) :: ns ! Get one sample from the trajectory every ns step.
INTEGER,INTENT(IN) :: nmo_start, nmo_end
write(*,*) 'In function sampling_number: nmo_end = ', nmo_end
positive: IF (nmo_end <0 .OR. nmo_start < 0 .OR. ns <0) THEN
write(*,*) 'Please enter non-negative values for the ns, starting step and ending step.'
ELSE IF (nmo_end < nmo_start) THEN
write(*,*) 'Please note that starting step shoud not larger than ending step.'
ELSE IF (ns ==0) THEN
sampling_number = nmo_end-(nmo_start-1)
ELSE IF (nmo_end-(nmo_start-1) <= ns) THEN
sampling_number = INT((nmo_end-(nmo_start-1))/ns + 1)
ELSE IF (nmo_end-(nmo_start-1) > ns) THEN
sampling_number = INT((nmo_end-(nmo_start-1))/ns)
END IF positive
END FUNCTION sampling_number
SUBROUTINE read_traj(indx,nmo_start,nmo_end,ns,nat,n_samples)
! Purpose: to READ data starting from a pattern-matched line.
USE atom_module, ONLY: atom_info
USE parameter_shared, ONLY: sampled_movie, sampled_time, sampled_energy
INTEGER :: iatom,i_sample
INTEGER, INTENT(IN) :: nat
INTEGER, INTENT(IN) :: n_samples !n_samples = INT(nmo/ns)
INTEGER, INTENT(IN) :: indx
INTEGER, INTENT(IN) :: ns ! Get one sample from the trajectory every ns step.
INTEGER, INTENT(IN) :: nmo_start, nmo_end ! To get the total number of moves
CHARACTER(LEN=4) :: head_char
INTEGER :: y
allocate(atom_info(nat,n_samples))
i_sample = 1
write(*,*) "read_traj(): New total time steps (n_samples):", n_samples
DO WHILE (i_sample < n_samples+1) ! +1 means i_sample can take the value of n_samples
read(indx, '(A4)') head_char
PRE_CHECK:IF (head_char=="i = ") THEN
BACKSPACE(UNIT=indx) ! Because I am not able to read other lines with the format '(A4,I8)', and have not find any good way, so I try to read it in '(A4)' first
read(indx, '(A4,I8)') head_char, y
CHECK_HEAD:IF (head_char=="i = " .AND. (y>nmo_start-1 .and. y<nmo_end+1) .AND. MOD(y-(nmo_start-1),ns) == 1) THEN
WRITE(*,*)"read_traj():", head_char, y
BACKSPACE(UNIT=indx) ! Because we have to read the whole line with ' i = ' line.
read(indx,130) sampled_movie(i_sample), sampled_time(i_sample), sampled_energy(i_sample)
130 FORMAT (4X,I8,9X,F12.3,6X,F20.10)
131 FORMAT (A4,3F20.10)
inner: do iatom= 1,nat
read (indx,131) atom_info(iatom, i_sample)%atom_name, atom_info(iatom,i_sample)%coord(1), &
atom_info(iatom,i_sample)%coord(2), atom_info(iatom,i_sample)%coord(3)
enddo inner
i_sample = i_sample + 1
ENDIF CHECK_HEAD
ENDIF PRE_CHECK
END DO
END SUBROUTINE read_traj
END MODULE traj
MODULE atom_module
! To define the derived data type for atom
IMPLICIT NONE
TYPE :: atom
CHARACTER(LEN=2) :: atom_name
INTEGER :: atom_id
INTEGER :: host_id ! For O atom in water, host_id = atom_id
REAL :: mass
REAL, DIMENSION(3) :: coord
END TYPE atom
! The array atom_info can be shared by subroutines
TYPE(atom), ALLOCATABLE, DIMENSION(:,:) :: atom_info
END MODULE atom_module
MODULE parameter_shared
!
! Purpose:
! To declare data to share between routines.
IMPLICIT NONE
!SAVE
character(LEN=200) :: sampled_pos_filename
INTEGER, ALLOCATABLE, DIMENSION(:) :: sampled_movie
REAL, ALLOCATABLE, DIMENSION(:) :: sampled_time, sampled_energy
END MODULE parameter_shared
【讨论】:
【参考方案2】:感谢@francescalus 和@High Performance Mark 的建议。我使用 DO WHILE 循环,并实现了我的想法。我把我的子程序的一个简化版本放在这里。它包括在模块中定义的一些类型,这不是这里重要的事情。现在,它可以
(1) 读取任意步a
到任意步b
的轨迹文件,其中a
和b
由用户给出;
(2) 每ns
步读取数据。
SUBROUTINE read_traj(indx,nmo_start,nmo_end,ns,nat,n_samples)
! goal:
! read info from the trajectory file (format: ***.xyz)
! read data from frame a to frame b
USE atom_module
USE parameter_shared
INTEGER :: iatom, i_sample
INTEGER, PARAMETER:: nat = 4
INTEGER :: n_samples !n_samples = INT((a-b)/ns)
INTEGER, PARAMETER :: indx = 10
INTEGER, PARAMETER :: ns = 2 ! read one sample from the trajectory every ns step.
INTEGER, PARAMETER :: a =7
INTEGER, PARAMETER :: b=10
CHARACTER(LEN=4) :: x
INTEGER :: y
allocate(atom_info(nat,n_samples))
i_sample = 1
DO WHILE (i_sample < n_samples)
read(indx, '(A3,I5)') x, y
CHECK: IF (head_char=="i = " .AND. (y>a-1 .and. y<b+1) .AND. MOD(y-(a-1),ns) == 1) THEN
WRITE(*,*)"head_char and y:", x, y
BACKSPACE(UNIT=indx) ! we have to read the whole line with ' i = ' line.
read(indx,120) sampled_movie(i_sample), sampled_time(i_sample), sampled_energy(i_sample)
120 FORMAT (3X,I5,8X,F9.3,5X,F20.10)
inner: do iatom= 1,nat
read (indx,*) atom_info(iatom, i_sample)%atom_name, atom_info(iatom,i_sample)%coord(1), &
atom_info(iatom,i_sample)%coord(2), atom_info(iatom,i_sample)%coord(3)
enddo inner
i_sample = i_sample + 1
ENDIF CHECK
END DO
END SUBROUTINE read_traj
【讨论】:
我很惊讶该代码可以编译 - 我原以为无法声明常规参数(例如indx
、nat
、etc),在例程中,为parameter
s。即使您以某种方式说服编译器接受该代码,在例程中声明参数并(尝试)将它们也作为参数传递也没有任何意义。
确实,我承认这段代码不完整,无法编译。因为我一开始只是想展示我解决问题的想法。它也不包括模块。我添加了另一个答案,其中包含所有详细信息并且可以编译。主要部分是模块traj
中的函数read_traj()
中的DO WHILE
循环。以上是关于如何从 Fortran 模式匹配的行开始读取数据?的主要内容,如果未能解决你的问题,请参考以下文章