What is the simplest way to find files modified within the past hour?

Posted 2023-03-24

【Title】: What is the simplest way to find files modified within the past hour?
【Posted】: 2022-01-18 22:11:41
【Question】:

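This question is a special case of "In Perl, what is the simplest way to find files modified within a given time interval?", and the solutions below generalize easily. In bash, we can type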

> find . -mtime -1h

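But I want to do this in pure Perl. The code below does that. It explicitly runs stat on every file.

Is there any way to make it simpler or more elegant? Of course, this is slower than the bash command. I am not trying to compete with bash on efficiency; I just want to do it purely in Perl.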

#!/usr/bin/env perl
use strict; use warnings;
use File::Find;
my $invocation_seconds = time;
my $interval_left = $invocation_seconds - (60 * 60); # one hour ago
my $count_all = 0;
my @selected;
find(
    sub {
        $count_all++;
        my $mtime_seconds = (stat($_))[9];
        return unless defined $mtime_seconds; # if we edit files while running current script, this can be undef on occasion
        return unless ($mtime_seconds > $interval_left);
        push @selected, $File::Find::name;
    },
    '.', # current directory
);
my $end_seconds = time;
my $totalselected = scalar @selected;
print($_, "\n") for @selected;
print $^V; print " <- perl version\n";
print 'selected ', $totalselected, '/', $count_all, ' in ', ($end_seconds - $invocation_seconds), ' seconds', "\n";
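
Since the question is framed as a special case of "modified within a given time interval", here is a minimal sketch of that generalization using only the core File::Find module. The helper name modified_between and the half-open interval convention are assumptions made for illustration; they are not part of the original post.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not from the original post): return every path under
# $dir whose mtime lies in the half-open interval [$from, $to).
sub modified_between {
    my ($dir, $from, $to) = @_;
    my @selected;
    find(
        {
            no_chdir => 1,    # $_ holds the full path, so stat($_) works from any cwd
            wanted   => sub {
                my $mtime = (stat($_))[9];
                return unless defined $mtime;    # entry vanished or stat failed
                push @selected, $_ if $mtime >= $from && $mtime < $to;
            },
        },
        $dir,
    );
    return @selected;
}

# The "past hour" case from the question: from one hour ago up to now.
my $now = time;
print "$_\n" for modified_between('.', $now - 60 * 60, $now + 1);
```

Passing the bounds as arguments keeps the traversal identical to the question's code while letting the caller pick any window, for example yesterday only.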

【Comments on the question】:

【Answer 1】:

use feature qw( say );  # enables say (Perl 5.10+)
use File::Find::Rule qw( );

my $cutoff = time - 60*60;

say for File::Find::Rule->mtime( ">=$cutoff" )->in( "." );

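【Comments】:

Am I correct in thinking that File::Find::Rule does not add functionality or improve performance, but instead provides an alternative syntax for doing the same thing as plain File::Find?

@JacobWegelin Right. It doesn't help with those at all. But it does help with everything that matters: readability, maintainability, reliability, correctness, etc.

【Answer 2】: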

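The following sample code uses the stat function to select the files modified within the last hour.

Note: with small modifications the code can be made recursive.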

use strict;
use warnings;
use feature 'say';

use constant STAMP => time() - 60*60;
use constant MTIME => 9;

my $dir = shift || '.';
my @files;

(stat($_))[MTIME] > STAMP && push @files, $_
    for glob("$dir/*");

say "
Files changed in last hour
--------------------------";
say for @files;
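
The same single-directory filter can also be written with grep. The sketch below is one possible take, not the answer author's code: the helper name recent_in_dir is hypothetical, and it reads the directory with opendir/readdir instead of glob so that dotfiles are included and directory names containing spaces or wildcard characters are handled literally.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: entries directly inside $dir modified after $cutoff.
sub recent_in_dir {
    my ($dir, $cutoff) = @_;
    opendir(my $dh, $dir) or die "Can't open $dir: $!";
    # readdir returns bare names, dotfiles included; drop '.' and '..'
    my @paths = map { "$dir/$_" } grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    # keep a path only if stat succeeded and its mtime is newer than the cutoff
    return grep {
        my $mtime = (stat($_))[9];
        defined $mtime && $mtime > $cutoff;
    } @paths;
}

my $dir = shift || '.';
say for recent_in_dir($dir, time - 60 * 60);
```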

A recursive version that reports the lookup time:

use strict;
use warnings;
use feature 'say';

use Time::HiRes qw(gettimeofday tv_interval);

use constant STAMP => time() - 60*60;
use constant MTIME => 9;

my $dir   = shift || '.';

my $start = [gettimeofday];
my $found = lookup($dir);
my $elapsed = tv_interval($start);

say "
Elapsed lookup time: $elapsed seconds";

say "
Files changed in last hour
--------------------------";
say for @$found;

exit 0;

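sub lookup {
    my $dir   = shift;
    my $files = [];    # always return an array ref, even when nothing matches

    for my $item ( glob("$dir/*") ) {
        if( -d $item ) {
            # descend into subdirectories and collect their matches
            my $matched = lookup($item);
            push @$files, $_ for @$matched;
        }
        next unless (stat($item))[MTIME] > STAMP;
        push @$files, $item if -f $item;
    }

    return $files;
}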

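【Comments】:

1) This isn't recursive. 2) It doesn't mention files whose names start with a dot (.). 3) It suffers from a code injection bug that can prevent the code from working with normal inputs. 4) This could be simplified using grep.

OK, now you've added a recursive function. 1) It doesn't print newly modified directories. 2) It doesn't mention files whose names start with a dot (.). 3) It suffers from a code injection bug that can prevent the code from working with normal inputs. 4) It calls stat 3 times for each file. (Wow!!!) 5) It's no simpler than the original. (A bit cleaner, sure, but you could clean up the original.)

@ikegami -- I don't see the OP mentioning "newly modified directories" in his question. What would you use instead of stat to get a file's mtime? (find uses the fstat system call.) Or is there a system call I don't know about that returns a list of files based on timestamp criteria? Perhaps I will learn something new from you: how to obtain a file's mtime without calling fstat. My understanding of the question is: how can the provided code be made to look better? -- As the OP mentioned pure Perl, one valid approach is to avoid external modules.

File::stat is great for symbolic access to stat information, by the way.

Re "What would you use instead of stat to get a file's mtime?": Strawman. I didn't say you shouldn't use stat; I said you shouldn't use it three times per file. You should use stat once, then use _ the other two times.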
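
To illustrate the last comment, here is a minimal sketch of a recursive lookup that queries the file system once per entry and then reuses that result through Perl's special _ filehandle for the -d and -f tests. It is not the answer author's code. The name lookup_once is hypothetical, and the sketch uses lstat rather than stat (an assumption made here so that symbolic links are not followed, which avoids symlink loops); it also reads directories with readdir, so dotfiles are included.

```perl
use strict;
use warnings;
use feature 'say';

use constant MTIME => 9;

# Hypothetical rewrite of lookup(): a single lstat call per entry.
sub lookup_once {
    my ($dir, $cutoff) = @_;
    my @files;
    opendir(my $dh, $dir) or return;    # unreadable directory: nothing to report
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    for my $name (@names) {
        my $item = "$dir/$name";
        my @st = lstat($item) or next;  # the only file-system query for this entry
        my $is_dir  = -d _;             # _ reuses the lstat result above
        my $is_file = -f _;             # read both flags before recursing resets _
        push @files, $item if $is_file && $st[MTIME] > $cutoff;
        push @files, lookup_once($item, $cutoff) if $is_dir;
    }
    return @files;
}

my $dir = shift || '.';
say for lookup_once($dir, time - 60 * 60);
```

The two flags are captured before the recursive call because any later stat or lstat overwrites what _ refers to.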
