如何在 Perl 中读取和写入大缓冲区到进程 stdin/stdout/stderr?

Posted

技术标签:

【中文标题】如何在 Perl 中读取和写入大缓冲区到进程 stdin/stdout/stderr?【英文标题】:How do I read and write large buffers to a process stdin/stdout/stderr in Perl? 【发布时间】:2013-11-20 00:06:31 【问题描述】:

我想启动一个程序,向它的标准输入发送一些数据,读取它的标准输出/标准错误并返回所有内容。如果它运行的时间过长,我想杀死它。

我使用低级无缓冲 io 功能,分块执行所有操作,在使用 select 写入/读取之前检查,但它仍然失败....

使用以下脚本,cat 在写回 5000 字节后在 bash 手册页 (300KB) 上超时。 为什么

$ perl bar.pl
timeout :( at bar.pl line 119.
RES = 9
ERR = <>
IN  = <BASH(1)                       ...> (294439 chars)
OUT = <BASH(1)                       ...> (5000 chars)    

更新:当我使用单个 IO::Select 对象并停止捕获 stderr 时,它几乎可以在没有超时的情况下恢复所有内容。

差异:

--- old.pl  2013-11-20 17:08:19.000000000 +0100
+++ new.pl  2013-11-20 17:07:58.000000000 +0100
@@ -26,10 +26,6 @@

     my $bufsize = 100;

-    my $insel = IO::Select->new();
-    my $outsel = IO::Select->new();
-    my $errsel = IO::Select->new();
-
     $fderr = gensym;
     my $pid = open3($fdin, $fdout, $fderr, $cmd, @args) or die "open3 $!";

@@ -39,9 +35,9 @@
     my $len = length($progin);
     my $off = 0;

-    $insel->add($fdin);
-    $outsel->add($fdout);
-    $errsel->add($fderr);
+    my $sel = IO::Select->new();
+    $sel->add($fdin);
+    $sel->add($fdout);

     if ($len <= 0) 
         close $fdin;
@@ -59,7 +55,7 @@
             last;
         

-        if ($len > 0 && $insel->can_write($select_timeout)) 
+        if ($len > 0 && $sel->can_write($select_timeout)) 
             $ret = syswrite $fdin, $progin, $bufsize, $off;
             if (!defined $ret) 
                 warn "in ndef";
@@ -74,7 +70,7 @@
             
         

-        if ($outsel->can_read($select_timeout)) 
+        if ($sel->can_read($select_timeout)) 
             my $buf;
             $ret = sysread($fdout, $buf, $bufsize);
             if(!defined $ret) 
@@ -84,16 +80,6 @@
             $progout .= $buf;
         

-        if ($errsel->can_read($select_timeout)) 
-            my $buf;
-            $ret = sysread($fderr, $buf, $bufsize);
-            if(!defined $ret) 
-                warn "err ndef";
-                last;
-            
-            $progerr .= $buf;
-        
-
         $ret = waitpid($pid, WNOHANG);
         # still exists, continue
         if ($ret == 0) 

脚本:

#!/usr/bin/perl
use Data::Dumper;
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';
use Time::HiRes 'time';
use POSIX ':sys_wait_h';
use IO::Select;
use Getopt::Std;
use File::Temp;


my $in = `man -P cat bash`;
my ($res, $out, $err) = run_prog($in, 5, 'cat');

print "RES = $res\n";
print "ERR = <$err>\n";
printf "IN  = <%s...> (%d chars)\n", substr($in, 0, 30), length($in);
printf "OUT = <%s...> (%d chars)\n", substr($out, 0, 30), length($out);

sub run_prog 
    my ($progin, $timeout, $cmd, @args) = @_;
    my ($progres, $progout, $progerr);
    my ($fdin, $fdout, $fderr);

    my $bufsize = 100;

    my $insel = IO::Select->new();
    my $outsel = IO::Select->new();
    my $errsel = IO::Select->new();

    $fderr = gensym;
    my $pid = open3($fdin, $fdout, $fderr, $cmd, @args) or die "open3 $!";


    my $start = time;
    my $ret;
    my $len = length($progin);
    my $off = 0;

    $insel->add($fdin);
    $outsel->add($fdout);
    $errsel->add($fderr);

    if ($len <= 0) 
        close $fdin;
    

    $progout = '';
    $progerr = '';

    my $select_timeout = 0.1;

    my $toolong = 0;
    while (1) 
        if (time - $start > $timeout) 
            $toolong = 1;
            last;
        

        if ($len > 0 && $insel->can_write($select_timeout)) 
            $ret = syswrite $fdin, $progin, $bufsize, $off;
            if (!defined $ret) 
                warn "in ndef";
                last;
            

            $off += $ret;
            $len -= $ret;

            if ($len <= 0) 
                close $fdin;
            
        

        if ($outsel->can_read($select_timeout)) 
            my $buf;
            $ret = sysread($fdout, $buf, $bufsize);
            if(!defined $ret) 
                warn "out ndef";
                last;
            
            $progout .= $buf;
        

        if ($errsel->can_read($select_timeout)) 
            my $buf;
            $ret = sysread($fderr, $buf, $bufsize);
            if(!defined $ret) 
                warn "err ndef";
                last;
            
            $progerr .= $buf;
        

        $ret = waitpid($pid, WNOHANG);
        # still exists, continue
        if ($ret == 0) 
            next;
        
        # process exited/signaled
        # make a last read
        elsif ($ret > 0) 
            $progres = $?;
            next;
        
        # process doesn't exists anymore
        else 
            last;
        
    

    close $fdout;
    close $fderr;

    # timeout
    if ($toolong) 
        warn "timeout :(";
        kill 9, $pid;
        waitpid($pid, 0);
        $progres = $?;
    

    return ($progres, $progout, $progerr);

【问题讨论】:

【参考方案1】:

我已经使用这个很久了,并且可以通过将用于 select() 调用的 3 个变量替换为添加所有 3 个文件描述符的单个变量来使其工作。所以代码看起来像这样:

my $sel = IO::Select->new();
$sel->add($fdin);
$sel->add($fdout);
$sel->add($fderr);

及以后:

$sel->can_write($select_timeout);

$sel->can_read($select_timeout)

【讨论】:

仍然不起作用(写入 5000 字节并超时)。当我从$sel 中删除$fderr$fderr 上的if ($sel-&gt;can_read(...) 时(即,当我只捕获标准输出时)它会写入321000 个字节(几乎好,而不是321071)没有超时。 【参考方案2】:

好的,我又看了一些,这每次都有效,并且不会丢失任何字节。我会让你重新添加错误处理的东西。

#!/usr/bin/perl
use Data::Dumper;
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';
use Time::HiRes 'time';
use POSIX ':sys_wait_h';
use IO::Select;
use Getopt::Std;
use File::Temp;


my $in = 'a' x 300000;  # Just get 300000 bytes without messing around
my ($res, $out, $err) = run_prog($in, 5, 'cat');

print "RES = $res\n";
print "ERR = <$err>\n";
printf "IN  = <%s...> (%d chars)\n", substr($in, 0, 30), length($in);
printf "OUT = <%s...> (%d chars)\n", substr($out, 0, 30), length($out);

sub run_prog 
    my ($progin, $timeout, $cmd, @args) = @_;
    my ($progres, $progout, $progerr);
    my ($fdin, $fdout, $fderr);

    my $bufsize = 100;    # Or use 4, or 800,000 - they all work now!

    $fderr = gensym;
    my $pid = open3($fdin, $fdout, $fderr, $cmd, @args) or die "open3 $!";

    my $write_set = IO::Select->new($fdin);
    my $read_set  = IO::Select->new($fdout);

    my $start = time;
    my $ret;
    my $len = length($progin);
    my $off = 0;

    if ($len <= 0) 
        close $fdin;
    

    $progout = '';
    $progerr = '';

    my $select_timeout = 0.1;

    my $toolong = 0;
    while (1) 
        if (time - $start > $timeout) 
            $toolong = 1;
            last;
        

        if ($len > 0 && IO::Select->select(undef,$write_set,undef,$select_timeout)) 
            $ret = syswrite $fdin, $progin, $bufsize, $off;
            if (!defined $ret) 
                warn "in ndef";
                last;
            

            $off += $ret;
            $len -= $ret;

            if ($len <= 0) 
                close $fdin;
            
        

        if (IO::Select->select($read_set,undef,undef,$select_timeout)) 
            my $buf;
            $ret = sysread($fdout, $buf, $bufsize);
            if(!defined $ret) 
                warn "out ndef";
                last;
            
            $progout .= $buf;
        

        $ret = waitpid($pid, WNOHANG);
        # still exists, continue
        if ($ret == 0) 
            next;
        
        # process exited/signaled
        # make a last read
        elsif ($ret > 0) 
            $progres = $?;
            next;
        
        # process doesn't exists anymore
        else 
            last;
        
    

    close $fdout;
    close $fderr;

    # timeout
    if ($toolong) 
        warn "timeout :(";
        kill 9, $pid;
        waitpid($pid, 0);
        $progres = $?;
    

    return ($progres, $progout, $progerr);

【讨论】:

它工作!做得好!我仍然不明白是什么原因,但谢谢! 啊!其实还是不行!如果程序消耗数据的速度不够快,它将再次阻塞并超时...尝试perl -e '$b=0; while(read(STDIN, $b, 1))print $b;' 而不是cat【参考方案3】:

由于没有人成功完成这项工作,我在 CPAN 中更加努力地寻找并找到了 IPC::Run 模块。

use IPC::Run;
sub run_prog 
    my ($in, $t, $cmd, @args) = @_;
    my ($out, $err);
    IPC::Run::run([$cmd, @args], \$in, \$out, \$err, IPC::Run::timeout($t));
    return ($?, $out, $err);    

【讨论】:

【参考方案4】:

行:

  if ($len > 0 && $insel->can_write($select_timeout)) 

不应该是这样的

...outsel->can_write...

【讨论】:

没有。这是子进程的标准输入。你应该写在里面。 我注意到如果我将缓冲区大小增加 10 倍,它读取的数据量是 10 倍。事实上,无论我做什么,它似乎都会读取 50 次 - 也许这会帮助某人(比我更聪明!)诊断问题。我还注意到,如果将 select_timeout 增加到 1 秒,它会完全挂起。 我还将man -P cat bash 的内容更改为更易于理解并提供已知量(1 兆)数据的内容,例如dd if=/dev/zero bs=1m count=1

以上是关于如何在 Perl 中读取和写入大缓冲区到进程 stdin/stdout/stderr?的主要内容,如果未能解决你的问题,请参考以下文章

Perl 缓冲区刷新

linux重点知识总结

Netty源码之写入数据

如何读取和写入位到字节数组

关于环形缓冲区

当子进程和父进程在 Perl 中写入同一个日志文件时进程卡住(在 Windows 中)