Print tqdm progress bar from external python script called by subprocess

Posted: 2021-04-09 08:31:06

Question:

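My main goal is to run an external python script (client script) from another python script (caller script) via subprocess. The console of the caller script shows all output from the client script except the tqdm output, so this is not a general problem of displaying output through subprocess, but a specific one related to how subprocess interacts with tqdm.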

My secondary goal is that I'd like to understand it :). Thoughtful explanations are much appreciated.

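The client script (train.py) contains several tqdm calls. So far, I haven't seen much difference in the output between various tqdm argument configurations, so let's use the simplest one.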

train.py:

...
import sys
from time import sleep

from tqdm import tqdm

with tqdm(total = 10, ncols = 80,
          file=sys.stdout, position = 0, leave = True,
          desc='f5b: pbar.set_postfix') as pbar:
    for i in range(10):
        pbar.update(1)
        postfix = {'loss': '{0:.4f}'.format(1 + i)}
        pbar.set_postfix(**postfix)
        sleep(0.1)

The caller script experiment.py executes the function execute_experiment, which calls train.py via the argument command_list:

def execute_experiment(command_list):
    tic = time.time()
    try:
        process = subprocess.Popen(
            command_list, shell=False, 
            encoding='utf-8',
            bufsize=0,
            stdin=subprocess.DEVNULL,
            universal_newlines=True,
            stdout=subprocess.PIPE, 
            stderr=subprocess.PIPE
            )
        # Poll process for new output until finished
        # Source: https://***.com/q/37401654/7769076
        while process.poll() is None:
            nextline = process.stdout.readline()
            sys.stdout.write(nextline)
            sys.stdout.flush()

    except CalledProcessError as err:
        print("CalledProcessError: {0}".format(err))
        sys.exit(1)

    except OSError as err:
        print("OS error: {0}".format(err))
        sys.exit(1)

    except:
        print("Unexpected error:", sys.exc_info()[0])
        raise

    if (process.returncode == 0):
        toc = time.time()
        time1 = str(round(toc - tic))
        return time1
    else:
        return 1

Calling the code snippet from train.py above with this script does return output, but the tqdm output stops after 0 seconds and looks like this:

f5b: pbar.set_postfix:   0%|                             | 0/10 [00:00<?, ?it/s]
f5b: pbar.set_postfix:  10%|█▊                | 1/10 [00:00<00:00, 22310.13it/s]

Calling the original code of train.py with this script returns all output except the tqdm output:

Training default configuration
train.py data --use-cuda ...
device: cuda
...

Comments:

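    shell=False: because a python script calls a python script. With shell=True, the client script is not called at all.
    bufsize=0: to prevent buffering.
    The train.py call is preceded by sys.executable to ensure that the python interpreter of the corresponding conda environment is called on the local machine.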
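For illustration, a minimal sketch of how such a command_list could be put together. The helper name build_command_list and its default arguments are hypothetical (the arguments are taken from the log excerpt above), and the -u flag is an assumption added here: it asks the child interpreter not to buffer its stdout and stderr, which the original setup does not mention.

```python
import sys


def build_command_list(script="train.py", script_args=("data", "--use-cuda")):
    """Return an argv list that runs `script` with the current interpreter.

    sys.executable is the interpreter running the caller script, so the
    client script runs in the same (conda) environment.
    """
    # "-u" (assumption, not in the original setup) disables the child's
    # own stdout/stderr buffering.
    return [sys.executable, "-u", script, *script_args]


command_list = build_command_list()
```

Because the list is passed with shell=False, each element reaches the child as a separate argument and no shell quoting is needed.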

Questions:

    Does tqdm.set_postfix prevent the progress bar output from being passed upstream? I know that this happens when tqdm.set_description is invoked, e.g. by:

    pbar.set_description('processed: %d' % (1 + i))

This code contains it:

def train(self, dataloader, max_batches=500, verbose=True, **kwargs):
    with tqdm(total=max_batches, disable=not verbose, **kwargs) as pbar:
        for results in self.train_iter(dataloader, max_batches=max_batches):
            pbar.update(1)
            postfix = {'loss': '{0:.4f}'.format(results['mean_outer_loss'])}

            if 'accuracies_after' in results:
                postfix['accuracy'] = '{0:.4f}'.format(
                    np.mean(results['accuracies_after']))
            pbar.set_postfix(**postfix)
    # for logging
    return results
    Are nested function calls the reason why the progress bar is not displayed?

The order of calls is experiment.py > train.py > nested.py

train.py calls the train function in nested.py via:

for epoch in range(args.num_epochs):

    results_metatraining = metalearner.train(meta_train_dataloader,
                      max_batches=args.num_batches,
                      verbose=args.verbose,
                      desc='Training',
                      # leave=False
                      leave=True
                      )

Alternatives tried without success:

    ### try2
    process = subprocess.Popen(command_list, shell=False, encoding='utf-8',
                               stdin=DEVNULL, stdout=subprocess.PIPE)
    while True:
        output = process.stdout.readline().strip()
        print('output: ' + output)
        if output == '' and process.poll() is not None:  # end of output
            break
        if output: # print output in realtime
            print(output)
    else:
        output = process.communicate()
    process.wait()


    ### try6
    process = subprocess.Popen(command_list, shell=False,
                               stdout=subprocess.PIPE, universal_newlines=True)
    for stdout_line in iter(process.stdout.readline, ""):
        yield stdout_line 
    process.stdout.close()
    return_code = process.wait()
    print('return_code' + str(return_code))
    if return_code:
        raise subprocess.CalledProcessError(return_code, command_list)


    ### try7
    with subprocess.Popen(command_list, stdout=subprocess.PIPE, 
                          bufsize=1, universal_newlines=True) as p:
        while True:
            line = p.stdout.readline()
            if not line:
                break
            print(line)    
        exit_code = p.poll()
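All of these attempts read the pipe with readline() in text mode. As a side note, here is a small stdlib-only sketch of how a text-mode reader treats the carriage returns ("\r") that tqdm uses to redraw its bar. The helper read_lines and the sample bytes are made up for illustration, and it assumes the pipe that Popen creates with universal_newlines=True behaves like a TextIOWrapper with the default newline=None.

```python
import io


def read_lines(data, newline=None):
    """Decode `data` and split it with readline(), as a text-mode pipe would."""
    stream = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=newline)
    return list(iter(stream.readline, ""))


# Made-up sample: two bar updates ended by "\r", then a normal line.
bar_output = b"  0%|  | 0/10\r 10%|# | 1/10\rdone\n"

# newline=None (universal newlines): each "\r" ends a line and is
# translated to "\n", so every bar update comes back as its own line.
universal = read_lines(bar_output)

# newline="\n": only "\n" ends a line, so both bar updates and the final
# line come back together in a single readline() result.
only_lf = read_lines(bar_output, newline="\n")
```

If that assumption holds, it would fit the output shown earlier, where the 0% and 10% updates appear as separate lines instead of overwriting each other.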

Answer 1:

I think readline is waiting for '\n', and tqdm is not creating new lines. Maybe this could help (I did not try it):

import io
def execute_experiment(command_list):
    tic = time.time()
    try:
        process = subprocess.Popen(
            command_list, shell=False, 
            encoding='utf-8',
            bufsize=1,
            stdin=subprocess.DEVNULL,
            universal_newlines=True,
            stdout=subprocess.PIPE, 
            stderr=subprocess.STDOUT
            )
        # Poll process for new output until finished
        # Source: https://***.com/q/37401654/7769076
        reader = process.stdout  # already a text stream, since encoding= is set
        while process.poll() is None:
            char = reader.read(1)
            sys.stdout.write(char)
            sys.stdout.flush()

    except CalledProcessError as err:
        print("CalledProcessError: {0}".format(err))
        sys.exit(1)

    except OSError as err:
        print("OS error: {0}".format(err))
        sys.exit(1)

    except:
        print("Unexpected error:", sys.exc_info()[0])
        raise

    if (process.returncode == 0):
        toc = time.time()
        time1 = str(round(toc - tic))
        return time1
    else:
        return 1
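A variant of the same idea, as a sketch rather than a tested fix for the original train.py. It reads the pipe in raw chunks until end-of-file instead of stopping once poll() reports that the child has exited, so output still sitting in the pipe is not dropped. It also merges stderr into stdout, because tqdm writes to sys.stderr unless file= is passed. The function name stream_child_output and its out and chunk_size parameters are made up for this example.

```python
import codecs
import subprocess
import sys


def stream_child_output(command_list, out=None, chunk_size=1024):
    """Run command_list and forward its stdout and stderr to `out` as they arrive.

    Returns the exit code of the child process.
    """
    if out is None:
        out = sys.stdout
    process = subprocess.Popen(
        command_list,
        shell=False,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # tqdm writes to stderr by default
        bufsize=0,                 # raw, unbuffered binary pipe in the caller
    )
    # Decode incrementally so a multi-byte character (for example a bar
    # block) that is split across two chunks is not mangled.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    while True:
        # On the raw pipe, read() returns as soon as some data is available.
        chunk = process.stdout.read(chunk_size)
        if not chunk:  # b"" means the child closed its end of the pipe
            break
        # "\r" is passed through untouched, so a terminal redraws the bar.
        out.write(decoder.decode(chunk))
        out.flush()
    out.write(decoder.decode(b"", final=True))
    process.stdout.close()
    return process.wait()
```

It could be called as stream_child_output([sys.executable, "-u", "train.py", ...]). The -u flag matters for ordinary print output, which the child interpreter block-buffers when its stdout is a pipe.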
