Python3 在 Linux 中调用 Python2 多处理的行为与在 Windows 中不同

Posted

技术标签:

【中文标题】Python3 在 Linux 中调用 Python2 多处理的行为与在 Windows 中不同【英文标题】:Python3 calling Python2 multiprocessing in Linux acts differently than in Windows 【发布时间】:2021-01-11 03:18:46 【问题描述】:

我有一个在Python3.8 中运行的简单示例代码,它打开一个在Python2.7 中执行的subprocess(利用多处理)。

Windows 10 中,我的代码行为就是我的意图。 Python2 池在哪里运行并相应地打印到stdout。并且main.py 几乎立即读取标准输出,就像池在其上写入一样。

不幸的是,我在 Linux (Ubuntu 20.04.1 LTS) 上看到了不同的结果。看来,在 Linux 中,在整个池完成之前我不会得到任何回报。

我怎样才能使代码在 Linux 中也一样工作?

请查看下面的简单示例代码和我得到的输出。

main.py

import subprocess
import datetime
import tempfile
import os

def get_time():
    return datetime.datetime.now()

class ProcReader():
    def __init__(self, python_file, temp=None, wait=False):
        self.proc = subprocess.Popen(['python2', python_file], stdout=subprocess.PIPE)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise StopIteration
            return line

if __name__ == "__main__":
    r1 = ProcReader("p2.py")

    for l1 in r1:
        print("Main reading at:  for ".format(get_time(), l1))

p2.py

import time
import multiprocessing as mp
from multiprocessing import freeze_support
import datetime

def get_time():
    return datetime.datetime.now()

def f1(name):
    for x in range(2):
        time.sleep(1)
        print(" Job#:  from f1".format(get_time(), name))

def f2(name):
    for x in range(2):
        time.sleep(2)
        print(" Job#:  from f2".format(get_time(), name))

if __name__ == '__main__':
    freeze_support()

    pool = mp.Pool(2)
    tasks = ["1", "2", "3", "4", "5", "6", "7"]
    for i, task in enumerate(tasks):
        if i%2:
            pool.apply_async(f2, args=(task,))
        else:
            pool.apply_async(f1, args=(task,))

    pool.close()
    pool.join()

Windows 输出:

Main reading at: 2020-09-24 15:28:19.044626 for b'2020-09-24 15:28:19.044000 Job#: 1 from f1\n'
Main reading at: 2020-09-24 15:28:20.045454 for b'2020-09-24 15:28:20.045000 Job#: 1 from f1\n'
Main reading at: 2020-09-24 15:28:20.046711 for b'2020-09-24 15:28:20.046000 Job#: 2 from f2\n'
Main reading at: 2020-09-24 15:28:21.045510 for b'2020-09-24 15:28:21.045000 Job#: 3 from f1\n'
Main reading at: 2020-09-24 15:28:22.046334 for b'2020-09-24 15:28:22.046000 Job#: 3 from f1\n'
Main reading at: 2020-09-24 15:28:22.047368 for b'2020-09-24 15:28:22.047000 Job#: 2 from f2\n'
Main reading at: 2020-09-24 15:28:23.047519 for b'2020-09-24 15:28:23.047000 Job#: 5 from f1\n'
Main reading at: 2020-09-24 15:28:24.046356 for b'2020-09-24 15:28:24.046000 Job#: 4 from f2\n'
Main reading at: 2020-09-24 15:28:24.048356 for b'2020-09-24 15:28:24.048000 Job#: 5 from f1\n'
Main reading at: 2020-09-24 15:28:26.047307 for b'2020-09-24 15:28:26.047000 Job#: 4 from f2\n'
Main reading at: 2020-09-24 15:28:26.049168 for b'2020-09-24 15:28:26.049000 Job#: 6 from f2\n'
Main reading at: 2020-09-24 15:28:27.047897 for b'2020-09-24 15:28:27.047000 Job#: 7 from f1\n'
Main reading at: 2020-09-24 15:28:28.048337 for b'2020-09-24 15:28:28.048000 Job#: 7 from f1\n'
Main reading at: 2020-09-24 15:28:28.049367 for b'2020-09-24 15:28:28.049000 Job#: 6 from f2\n'

Linux 的输出:

Main reading at: 2020-09-24 19:28:45.972346 for b'2020-09-24 19:28:36.932473 Job#: 1 from f1\n'
Main reading at: 2020-09-24 19:28:45.972559 for b'2020-09-24 19:28:37.933594 Job#: 1 from f1\n'
Main reading at: 2020-09-24 19:28:45.972585 for b'2020-09-24 19:28:38.935255 Job#: 3 from f1\n'
Main reading at: 2020-09-24 19:28:45.972597 for b'2020-09-24 19:28:39.936297 Job#: 3 from f1\n'
Main reading at: 2020-09-24 19:28:45.972685 for b'2020-09-24 19:28:40.937666 Job#: 5 from f1\n'
Main reading at: 2020-09-24 19:28:45.972711 for b'2020-09-24 19:28:41.938629 Job#: 5 from f1\n'
Main reading at: 2020-09-24 19:28:45.972724 for b'2020-09-24 19:28:43.941109 Job#: 6 from f2\n'
Main reading at: 2020-09-24 19:28:45.972735 for b'2020-09-24 19:28:45.943310 Job#: 6 from f2\n'
Main reading at: 2020-09-24 19:28:45.973115 for b'2020-09-24 19:28:37.933317 Job#: 2 from f2\n'
Main reading at: 2020-09-24 19:28:45.973139 for b'2020-09-24 19:28:39.935938 Job#: 2 from f2\n'
Main reading at: 2020-09-24 19:28:45.973149 for b'2020-09-24 19:28:41.938587 Job#: 4 from f2\n'
Main reading at: 2020-09-24 19:28:45.973157 for b'2020-09-24 19:28:43.941109 Job#: 4 from f2\n'
Main reading at: 2020-09-24 19:28:45.973165 for b'2020-09-24 19:28:44.942306 Job#: 7 from f1\n'
Main reading at: 2020-09-24 19:28:45.973173 for b'2020-09-24 19:28:45.943503 Job#: 7 from f1\n'

请忽略时间,因为时钟不同,但正如您所见,在 Windows 中,main.py 在写入 python2 池后立即获取它,但对于linuxmain.py 中的所有内容仅在以下情况下写入所有的工作都完成了。我不太关心作业完成的顺序,我真的只是希望main.py 在 Python2 池中写入后尽快获得stdout

【问题讨论】:

【参考方案1】:

Linux 上的stdout 被缓冲,多处理上的print() 未被刷新,因为进程不控制终端。

sys.stdout 的猴子补丁在这里很有用

import sys,os
unbuffered = os.fdopen(sys.stdout.fileno(), 'w', 0)
sys.stdout = unbuffered

或者您可能必须在每个print() 之后调用sys.stdout.flush()

【讨论】:

以上是关于Python3 在 Linux 中调用 Python2 多处理的行为与在 Windows 中不同的主要内容,如果未能解决你的问题,请参考以下文章

Python3 在 Linux 中调用 Python2 多处理的行为与在 Windows 中不同

将 python3 包添加到 Buildroot

记一次在CentOS系统搭建python3环境

pytho 的基本语法

python ,基础 (小白一个)

一篇文章助你理解Python3中字符串编码问题