How to use python subprocess with bytes instead of files

Posted 2023-02-23


【Title】: How to use python subprocess with bytes instead of files 【Posted】: 2020-08-19 11:43:18 【Question】:

I can convert an mp4 to wav with ffmpeg like this:

ffmpeg -i test.mp4 -vn test.wav

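I can also do the same thing with subprocess, as long as my input and output are file paths.

But what if I want to use ffmpeg directly on bytes, or on a "file-like" object such as io.BytesIO()?

Here is an attempt: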

import subprocess
from io import BytesIO
b = BytesIO()

with open('test.mp4', 'rb') as stream:
    command = ['ffmpeg', '-i']
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=b)
    proc.communicate(input=stream.read())
    proc.wait()
    proc.stdin.close()
    proc.stdout.close()

This gives me:

---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-84-0ddce839ebc9> in <module>
      5 with open('test.mp4', 'rb') as stream:
      6     command = ['ffmpeg', '-i']
----> 7     proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=b)
...
   1486                 # Assuming file-like object
-> 1487                 c2pwrite = stdout.fileno()
   1488 
   1489             if stderr is None:

UnsupportedOperation: fileno

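Of course, I could use temporary files to funnel my bytes, but I would like to avoid writing to disk (because this step is just one link in a pipeline of transformations).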
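
The traceback can be reproduced without ffmpeg. A minimal sketch, assuming only the standard library: `subprocess.Popen` asks a stream object for an OS-level file descriptor via `fileno()`, which an in-memory `BytesIO` cannot provide, whereas `subprocess.PIPE` makes `Popen` create a real pipe. The child process here is the Python interpreter itself, used as a stand-in for ffmpeg, and the variable names are illustrative.

```python
import io
import subprocess
import sys

buf = io.BytesIO()

# An in-memory buffer has no OS-level file descriptor to hand to a child process.
try:
    buf.fileno()
    has_descriptor = True
except io.UnsupportedOperation:
    has_descriptor = False

# subprocess.PIPE asks Popen to create a real pipe instead. The captured bytes
# can be wrapped in a BytesIO afterwards if a file-like object is needed.
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'abc')"],
    stdout=subprocess.PIPE,
)
out, _ = proc.communicate()
result = io.BytesIO(out)
```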

【Comments】:

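***.com/questions/31080829/… is worth a look.

Obviously, a BytesIO object has no fileno() method, which is not surprising, really.

@DaveIdito: I tried with a separate script and got the same error.

@MisterMiyagi: You've nailed my meta-question (though the specific question seemed more likely to get an answer here). You say "an actual file, including a file path, stat, xattrs, etc.". My meta-question is: "How do I get the in-memory equivalent of an actual file in python?" BytesIO doesn't go all the way. So how can one go further?

What you are describing is also known as a "ram-disk". Python doesn't have one, let alone one that external processes can use as well.

【Answer 1】: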

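Here is a solution I came up with recently, although I was using AWS and GCP bucket objects as input and output. I'm by no means a python expert, but this got me the result I was after.

You need ffmpeg installed on your local machine and added to your environment variables so that it can be invoked.

If you are using the cloud, ffmpeg comes preinstalled on Google Cloud Functions, and you can use a Lambda layer from the repository library in AWS.

Hope someone benefits from this. :)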

import shlex
import subprocess

# tested against 'wav', 'mp3', 'flac', 'mp4'
desired_output = 'mp3'
track_input = 'C:\\Users\\.....\\track.wav'
track_output = f'C:\\Users\\......\\output_track.{desired_output}'

encoded_type = ''
format_for_conversion = desired_output

# m4a output is encoded as AAC and written in an ADTS container
if desired_output == 'm4a':
    encoded_type = '-c:a aac'
    format_for_conversion = 'adts'

with open(track_input, 'rb') as in_track_file:
    input_track_data = in_track_file.read()

# pipe:0 refers to stdin, pipe:1 refers to stdout
ffmpeg_command = f'ffmpeg -i pipe:0 {encoded_type} -f {format_for_conversion} pipe:1'

# split into an argument list so the command also runs without a shell on POSIX
ffmpeg_process = subprocess.Popen(
    shlex.split(ffmpeg_command),
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE
)

# communicate() returns a (stdout, stderr) tuple
output_bytes, _ = ffmpeg_process.communicate(input_track_data)

# 'wb' truncates any existing file, so no delete_content() step is needed
with open(track_output, 'wb') as f:
    f.write(output_bytes)
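
When the input arrives as a file-like object (for example a downloaded bucket object held in a `BytesIO`) rather than a path, the same pipe approach applies. A minimal sketch, assuming a hypothetical helper `convert_filelike` that is not part of the answer; the demo child is the Python interpreter standing in for ffmpeg so the example runs anywhere.

```python
import io
import subprocess
import sys
from typing import BinaryIO, Sequence


def convert_filelike(source: BinaryIO, args: Sequence[str]) -> io.BytesIO:
    """Feed everything in `source` to the command's stdin; return its stdout as a BytesIO."""
    completed = subprocess.run(
        list(args),
        input=source.read(),      # bytes sent to the child's stdin
        stdout=subprocess.PIPE,   # capture the child's stdout in memory
        check=True,               # raise CalledProcessError on a non-zero exit code
    )
    return io.BytesIO(completed.stdout)


# Stand-in child process (not ffmpeg): upper-cases whatever arrives on stdin.
demo_args = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]
converted = convert_filelike(io.BytesIO(b"riff"), demo_args)
```

With ffmpeg, `args` would be an argument list such as `['ffmpeg', '-i', 'pipe:0', '-f', 'mp3', 'pipe:1']`, mirroring the command in the answer.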

【Comments】:

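Thanks @sobblesbobbles for this solution. Does your solution work for video? I want to download a ts file, convert it to h265 mp4, and push it to AWS without writing to disk. When I try this with your approach, only the audio track seems to come through. My command is ffmpeg -i pipe:0 -c:v libx265 -c:a copy -f avi pipe:1. Side note: the delete_content() function is not defined in your example (and doesn't seem to be needed anyway).

【Answer 2】: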

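Building on @thorwhalen's answer, here is how it works from bytes to bytes. What you were probably missing, @thorwhalen, is the actual pipe-to-pipe way of sending and receiving data when interacting with the process. When sending bytes, stdin has to be closed so that the process sees the end of its input and can finish reading it.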

import shlex
import subprocess


def from_bytes_to_bytes(
        input_bytes: bytes,
        action: str = "-f wav -acodec pcm_s16le -ac 1 -ar 44100") -> bytes:
    # /dev/stdin is POSIX-only; "-" makes ffmpeg write the result to stdout
    command = f"ffmpeg -y -i /dev/stdin -f nut {action} -"
    ffmpeg_cmd = subprocess.Popen(
        shlex.split(command),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        shell=False
    )
    # communicate() writes the bytes to the process's stdin, closes the pipe
    # so ffmpeg sees end-of-input, and reads stdout until the process exits;
    # doing both at once avoids a deadlock when a pipe buffer fills up
    output, _ = ffmpeg_cmd.communicate(input_bytes)
    return output
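
To illustrate the point about closing stdin: a child process that reads its input to the end only sees end-of-file once the parent closes the write end of the pipe. A minimal sketch, using a Python child process as a stand-in for ffmpeg and a hypothetical helper `pipe_through`. The reader thread drains stdout while the parent is still writing, so neither side blocks on a full pipe buffer; `Popen.communicate()` does equivalent bookkeeping internally.

```python
import subprocess
import sys
import threading


def pipe_through(args, data: bytes) -> bytes:
    """Write `data` to the child's stdin, close it, and collect the child's stdout."""
    proc = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    chunks = []

    def drain():
        # read() returns only once the child has closed its stdout
        chunks.append(proc.stdout.read())

    reader = threading.Thread(target=drain)
    reader.start()
    proc.stdin.write(data)
    proc.stdin.close()      # signals end-of-input to the child
    reader.join()
    proc.stdout.close()
    proc.wait()
    return chunks[0]


# Stand-in child (not ffmpeg): streams stdin to stdout chunk by chunk.
echo_args = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]
payload = b"x" * 1_000_000   # well beyond a typical pipe buffer
echoed = pipe_through(echo_args, payload)
```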

【Comments】:

【Answer 3】:

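Here is a partial answer: three functions showing how to go from file to file (for completeness), from bytes to file, and from file to bytes. The bytes-to-bytes case is still putting up a fight.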

import shlex
import subprocess

def from_file_to_file(input_file: str, output_file: str, action="-f wav -acodec pcm_s16le -ac 1 -ar 44100"):
    # quote the paths so that spaces and backslashes survive shlex.split
    command = f"ffmpeg -i {shlex.quote(input_file)} {action} -vn {shlex.quote(output_file)}"
    subprocess.call(shlex.split(command))


def from_file_to_bytes(input_file: str, action="-f wav -acodec pcm_s16le -ac 1 -ar 44100"):
    # "-" as the output makes ffmpeg write the converted data to stdout
    command = f"ffmpeg -i {shlex.quote(input_file)} {action} -"

    ffmpeg_cmd = subprocess.Popen(
        shlex.split(command),
        stdout=subprocess.PIPE,
        shell=False
    )
    # communicate() reads stdout until ffmpeg exits and then reaps the process
    output, _ = ffmpeg_cmd.communicate()
    return output


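def from_bytes_to_file(input_bytes, output_file, action="-f wav -acodec pcm_s16le -ac 1"):
    # ffmpeg reads the input from stdin (/dev/stdin is POSIX-only)
    command = f"ffmpeg -i /dev/stdin {action} -vn {shlex.quote(output_file)}"
    ffmpeg_cmd = subprocess.Popen(
        shlex.split(command),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        shell=False
    )
    # communicate() writes the bytes, closes stdin, and waits for ffmpeg to exit
    ffmpeg_cmd.communicate(input_bytes)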
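
None of the functions above checks the exit status, so a failed conversion can silently yield empty or truncated bytes. A minimal sketch of one way to surface failures, assuming a hypothetical helper `run_filter`; ffmpeg writes its log to stderr, so capturing stderr keeps the diagnostic text available. The demo children are Python processes standing in for ffmpeg.

```python
import subprocess
import sys


def run_filter(args, data: bytes) -> bytes:
    """Pipe `data` through a command; return its stdout or raise with its stderr."""
    completed = subprocess.run(args, input=data, capture_output=True)
    if completed.returncode != 0:
        message = completed.stderr.decode(errors="replace")
        raise RuntimeError(
            f"command failed with exit code {completed.returncode}: {message}"
        )
    return completed.stdout


# Stand-in children (not ffmpeg): one succeeds, one fails with a message on stderr.
ok_args = [sys.executable, "-c",
           "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[::-1])"]
bad_args = [sys.executable, "-c",
            "import sys; sys.stderr.write('bad input'); sys.exit(3)"]

reversed_bytes = run_filter(ok_args, b"abc")
try:
    run_filter(bad_args, b"abc")
    failure = None
except RuntimeError as exc:
    failure = str(exc)
```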

【Comments】:
