Capture 192 kHz audio using Python 3

Posted 2023-03-14

Published: 2013-11-10 14:27:03

Question:

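I need to capture 192 kHz audio using Python 3 for some bioacoustics experiments. I have the hardware: a Sound Devices USBPre 2 sound card and a microphone with a good frequency response curve (up to 100 kHz). I have also configured my operating system (Ubuntu 13.04) to sample from this card at 192 kHz.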

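I have tried recording with PyAudio. It seems to work and gives me a WAV file with a sample rate of 192 kHz. However, when I look at the spectrum there is no power above 24 kHz, which suggests that PyAudio is not really capturing at 192 kHz but at 48 kHz. When I record with Audacity using input from JACK, on the other hand, I get a good recording with power up to 96 kHz. So my impression is that PyAudio does not actually sample the sound at 192 kHz, even though it should be able to. How can I fix this?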
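One thing worth keeping in mind here: the 192 kHz figure in a WAV header is only metadata set by whatever code writes the file, so on its own it does not confirm the rate at which the hardware was sampled. A minimal sketch using only the standard library (the helper names `write_wav`, `header_rate` and `demo` are made up for illustration):

```python
import os
import struct
import tempfile
import wave


def write_wav(path, samples, rate):
    """Write 16-bit mono PCM `samples` with `rate` stored in the header."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)      # 2 bytes = 16-bit samples
        wav.setframerate(rate)   # metadata only; samples are not resampled
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


def header_rate(path):
    """Read back the sample rate stored in the WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def demo():
    """Write identical samples under two declared rates and compare headers."""
    samples = [0, 1000, 0, -1000] * 100
    rates = []
    with tempfile.TemporaryDirectory() as tmp:
        for rate in (48000, 192000):
            path = os.path.join(tmp, "rate%d.wav" % rate)
            write_wav(path, samples, rate)
            rates.append(header_rate(path))
    return rates
```

Both files contain exactly the same sample data; only the declared rate differs. That is why the spectrum, rather than the header, is the meaningful check.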

JACK starts without errors:

/usr/bin/jackd -R -dalsa -Chw:1,0 -n3 -o1 -p2048 -r192000

jackd 0.122.0
Copyright 2001-2009 Paul Davis, Stephane Letz, Jack O'Quinn, Torben Hohn and others.
jackd comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it
under certain conditions; see the file COPYING for details

JACK compiled with System V SHM support.
loading driver ..
apparent rate = 192000
creating alsa driver ... -|hw:1,0|2048|3|192000|0|1|nomon|swmeter|-|32bit
control device hw:0
configuring for 192000Hz, period = 2048 frames (10.7 ms), buffer = 3 periods
ALSA: final selected sample format for capture: 24bit little-endian
ALSA: use 3 periods for capture

Initializing PyAudio (without any real errors, as far as I can tell):

p = pyaudio.PyAudio()
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
bt_audio_service_open: connect() failed: Connection refused (111)
bt_audio_service_open: connect() failed: Connection refused (111)
bt_audio_service_open: connect() failed: Connection refused (111)
bt_audio_service_open: connect() failed: Connection refused (111)
ALSA lib pcm_dmix.c:957:(snd_pcm_dmix_open) The dmix plugin supports only playback stream

Opening a PyAudio stream:

stream = p.open(format=pyaudio.paInt32,
                channels=1,rate=192000,
                input=True,
                frames_per_buffer=2048)

I have spectrogram images in case anyone wants to verify my interpretation that PyAudio fails to capture at 192 kHz (while Audacity can):

[Image: spectrogram of sound captured with PyAudio]

[Image: spectrogram of sound captured with Audacity]
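The same comparison can be made numerically rather than by eye. The sketch below estimates what fraction of a signal's spectral energy lies above a cutoff frequency, using a naive DFT from the standard library only. The function names and the synthetic test tones are illustrative assumptions, not part of the original recordings; with real data you would pass in a short slice of decoded samples.

```python
import cmath
import math


def energy_fraction_above(samples, rate, cutoff_hz):
    """Return the fraction of spectral energy above `cutoff_hz`.

    Uses a naive O(N^2) DFT, so keep `samples` short (about 1000 values).
    """
    n = len(samples)
    total = 0.0
    above = 0.0
    # Only bins 1 .. n/2 carry distinct information for a real signal;
    # bin 0 (DC offset) is skipped on purpose.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if k * rate / n > cutoff_hz:
            above += power
    return above / total if total else 0.0


def tone(freq_hz, rate, n):
    """Synthesize `n` samples of a unit-amplitude sine wave (test data)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


if __name__ == "__main__":
    rate = 192000
    # A 60 kHz tone can only be represented if the data is truly 192 kHz
    print(energy_fraction_above(tone(60000, rate, 512), rate, 24000))
    print(energy_fraction_above(tone(12000, rate, 512), rate, 24000))
```

A recording that was really sampled at 48 kHz and then relabelled or upsampled should show a fraction close to zero above 24 kHz, which matches the symptom described above.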

How can I record sound at 192,000 samples per second with PyAudio? Suggestions for other ways to capture sound with Python 3 are also welcome.

Comments:

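Unfortunately I have never run into this problem, and I have no device at hand to test with. But one thing I would try is to check if the sample rate is considered "supported" by PyAudio for the recording device you are using.

Support seems to be somewhat problematic. p.get_device_info_by_index(2) tells me that defaultSampleRate is 44100 and maxInputChannels is 0. When I try p.is_format_supported it returns an error (but not when I check the built-in sound card).

@user2936487 From your code it looks like you don't specify a device index when opening the stream. I would try p.open(..., input_device_index=dev_idx), making sure you use the right one, or loop over the indices and try each. The same goes for gathering device info.

When I open the stream with p.open(..., input_device_index=dev_idx) I get the following error: OSError: [Errno Invalid number of channels] -9998, similar to the one I get when trying p.is_format_supported(...): ValueError: ('Invalid number of channels', -9998).

I just got that as well while trying to get PyAudio to work on OS X. That, and [Errno Input overflowed] -9981. I managed to find a working configuration by iterating over various sample rates, channel counts and device indices (just got it working). I'll try to clean up the code and come up with an "answer"; maybe it will help you.

Answer 1: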

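This is not a definitive answer, but an attempt to help you track down the problem yourself.

When trying to reproduce your problem with PyAudio on OS X, I always ran into [Errno Input overflowed] -9981 (like several other people, it seems). Configurations that p.is_format_supported() reported as OK also led to these errors. So I wrote a script that simply tries to record using every possible combination of recording settings.

The script does this defensively and saves the results to files named after the recording settings.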

import os
import pyaudio
import sys

# === These parameters will be permuted ===========
DEVICES = [0, 1, 2]
RATES = [44100, 48000, 192000]
FORMATS = ['Float32', 'Int32', 'Int24', 'Int16', 'Int8', 'UInt8']
CHANNELS = [1, 2]
# =================================================

CHUNK = 1024
COLUMNS = (('filename', 30),
           ('result', 9),
           ('dev', 5),
           ('rate', 8),
           ('format', 9),
           ('channels', 10),
           ('chunk', 7),
           ('reason', 0))
STATUS_MSG = "Recording... "

pa = pyaudio.PyAudio()


def get_format(format):
    fmt = getattr(pyaudio, 'pa%s' % format)
    return fmt


def record(filename=None,
           duration=5,
           dev=0,
           rate=44100,
           format='Float32',
           channels=2,
           chunk=1024,):
    """Record `duration` seconds of audio from the device with index `dev`.
    Store the result in a file named according to recording settings.
    """
    if filename is None:
        filename = "dev{dev}-{rate}-{format}-{channels}ch.raw".format(**locals())
    result = 'FAILURE'
    reason = ''

    # Raw sample data is bytes, so open the file in binary mode
    outfile = open(filename, 'wb')
    print(STATUS_MSG, end=' ')
    sys.stdout.flush()

    try:
        stream = pa.open(input_device_index=dev,
                         rate=rate,
                         format=get_format(format),
                         channels=channels,
                         frames_per_buffer=chunk,
                         input=True,
                         )

        try:
            # Integer division: range() needs a whole number of chunks
            for i in range(0, rate // chunk * duration):
                a = stream.read(chunk)
                outfile.write(a)
            result = 'SUCCESS'
        # Catch exceptions when trying to read from stream
        except Exception as e:
            reason = "'%s'" % e
        finally:
            stream.close()
    # Catch exceptions when trying to even open the stream
    except Exception as e:
        reason = "'%s'" % e

    outfile.close()

    # Don't leave files behind for unsuccessful attempts
    if result == 'FAILURE':
        os.remove(filename)
        filename = ''

    # Pad every column value to its configured width
    info = {}
    for col_name, width in COLUMNS:
        info[col_name] = str(locals()[col_name]).ljust(width)

    msg = "{filename}{result}{dev}{rate}{format}{channels}{chunk}{reason}"
    print(msg.format(**info))

def main():
    # Build the header line
    header = 'STATUS'.ljust(len(STATUS_MSG) + 1)
    for col_name, width in COLUMNS:
        header += col_name.upper().ljust(width)
    print(header)
    print("=" * len(header))

    # Record samples for all permutations of our parameter lists
    for dev in DEVICES:
        for rate in RATES:
            for format in FORMATS:
                for channels in CHANNELS:
                    record(duration=2,
                           dev=dev,
                           rate=rate,
                           format=format,
                           channels=channels,
                           chunk=CHUNK)

if __name__ == '__main__':
    main()

Sample output (simplified):

STATUS        FILENAME                      RESULT   DEV  RATE    FORMAT   CHANNELS  CHUNK  REASON
==================================================================================================
Recording...  dev0-44100-Float32-1ch.raw    SUCCESS  0    44100   Float32  1         1024
Recording...  dev0-44100-Float32-2ch.raw    SUCCESS  0    44100   Float32  2         1024
Recording...  dev0-44100-Int16-1ch.raw      SUCCESS  0    44100   Int16    1         1024
Recording...  dev0-44100-Int16-2ch.raw      SUCCESS  0    44100   Int16    2         1024
Recording...                                FAILURE  0    192000  Float32  1         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  0    192000  Float32  2         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  0    192000  Int16    1         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  0    192000  Int16    2         1024   '[Errno Input overflowed] -9981'
Recording...  dev1-44100-Float32-1ch.raw    SUCCESS  1    44100   Float32  1         1024
Recording...  dev1-44100-Float32-2ch.raw    SUCCESS  1    44100   Float32  2         1024
Recording...  dev1-44100-Int16-1ch.raw      SUCCESS  1    44100   Int16    1         1024
Recording...  dev1-44100-Int16-2ch.raw      SUCCESS  1    44100   Int16    2         1024
Recording...                                FAILURE  1    192000  Float32  1         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  1    192000  Float32  2         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  1    192000  Int16    1         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  1    192000  Int16    2         1024   '[Errno Input overflowed] -9981'
Recording...                                FAILURE  2    44100   Float32  1         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    44100   Float32  2         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    44100   Int16    1         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    44100   Int16    2         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    192000  Float32  1         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    192000  Float32  2         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    192000  Int16    1         1024   '[Errno Invalid number of channels] -9998'
Recording...                                FAILURE  2    192000  Int16    2         1024   '[Errno Invalid number of channels] -9998'
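The .raw files from successful runs carry no header, so most tools cannot open them directly. A possible follow-up step, sketched here under a few assumptions, is to wrap them in a WAV container so their spectrum can be inspected in Audacity or similar. The helper names `parse_raw_name` and `raw_to_wav` are hypothetical, the file name is assumed to follow the `dev0-44100-Int16-2ch.raw` pattern shown above, and the samples are assumed to be little-endian (as on typical x86 machines).

```python
import os
import tempfile
import wave

# Bytes per sample for the integer formats the wave module can store.
# Float32 is left out because the stdlib wave module only writes integer PCM,
# and Int8 is left out because 8-bit WAV data is unsigned.
SAMPLE_WIDTHS = {"UInt8": 1, "Int16": 2, "Int24": 3, "Int32": 4}


def parse_raw_name(filename):
    """Split 'dev0-44100-Int16-2ch.raw' into (rate, format name, channels)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    _dev, rate, fmt, channels = stem.split("-")
    return int(rate), fmt, int(channels[:-2])  # strip the trailing "ch"


def raw_to_wav(raw_path, wav_path=None):
    """Wrap the raw PCM bytes in a WAV container and return the WAV path."""
    rate, fmt, channels = parse_raw_name(raw_path)
    if fmt not in SAMPLE_WIDTHS:
        raise ValueError("cannot store %s samples with the wave module" % fmt)
    if wav_path is None:
        wav_path = os.path.splitext(raw_path)[0] + ".wav"
    with open(raw_path, "rb") as raw, wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(SAMPLE_WIDTHS[fmt])
        wav.setframerate(rate)
        wav.writeframes(raw.read())
    return wav_path


if __name__ == "__main__":
    # Demo with synthetic data: 480 frames of 16-bit mono silence
    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(tmp, "dev0-192000-Int16-1ch.raw")
        with open(raw_path, "wb") as f:
            f.write(b"\x00\x00" * 480)
        print(raw_to_wav(raw_path))
```

Float32 recordings would need a different writer, since the standard library's wave module does not handle floating-point samples.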

Comments:

I tried your script with a small modification (added 11 to the devices). pa.get_device_info_by_index(11) looks better today (partial output): ... 'defaultSampleRate': 192000.0, 'hostApi': 2, 'index': 11, 'maxInputChannels': 2, 'maxOutputChannels': 1,... and pa.get_host_api_info_by_index(2) gives this:

'defaultInputDevice': 11, 'defaultOutputDevice': 11, 'deviceCount': 1, 'index': 2, 'name': 'JACK Audio Connection Kit', 'structVersion': 1, 'type': 12

I can even open a stream: stream = pa.open(input_device_index=11,rate=192000,format=4,channels=1,frames_per_buffer=2048,input=True), and occasionally read a chunk: stream.read(1024), but most of the time it crashes Python: python3: malloc.c:2369: sysmalloc: ... Aborted (core dumped). To open a stream, it seems I just need to find the right input_device_index (which may vary) and then match the parameter values to those reported when starting JACK (rate = 192000, format = paInt24). However, the crashes when reading a chunk made me give up. I will probably record with Audacity instead. Many thanks for all your help, Lukas.

@user2936487 You're welcome, and thanks for the feedback!
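Since the right input_device_index may change between sessions, one option is to look it up by host API name instead of hard-coding 11. The following is only a sketch: `find_input_devices` and `main` are hypothetical helpers, and they rely on the PyAudio methods already used in this thread (get_device_info_by_index, get_host_api_info_by_index) plus get_device_count.

```python
def find_input_devices(pa, host_api_name):
    """Return (index, info) pairs for input devices of the named host API.

    `pa` is expected to behave like a pyaudio.PyAudio instance.
    """
    matches = []
    for index in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(index)
        api = pa.get_host_api_info_by_index(info["hostApi"])
        # Skip output-only devices (maxInputChannels == 0)
        if info["maxInputChannels"] > 0 and host_api_name in api["name"]:
            matches.append((index, info))
    return matches


def main():
    import pyaudio  # imported here so the helper above has no hard dependency

    pa = pyaudio.PyAudio()
    try:
        for index, info in find_input_devices(pa, "JACK"):
            print(index, info["name"], info["defaultSampleRate"],
                  info["maxInputChannels"])
    finally:
        pa.terminate()
```

Calling main() on a machine with PyAudio installed and JACK running should list the candidate indices to pass as input_device_index. This only locates the device; it does nothing about the crash on stream.read described above.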
