How to convert a numpy array to an mp3 file

Posted 2023-02-25


[Title]: How to convert a numpy array to a mp3 file [Posted]: 2021-02-14 00:59:25 [Question]:

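I am using the soundcard library to record my microphone input. The recording is returned as a NumPy array, and I want to take that audio and save it as an mp3 file.

Code: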

import soundcard as sc
import numpy 
import threading


speakers = sc.all_speakers() # Gets a list of the systems speakers
default_speaker = sc.default_speaker() # Gets the default speaker
mics = sc.all_microphones() # Gets a list of all the microphones


default_mic = sc.get_microphone('Headset Microphone (Arctis 7 Chat)') # Gets a specific microphone by name


# Records the default microphone
def record_mic():
  print('Recording...')
  with default_mic.recorder(samplerate=48000) as mic, default_speaker.player(samplerate=48000) as sp:
      for _ in range(1000000000000):
          data = mic.record(numframes=None) # 'None' creates zero latency
          sp.play(data) 
          
          # Save the mp3 file here 


recordThread = threading.Thread(target=record_mic)
recordThread.start()

[Comments on the question]:

[Answer 1]:

Using Scipy (to a wav file)

You can easily convert to wav and then convert the wav to mp3 as a separate step. More details here.

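import numpy as np
from scipy.io.wavfile import write

samplerate = 44100  # samples per second
fs = 100            # tone frequency in Hz
t = np.linspace(0., 1., samplerate)  # one second of sample times

amplitude = np.iinfo(np.int16).max  # full-scale value for 16-bit audio
data = amplitude * np.sin(2. * np.pi * fs * t)

write("example.wav", samplerate, data.astype(np.int16))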
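
For the recording in the question, the array would need to be converted to 16-bit integers before writing. The following is a minimal sketch, assuming the recorded array holds floating-point samples in [-1, 1] (which, as far as I know, is what soundcard returns). The helper names `float_to_int16` and `save_wav` are hypothetical and not part of scipy or soundcard.

```python
import numpy as np


def float_to_int16(samples):
    """Scale float samples in [-1.0, 1.0] to 16-bit integers."""
    # Clip first so out-of-range values cannot wrap around after the cast.
    clipped = np.clip(np.asarray(samples, dtype=np.float64), -1.0, 1.0)
    return (clipped * np.iinfo(np.int16).max).astype(np.int16)


def save_wav(path, samplerate, samples):
    """Write float samples to a 16-bit WAV file (hypothetical helper)."""
    from scipy.io.wavfile import write  # imported lazily; requires scipy
    write(path, samplerate, float_to_int16(samples))
```

Calling `save_wav("recording.wav", 48000, data)` with the array returned by `mic.record()` would then produce a wav file that can be converted to mp3 separately.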

Using pydub (to mp3)

Try this function from this excellent thread:

import pydub 
import numpy as np

def write(f, sr, x, normalized=False):
    """numpy array to MP3"""
    channels = 2 if (x.ndim == 2 and x.shape[1] == 2) else 1
    if normalized:  # normalized array - each item should be a float in [-1, 1)
        y = np.int16(x * 2 ** 15)
    else:
        y = np.int16(x)
    song = pydub.AudioSegment(y.tobytes(), frame_rate=sr, sample_width=2, channels=channels)
    song.export(f, format="mp3", bitrate="320k")

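# sr (sample rate in Hz) and x (the sample array) must be defined before calling write().
# Example contents of a stereo int16 array x:
#[[-225  707]
# [-234  782]
# [-205  755]
# ..., 
# [ 303   89]
# [ 337   69]
# [ 274   89]]

write('out2.mp3', sr, x)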

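Note: sample_width=2 describes the input samples, which are 16-bit here. The output MP3 will be 16-bit, because MP3 is always 16-bit. As @Arty suggested, you can set sample_width=3 for 24-bit input, but the bytes passed in must then actually be 24-bit samples rather than the np.int16 conversion used above.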
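
The function above expects `sr` and `x` to exist already. Below is a hedged sketch of how they might be produced from the recording loop in the question: the chunks returned by `mic.record()` are collected in a list and joined before export. The names `join_chunks` and `export_mp3` are hypothetical, the silent chunks are stand-ins for real recordings, and the export step assumes pydub and ffmpeg are installed.

```python
import numpy as np


def join_chunks(chunks):
    """Concatenate recorded chunks (each frames x channels) along the time axis."""
    return np.concatenate(chunks, axis=0)


def export_mp3(path, sr, x):
    """Export float samples in [-1.0, 1.0] as MP3 (hypothetical helper)."""
    import pydub  # imported lazily so join_chunks works without pydub
    channels = x.shape[1] if x.ndim == 2 else 1
    # Clip, then scale to the 16-bit range that sample_width=2 expects.
    pcm = (np.clip(x, -1.0, 1.0) * 32767).astype(np.int16)
    song = pydub.AudioSegment(pcm.tobytes(), frame_rate=sr,
                              sample_width=2, channels=channels)
    song.export(path, format="mp3")


# Stand-in for chunks returned by mic.record(): two stereo chunks of silence.
sr = 48000
chunks = [np.zeros((480, 2), dtype=np.float32),
          np.zeros((240, 2), dtype=np.float32)]
x = join_chunks(chunks)
# export_mp3("out.mp3", sr, x)  # uncomment once pydub and ffmpeg are available
```

In the question's loop, each `data` array would be appended to `chunks`, and the loop would need a stop condition so that the export can run afterwards.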

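[Discussion]:

I think sample_width=2 controls that the data is 16-bit. To support 24-bit you probably just need to change it to sample_width=3.

24-bit is common in WAV, but I'm not sure that is the case for mp3. As far as I know they are 16-bit. But I could be wrong.

Yes, MP3 is 16-bit, but I think the phrase "Note: It only works for 16-bit files" suggests that your code does not support 24-bit INPUT data, because the line with sample_width=2 relates to the input samples, not the output MP3. So to support 24-bit input samples, just use sample_width=3; the output MP3 will be 16-bit because MP3 is always 16-bit.

Ah, okay, my bad. I updated it to reduce confusion.

When I run the code I get the error "Exception has occurred: NameError name 'sr' is not defined".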
