Remove unwanted frequencies from tone

Posted: 2021-03-29 12:24:16

Question:

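I am trying to generate a "beep" with a constant tone of 2350 Hz. I am using the code below (which I got here) to generate a WAV file containing this tone with a duration of 0.5 seconds.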

import math
import wave
import struct

# Audio will contain a long list of samples (i.e. floating point numbers describing the
# waveform).  If you were working with a very long sound you'd want to stream this to
# disk instead of buffering it all in memory like this.  But most sounds will fit in
# memory.
audio = []
sample_rate = 44100.0


def append_silence(duration_milliseconds=500):
    """
    Adding silence is easy - we add zeros to the end of our array
    """
    num_samples = duration_milliseconds * (sample_rate / 1000.0)

    for x in range(int(num_samples)): 
        audio.append(0.0)

    return


def append_sinewave(
        freq=440.0, 
        duration_milliseconds=500, 
        volume=1.0):
    """
    The sine wave generated here is the standard beep.  If you want something
    more aggressive you could try a square or sawtooth waveform.   Though there
    are some rather complicated issues with making high quality square and
    sawtooth waves... which we won't address here :)
    """ 

    global audio # using global variables isn't cool.

    num_samples = duration_milliseconds * (sample_rate / 1000.0)

    for x in range(int(num_samples)):
        audio.append(volume * math.sin(2 * math.pi * freq * ( x / sample_rate )))

    return


def save_wav(file_name):
    # Open up a wav file
    wav_file=wave.open(file_name,"w")

    # wav params
    nchannels = 1

    sampwidth = 2

    # 44100 is the industry standard sample rate - CD quality.  If you need to
    # save on file size you can adjust it downwards. The standard for low quality
    # is 8000 or 8kHz.
    nframes = len(audio)
    comptype = "NONE"
    compname = "not compressed"
    wav_file.setparams((nchannels, sampwidth, sample_rate, nframes, comptype, compname))

    # WAV files here are using short, 16 bit, signed integers for the
    # sample size.  So we multiply the floating point data we have by 32767, the
    # maximum value for a short integer.  NOTE: It is theoretically possible to
    # use the floating point -1.0 to 1.0 data directly in a WAV file but not
    # obvious how to do that using the wave module in python.
    # '<h' forces little-endian 16-bit samples, which is what WAV expects.
    for sample in audio:
        wav_file.writeframes(struct.pack('<h', int( sample * 32767.0 )))

    wav_file.close()

    return


append_sinewave(volume=1, freq=2350)
save_wav("output.wav")

When I run the code below (using Librosa) to generate a spectrogram of the WAV file, I see:

Spectrogram: [image]

Code:

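import librosa
import librosa.display
import numpy as np
beepSoundPath = "output.wav"  # assumed: the file written by save_wav() above
beepData,beep_sample_rate = librosa.load(beepSoundPath, sr=44100)
D = librosa.stft(beepData)
S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)
librosa.display.specshow(S_db)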

The problem is the extra frequencies at the beginning and end of the spectrogram. How can I get rid of these unwanted frequencies?

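Comments:

Does the waveform look correct when you open it in an audio editing program? I am pretty sure this is just an artifact of how the FFT works, because the data is finite.

Is there a way to counteract this effect by adjusting the frequency? When I display the waveform with _ = librosa.display.waveplot(beepData,sr=beep_sample_rate), it looks correct.

Unless you have an infinitely long signal, an FFT will never give you a single frequency at the output. The shorter the sample length, the "wider" your frequency bins become. As for why the result is mirrored about Fs/2, see this question: dsp.stackexchange.com/questions/4825/why-is-the-fft-mirrored/…

Answer 1: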

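These are artifacts of the STFT / FFT process, caused by discontinuities at the start and end of the analysis windows. You can try librosa.stft(..., center=False), which should eliminate the one at the beginning. You may then also need to trim or ignore the last output segments, by an amount corresponding to at least half of the n_fft parameter.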
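To make the explanation above concrete, here is a minimal standard-library sketch (not librosa's implementation) that compares a Hann-windowed frame lying entirely inside the 2350 Hz tone with a frame whose first half is padding. The frame length of 512, the helper names, and the use of zero padding are assumptions chosen for illustration; the padding actually applied at the signal edges depends on the library and its version.

```python
import cmath
import math

SAMPLE_RATE = 44100.0
FREQ = 2350.0
N = 512  # hypothetical frame length, kept small so the naive DFT stays fast


def hann(n, size):
    """Periodic Hann window value at index n."""
    return 0.5 - 0.5 * math.cos(2 * math.pi * n / size)


def dft_magnitudes(frame):
    """Naive DFT magnitudes for bins 0 .. len(frame) // 2."""
    size = len(frame)
    mags = []
    for k in range(size // 2 + 1):
        acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                  for n in range(size))
        mags.append(abs(acc))
    return mags


def out_of_band_fraction(frame, width=4):
    """Fraction of spectral energy more than `width` bins away from the tone."""
    windowed = [s * hann(n, len(frame)) for n, s in enumerate(frame)]
    mags = dft_magnitudes(windowed)
    peak = round(FREQ * len(frame) / SAMPLE_RATE)  # bin closest to 2350 Hz
    total = sum(m * m for m in mags)
    inside = sum(m * m for k, m in enumerate(mags) if abs(k - peak) <= width)
    return 1.0 - inside / total


tone = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE) for n in range(N)]

# Interior frame: the tone fills the whole frame.
interior_leak = out_of_band_fraction(tone)

# Edge frame: first half is zero padding, the tone starts mid-frame.
edge_frame = [0.0] * (N // 2) + tone[: N // 2]
edge_leak = out_of_band_fraction(edge_frame)

print(f"interior frame, energy away from the tone: {interior_leak:.6f}")
print(f"edge frame, energy away from the tone:     {edge_leak:.6f}")
```

The edge frame spreads noticeably more energy across bins far from 2350 Hz than the interior frame does, which is the kind of smearing visible in the first and last columns of the spectrogram.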

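To see which spectrogram columns are affected, the sketch below counts the frames that overlap the padded region. It assumes the centered framing convention in which frame t is centered on sample t * hop_length and the signal is padded by n_fft // 2 on each side, together with n_fft=2048 and hop_length=512 (a quarter of n_fft). The function name is hypothetical, and the result is worth checking against the shape of the array returned by your librosa version.

```python
def frames_touching_padding(num_samples, n_fft=2048, hop_length=512):
    """Return (frame count, indices of frames that overlap the padding).

    Assumes centered framing: frame t covers original samples
    [t * hop_length - n_fft // 2, t * hop_length + n_fft // 2).
    """
    half = n_fft // 2
    num_frames = 1 + num_samples // hop_length
    affected = []
    for t in range(num_frames):
        start = t * hop_length - half
        end = start + n_fft
        if start < 0 or end > num_samples:
            affected.append(t)
    return num_frames, affected


# The beep in the question: 0.5 s at 44100 Hz = 22050 samples.
num_frames, affected = frames_touching_padding(22050)
clean = [t for t in range(num_frames) if t not in affected]

print("frames in total:", num_frames)
print("frames overlapping the padding:", affected)
print("first and last clean frame:", clean[0], clean[-1])
```

Under these assumptions the 22050-sample beep yields 44 frames, of which frames 0, 1, 42 and 43 overlap the padding. Dropping those columns from the spectrogram leaves 40 frames, the same number obtained when only frames that fit entirely inside the signal are computed.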