Remove unwanted frequencies from tone
【Title】: Remove unwanted frequencies from tone
【Posted】: 2021-03-29 12:24:16
【Question】: I am trying to generate a "beep" with a constant tone of 2350 Hz. I am using the code below (which I got here) to generate a WAV file containing this tone with a duration of 0.5 seconds.
import math
import wave
import struct

# Audio will contain a long list of samples (i.e. floating point numbers describing the
# waveform). If you were working with a very long sound you'd want to stream this to
# disk instead of buffering it all in memory like this. But most sounds will fit in
# memory.
audio = []
sample_rate = 44100.0


def append_silence(duration_milliseconds=500):
    """
    Adding silence is easy - we add zeros to the end of our array
    """
    num_samples = duration_milliseconds * (sample_rate / 1000.0)
    for x in range(int(num_samples)):
        audio.append(0.0)
    return


def append_sinewave(
        freq=440.0,
        duration_milliseconds=500,
        volume=1.0):
    """
    The sine wave generated here is the standard beep. If you want something
    more aggressive you could try a square or sawtooth waveform. Though there
    are some rather complicated issues with making high quality square and
    sawtooth waves... which we won't address here :)
    """
    global audio  # using global variables isn't cool.
    num_samples = duration_milliseconds * (sample_rate / 1000.0)
    for x in range(int(num_samples)):
        audio.append(volume * math.sin(2 * math.pi * freq * (x / sample_rate)))
    return


def save_wav(file_name):
    # Open up a wav file
    wav_file = wave.open(file_name, "w")
    # wav params
    nchannels = 1
    sampwidth = 2
    # 44100 is the industry standard sample rate - CD quality. If you need to
    # save on file size you can adjust it downwards. The standard for low quality
    # is 8000 or 8kHz.
    nframes = len(audio)
    comptype = "NONE"
    compname = "not compressed"
    wav_file.setparams((nchannels, sampwidth, sample_rate, nframes, comptype, compname))
    # WAV files here are using short, 16 bit, signed integers for the
    # sample size. So we multiply the floating point data we have by 32767, the
    # maximum value for a short integer. NOTE: It is theoretically possible to
    # use the floating point -1.0 to 1.0 data directly in a WAV file but not
    # obvious how to do that using the wave module in python.
    for sample in audio:
        wav_file.writeframes(struct.pack('h', int(sample * 32767.0)))
    wav_file.close()
    return


append_sinewave(volume=1, freq=2350)
save_wav("output.wav")
When I run the code below (using Librosa) to generate a spectrogram of the WAV file, I see this:
Spectrogram:
Code:
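import numpy as np
import librosa
import librosa.display

beepSoundPath = "output.wav"  # assumed: the file written by save_wav() above
beepData, beep_sample_rate = librosa.load(beepSoundPath, sr=44100)
D = librosa.stft(beepData)
S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)
librosa.display.specshow(S_db)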
The problem is the extra frequencies at the beginning and end of the spectrogram. How can I get rid of these unwanted frequencies?
【Comments】:
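Does the waveform look correct when you open it in an audio editor? I'm fairly sure this is just an artifact of how the FFT works, because the data is finite.
Is there a way to counteract this effect by adjusting the frequency? When I display the waveform with _ = librosa.display.waveplot(beepData, sr=beep_sample_rate), it looks correct.
Unless you have an infinitely long signal, the FFT will never give you a single frequency at the output. The shorter your sample length, the "wider" your frequency bins. As for why the result is mirrored about Fs/2, see this question: dsp.stackexchange.com/questions/4825/why-is-the-fft-mirrored/…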
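The point made in these comments can be checked numerically. The following is a minimal sketch, not code from the thread: it uses a naive pure-Python DFT with a Hann window, an illustrative frame length of 256 samples, and hypothetical helper names. It compares a frame that lies entirely inside the 2350 Hz tone with a frame whose first half is silence, which is roughly what a frame at the very start of the signal looks like if the analysis pads with zeros (the padding mode is an assumption here).

```python
import cmath
import math

SAMPLE_RATE = 44100
FREQ = 2350.0
N_FFT = 256  # kept small so the naive DFT below stays fast (illustrative value)


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def off_tone_energy_fraction(frame, guard_bins=4):
    """Fraction of spectral energy more than guard_bins away from the tone's bin."""
    window = hann(len(frame))
    mags = dft_magnitudes([s * w for s, w in zip(frame, window)])
    tone_bin = round(FREQ * len(frame) / SAMPLE_RATE)
    total = sum(m * m for m in mags)
    near = sum(m * m for k, m in enumerate(mags) if abs(k - tone_bin) <= guard_bins)
    return 1.0 - near / total


tone = [math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE) for i in range(N_FFT)]

# Frame fully inside the tone: only the window's own leakage is present.
inside_leak = off_tone_energy_fraction(tone)

# Frame straddling the start of the tone: half silence, then the tone begins.
edge_frame = [0.0] * (N_FFT // 2) + tone[: N_FFT // 2]
edge_leak = off_tone_energy_fraction(edge_frame)

print("bin width (Hz):", SAMPLE_RATE / N_FFT)
print("energy away from the tone, frame inside the tone:", inside_leak)
print("energy away from the tone, frame at the signal edge:", edge_leak)
```

Even the frame inside the tone spreads over a few bins, because each bin is SAMPLE_RATE / N_FFT hertz wide and the frame is finite. The edge frame spreads far more energy across the spectrum, which is consistent with the vertical streaks at the start and end of the spectrogram.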
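【Answer 1】:
These are artifacts of the STFT / FFT process, caused by discontinuities at the start and end of the window. You can try librosa.stft(..., center=False), which should eliminate the one at the start. You may then also need to trim or ignore the end of the output, by at least half of the n_fft parameter.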
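To make the "half of n_fft" figure concrete, here is a rough sketch of the frame arithmetic. It is not from the answer: the helper names are hypothetical, and it assumes librosa's documented defaults (n_fft=2048, hop_length=n_fft // 4) and that center=True pads n_fft // 2 samples on each side of the signal. No librosa call is made; the code only counts which centered frames overlap the padding.

```python
def frames_touching_padding(num_samples, n_fft=2048, hop_length=512):
    """Return (total_frames, indices of centered frames that overlap the padding)."""
    half = n_fft // 2
    total = 1 + num_samples // hop_length  # frame count when padding n_fft // 2 per side
    touching = []
    for i in range(total):
        start = i * hop_length - half  # first sample covered, in unpadded coordinates
        end = start + n_fft            # one past the last sample covered
        if start < 0 or end > num_samples:
            touching.append(i)
    return total, touching


def frames_without_padding(num_samples, n_fft=2048, hop_length=512):
    """Frame count when every frame must lie entirely inside the signal."""
    return 1 + (num_samples - n_fft) // hop_length


num_samples = int(0.5 * 44100)  # the 0.5 second beep from the question
total, touching = frames_touching_padding(num_samples)
print(total, touching)                      # 44 [0, 1, 42, 43]
print(frames_without_padding(num_samples))  # 40
```

Under these assumptions only the first two and last two columns of the centered spectrogram include padded samples, so dropping them, or framing without centering, leaves the 40 columns computed purely from the tone.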