FFT算法得到错误的声音频率值
Posted
技术标签:
【中文标题】FFT算法得到错误的声音频率值【英文标题】:FFT algorithm getting wrong sound frequency value 【发布时间】:2012-11-15 04:05:41 【问题描述】:我在 440Hz audio file 上运行了这个 FFT algorithm。但我得到了一个意想不到的声音频率:510Hz。
-
包含 .wav 的
byteArray
是否正确转换为 2 个双精度数组(Re & Im 部分)?虚数数组只包含 0。
我假设最高声音频率是xRe
数组的最大值:请看run()
函数的最末尾?也许这是我的错误:它是平均水平还是类似的东西?
那有什么问题呢?
更新: Re+Im 的最大总和在 index = 0 处,所以我得到频率 = 0;
Whole project:包含 .wav -> 打开即可运行。
using System;
using System.Net;
using System.IO;
namespace FFT
/**
* Performs an in-place complex FFT.
*
* Released under the MIT License
*
* Copyright (c) 2010 Gerald T. Beauregard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to
* deal in the Software without restriction, including without limitation the
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
* sell copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
public class FFT2
// Element for linked list in which we store the
// input/output data. We use a linked list because
// for sequential access it's faster than array index.
class FFTElement
public double re = 0.0; // Real component
public double im = 0.0; // Imaginary component
public FFTElement next; // Next element in linked list
public uint revTgt; // Target position post bit-reversal
private static int sampleRate;
private uint m_logN = 0; // log2 of FFT size
private uint m_N = 0; // FFT size
private FFTElement[] m_X; // Vector of linked list elements
/**
*
*/
public FFT2()
/**
* Initialize class to perform FFT of specified size.
*
* @param logN Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
*/
public void init(uint logN)
m_logN = logN;
m_N = (uint)(1 << (int)m_logN);
// Allocate elements for linked list of complex numbers.
m_X = new FFTElement[m_N];
for (uint k = 0; k < m_N; k++)
m_X[k] = new FFTElement();
// Set up "next" pointers.
for (uint k = 0; k < m_N - 1; k++)
m_X[k].next = m_X[k + 1];
// Specify target for bit reversal re-ordering.
for (uint k = 0; k < m_N; k++)
m_X[k].revTgt = BitReverse(k, logN);
/**
* Performs in-place complex FFT.
*
* @param xRe Real part of input/output
* @param xIm Imaginary part of input/output
* @param inverse If true, do an inverse FFT
*/
public void run(double[] xRe, double[] xIm, bool inverse = false)
uint numFlies = m_N >> 1; // Number of butterflies per sub-FFT
uint span = m_N >> 1; // Width of the butterfly
uint spacing = m_N; // Distance between start of sub-FFTs
uint wIndexStep = 1; // Increment for twiddle table index
// Copy data into linked complex number objects
// If it's an IFFT, we divide by N while we're at it
FFTElement x = m_X[0];
uint k = 0;
double scale = inverse ? 1.0 / m_N : 1.0;
while (x != null)
x.re = scale * xRe[k];
x.im = scale * xIm[k];
x = x.next;
k++;
// For each stage of the FFT
for (uint stage = 0; stage < m_logN; stage++)
// Compute a multiplier factor for the "twiddle factors".
// The twiddle factors are complex unit vectors spaced at
// regular angular intervals. The angle by which the twiddle
// factor advances depends on the FFT stage. In many FFT
// implementations the twiddle factors are cached, but because
// array lookup is relatively slow in C#, it's just
// as fast to compute them on the fly.
double wAngleInc = wIndexStep * 2.0 * Math.PI / m_N;
if (inverse == false)
wAngleInc *= -1;
double wMulRe = Math.Cos(wAngleInc);
double wMulIm = Math.Sin(wAngleInc);
for (uint start = 0; start < m_N; start += spacing)
FFTElement xTop = m_X[start];
FFTElement xBot = m_X[start + span];
double wRe = 1.0;
double wIm = 0.0;
// For each butterfly in this stage
for (uint flyCount = 0; flyCount < numFlies; ++flyCount)
// Get the top & bottom values
double xTopRe = xTop.re;
double xTopIm = xTop.im;
double xBotRe = xBot.re;
double xBotIm = xBot.im;
// Top branch of butterfly has addition
xTop.re = xTopRe + xBotRe;
xTop.im = xTopIm + xBotIm;
// Bottom branch of butterly has subtraction,
// followed by multiplication by twiddle factor
xBotRe = xTopRe - xBotRe;
xBotIm = xTopIm - xBotIm;
xBot.re = xBotRe * wRe - xBotIm * wIm;
xBot.im = xBotRe * wIm + xBotIm * wRe;
// Advance butterfly to next top & bottom positions
xTop = xTop.next;
xBot = xBot.next;
// Update the twiddle factor, via complex multiply
// by unit vector with the appropriate angle
// (wRe + j wIm) = (wRe + j wIm) x (wMulRe + j wMulIm)
double tRe = wRe;
wRe = wRe * wMulRe - wIm * wMulIm;
wIm = tRe * wMulIm + wIm * wMulRe;
numFlies >>= 1; // Divide by 2 by right shift
span >>= 1;
spacing >>= 1;
wIndexStep <<= 1; // Multiply by 2 by left shift
// The algorithm leaves the result in a scrambled order.
// Unscramble while copying values from the complex
// linked list elements back to the input/output vectors.
x = m_X[0];
while (x != null)
uint target = x.revTgt;
xRe[target] = x.re;
xIm[target] = x.im;
x = x.next;
//looking for max IS THIS IS FREQUENCY
double max = 0, index = 0;
for (int i = 0; i < xRe.Length; i++)
if (xRe[i] + xIm[i] > max)
max = xRe[i]*xRe[i] + xIm[i]*xIm[i];
index = i;
max = Math.Sqrt(max);
/* if the peak is at bin index i then the corresponding
frequency will be i * Fs / N whe Fs is the sample rate in Hz and N is the FFT size.*/
//DONT KNOW WHY THE BIGGEST VALUE IS IN THE BEGINNING
Console.WriteLine("max "+ max+" index " + index + " m_logN" + m_logN + " " + xRe[0]);
max = index * sampleRate / m_logN;
Console.WriteLine("max " + max);
/**
* Do bit reversal of specified number of places of an int
* For example, 1101 bit-reversed is 1011
*
* @param x Number to be bit-reverse.
* @param numBits Number of bits in the number.
*/
private uint BitReverse(
uint x,
uint numBits)
uint y = 0;
for (uint i = 0; i < numBits; i++)
y <<= 1;
y |= x & 0x0001;
x >>= 1;
return y;
public static void Main(String[] args)
// BinaryReader reader = new BinaryReader(System.IO.File.OpenRead(@"C:\Users\Duke\Desktop\e.wav"));
BinaryReader reader = new BinaryReader(File.Open(@"440.wav", FileMode.Open));
//Read the wave file header from the buffer.
int chunkID = reader.ReadInt32();
int fileSize = reader.ReadInt32();
int riffType = reader.ReadInt32();
int fmtID = reader.ReadInt32();
int fmtSize = reader.ReadInt32();
int fmtCode = reader.ReadInt16();
int channels = reader.ReadInt16();
sampleRate = reader.ReadInt32();
int fmtAvgBPS = reader.ReadInt32();
int fmtBlockAlign = reader.ReadInt16();
int bitDepth = reader.ReadInt16();
if (fmtSize == 18)
// Read any extra values
int fmtExtraSize = reader.ReadInt16();
reader.ReadBytes(fmtExtraSize);
int dataID = reader.ReadInt32();
int dataSize = reader.ReadInt32();
// Store the audio data of the wave file to a byte array.
byte[] byteArray = reader.ReadBytes(dataSize);
/* for (int i = 0; i < byteArray.Length; i++)
Console.Write(byteArray[i] + " ");
*/
byte[] data = byteArray;
double[] arrRe = new double[data.Length];
for (int i = 0; i < arrRe.Length; i++)
arrRe[i] = data[i] / 32768.0;
double[] arrI = new double[data.Length];
for (int i = 0; i < arrRe.Length; i++)
arrI[i] = 0;
/**
* Initialize class to perform FFT of specified size.
*
* @param logN Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
*/
Console.WriteLine();
FFT2 fft2 = new FFT2();
uint logN = (uint)Math.Log(data.Length, 2);
fft2.init(logN);
fft2.run(arrRe, arrI);
// After this you have to split that byte array for each channel (Left,Right)
// Wav supports many channels, so you have to read channel from header
Console.ReadLine();
【问题讨论】:
【参考方案1】:您需要解决一些问题:
您没有在 FFT 之前应用 window function - 这在一般情况下会导致 spectral leakage 并且您可能会得到误导性结果,尤其是在寻找峰时,因为会有“拖尾"的光谱。
在寻找峰值时,您应该查看 FFT 输出箱的幅度,而不是单个实部和虚部 - magnitude = sqrt(re^2 +im^2)
(尽管您无需担心sqrt
如果您只是在寻找峰值)。
已经确定了一个峰值,您需要将 bin 索引转换为频率 - 如果峰值位于 bin 索引 i,则相应的频率将为 i * Fs / N
,其中Fs
是以 Hz 为单位的采样率,@ 987654329@ 是 FFT 大小。
对于实数到复数 FFT,您可以忽略第二个 N / 2 个输出 bin,因为它们只是前 N / 2 个 bin 的复共轭镜像
(请参阅this answer 以获得对上述内容的更全面解释。)
【讨论】:
感谢您的帮助(因为声誉低,我无法投票。我在 run() 函数结束时进行了编辑,问题是这两个数组的最大总和是index = 0。所以我得到 index * sampleRate / m_logN = 0 * 220500/17 项目下载链接更新为。 如果您尚未在 FFT 之前添加窗口函数,那么您通常会在 bin 0 (DC) 处获得较大的值,然后在大量低频 bin 中涂抹该值。添加窗口函数将解决此问题,但请注意,如果您的模拟硬件中有任何 DC 偏移,那么这仍会在 bin 0 处为您提供(较小的)峰值。 太棒了。我将在一天内检查它,因为我在过去 24 小时内都在开发。 看看这个教程,它解释了如何解决这些问题:blog.bjornroche.com/2012/07/… user1825608,我在看到您的评论后为您投票,但我想这意味着我现在不能投票,因为我想。干杯!以上是关于FFT算法得到错误的声音频率值的主要内容,如果未能解决你的问题,请参考以下文章