FFT算法得到错误的声音频率值

Posted

技术标签:

【中文标题】FFT算法得到错误的声音频率值【英文标题】:FFT algorithm getting wrong sound frequency value 【发布时间】:2012-11-15 04:05:41 【问题描述】:

我在 440Hz audio file 上运行了这个 FFT algorithm。但我得到了一个意想不到的声音频率:510Hz。

    包含 .wav 的 byteArray 是否正确转换为 2 个双精度数组(Re & Im 部分)?虚数数组只包含 0。 我假设最高声音频率是xRe数组的最大值:请看run()函数的最末尾?也许这是我的错误:它是平均水平还是类似的东西?

那有什么问题呢?

更新: Re+Im 的最大总和在 index = 0 处,所以我得到频率 = 0;

Whole project:包含 .wav -> 打开即可运行。

using System;
using System.Net;
using System.IO;


namespace FFT 
    /**
     * Performs an in-place complex FFT.
     *
     * Released under the MIT License
     *
     * Copyright (c) 2010 Gerald T. Beauregard
     *
     * Permission is hereby granted, free of charge, to any person obtaining a copy
     * of this software and associated documentation files (the "Software"), to
     * deal in the Software without restriction, including without limitation the
     * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
     * sell copies of the Software, and to permit persons to whom the Software is
     * furnished to do so, subject to the following conditions:
     *
     * The above copyright notice and this permission notice shall be included in
     * all copies or substantial portions of the Software.
     *
     * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
     * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
     * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
     * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
     * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
     * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
     * IN THE SOFTWARE.
     */
    public class FFT2 
        // Element for linked list in which we store the
        // input/output data. We use a linked list because
        // for sequential access it's faster than array index.
        class FFTElement 
            public double re = 0.0;     // Real component
            public double im = 0.0;     // Imaginary component
            public FFTElement next;     // Next element in linked list
            public uint revTgt;         // Target position post bit-reversal
        
        private static int sampleRate;
        private uint m_logN = 0;        // log2 of FFT size
        private uint m_N = 0;           // FFT size
        private FFTElement[] m_X;       // Vector of linked list elements

        /**
         *
         */
        public FFT2() 
        

        /**
         * Initialize class to perform FFT of specified size.
         *
         * @param   logN    Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
         */
        public void init(uint logN) 
            m_logN = logN;
            m_N = (uint)(1 << (int)m_logN);

            // Allocate elements for linked list of complex numbers.
            m_X = new FFTElement[m_N];
            for (uint k = 0; k < m_N; k++)
                m_X[k] = new FFTElement();

            // Set up "next" pointers.
            for (uint k = 0; k < m_N - 1; k++)
                m_X[k].next = m_X[k + 1];

            // Specify target for bit reversal re-ordering.
            for (uint k = 0; k < m_N; k++)
                m_X[k].revTgt = BitReverse(k, logN);
        

        /**
         * Performs in-place complex FFT.
         *
         * @param   xRe     Real part of input/output
         * @param   xIm     Imaginary part of input/output
         * @param   inverse If true, do an inverse FFT
         */
        public void run(double[] xRe, double[] xIm, bool inverse = false) 
            uint numFlies = m_N >> 1;   // Number of butterflies per sub-FFT
            uint span = m_N >> 1;       // Width of the butterfly
            uint spacing = m_N;         // Distance between start of sub-FFTs
            uint wIndexStep = 1;        // Increment for twiddle table index

            // Copy data into linked complex number objects
            // If it's an IFFT, we divide by N while we're at it
            FFTElement x = m_X[0];
            uint k = 0;
            double scale = inverse ? 1.0 / m_N : 1.0;
            while (x != null) 
                x.re = scale * xRe[k];
                x.im = scale * xIm[k];
                x = x.next;
                k++;
            

            // For each stage of the FFT
            for (uint stage = 0; stage < m_logN; stage++) 
                // Compute a multiplier factor for the "twiddle factors".
                // The twiddle factors are complex unit vectors spaced at
                // regular angular intervals. The angle by which the twiddle
                // factor advances depends on the FFT stage. In many FFT
                // implementations the twiddle factors are cached, but because
                // array lookup is relatively slow in C#, it's just
                // as fast to compute them on the fly.
                double wAngleInc = wIndexStep * 2.0 * Math.PI / m_N;
                if (inverse == false)
                    wAngleInc *= -1;
                double wMulRe = Math.Cos(wAngleInc);
                double wMulIm = Math.Sin(wAngleInc);

                for (uint start = 0; start < m_N; start += spacing) 
                    FFTElement xTop = m_X[start];
                    FFTElement xBot = m_X[start + span];

                    double wRe = 1.0;
                    double wIm = 0.0;

                    // For each butterfly in this stage
                    for (uint flyCount = 0; flyCount < numFlies; ++flyCount) 
                        // Get the top & bottom values
                        double xTopRe = xTop.re;
                        double xTopIm = xTop.im;
                        double xBotRe = xBot.re;
                        double xBotIm = xBot.im;

                        // Top branch of butterfly has addition
                        xTop.re = xTopRe + xBotRe;
                        xTop.im = xTopIm + xBotIm;

                        // Bottom branch of butterly has subtraction,
                        // followed by multiplication by twiddle factor
                        xBotRe = xTopRe - xBotRe;
                        xBotIm = xTopIm - xBotIm;
                        xBot.re = xBotRe * wRe - xBotIm * wIm;
                        xBot.im = xBotRe * wIm + xBotIm * wRe;

                        // Advance butterfly to next top & bottom positions
                        xTop = xTop.next;
                        xBot = xBot.next;

                        // Update the twiddle factor, via complex multiply
                        // by unit vector with the appropriate angle
                        // (wRe + j wIm) = (wRe + j wIm) x (wMulRe + j wMulIm)
                        double tRe = wRe;
                        wRe = wRe * wMulRe - wIm * wMulIm;
                        wIm = tRe * wMulIm + wIm * wMulRe;
                    
                

                numFlies >>= 1;     // Divide by 2 by right shift
                span >>= 1;
                spacing >>= 1;
                wIndexStep <<= 1;   // Multiply by 2 by left shift
            

            // The algorithm leaves the result in a scrambled order.
            // Unscramble while copying values from the complex
            // linked list elements back to the input/output vectors.
            x = m_X[0];
            while (x != null) 
                uint target = x.revTgt;
                xRe[target] = x.re;
                xIm[target] = x.im;
                x = x.next;
            

            //looking for max  IS THIS IS FREQUENCY
            double max = 0, index = 0;
            for (int i = 0; i < xRe.Length; i++) 
                if (xRe[i] + xIm[i] > max) 
                    max = xRe[i]*xRe[i] + xIm[i]*xIm[i];
                    index = i;
                
            
            max = Math.Sqrt(max);
         /*   if the peak is at bin index i then the corresponding
            frequency will be i * Fs / N whe Fs is the sample rate in Hz and N is the FFT size.*/

            //DONT KNOW WHY THE BIGGEST VALUE IS IN THE BEGINNING
            Console.WriteLine("max "+ max+" index " + index + " m_logN" + m_logN + " " + xRe[0]);
            max = index * sampleRate / m_logN;
            Console.WriteLine("max " + max);
        

        /**
         * Do bit reversal of specified number of places of an int
         * For example, 1101 bit-reversed is 1011
         *
         * @param   x       Number to be bit-reverse.
         * @param   numBits Number of bits in the number.
         */
        private uint BitReverse(
            uint x,
            uint numBits) 
            uint y = 0;
            for (uint i = 0; i < numBits; i++) 
                y <<= 1;
                y |= x & 0x0001;
                x >>= 1;
            
            return y;
        
        public static void Main(String[] args) 
            // BinaryReader reader = new BinaryReader(System.IO.File.OpenRead(@"C:\Users\Duke\Desktop\e.wav"));
            BinaryReader reader = new BinaryReader(File.Open(@"440.wav", FileMode.Open));
            //Read the wave file header from the buffer. 

            int chunkID = reader.ReadInt32();
            int fileSize = reader.ReadInt32();
            int riffType = reader.ReadInt32();
            int fmtID = reader.ReadInt32();
            int fmtSize = reader.ReadInt32();
            int fmtCode = reader.ReadInt16();
            int channels = reader.ReadInt16();
            sampleRate = reader.ReadInt32();
            int fmtAvgBPS = reader.ReadInt32();
            int fmtBlockAlign = reader.ReadInt16();
            int bitDepth = reader.ReadInt16();

            if (fmtSize == 18) 
                // Read any extra values
                int fmtExtraSize = reader.ReadInt16();
                reader.ReadBytes(fmtExtraSize);
            

            int dataID = reader.ReadInt32();
            int dataSize = reader.ReadInt32();


            // Store the audio data of the wave file to a byte array. 

            byte[] byteArray = reader.ReadBytes(dataSize);
            /*    for (int i = 0; i < byteArray.Length; i++) 
                    Console.Write(byteArray[i] + " ");
                */

            byte[] data = byteArray;
            double[] arrRe = new double[data.Length];
            for (int i = 0; i < arrRe.Length; i++) 
                arrRe[i] = data[i] / 32768.0;
            
            double[] arrI = new double[data.Length];
            for (int i = 0; i < arrRe.Length; i++) 
                arrI[i] = 0;
            

            /**
       * Initialize class to perform FFT of specified size.
       *
       * @param logN    Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
       */
            Console.WriteLine();
            FFT2 fft2 = new FFT2();
            uint logN = (uint)Math.Log(data.Length, 2);
            fft2.init(logN);

            fft2.run(arrRe, arrI);
            // After this you have to split that byte array for each channel (Left,Right)
            // Wav supports many channels, so you have to read channel from header
            Console.ReadLine();
        
    

【问题讨论】:

【参考方案1】:

您需要解决一些问题:

您没有在 FFT 之前应用 window function - 这在一般情况下会导致 spectral leakage 并且您可能会得到误导性结果,尤其是在寻找峰时,因为会有“拖尾"的光谱。

在寻找峰值时,您应该查看 FFT 输出箱的幅度,而不是单个实部和虚部 - magnitude = sqrt(re^2 +im^2)(尽管您无需担心sqrt 如果您只是在寻找峰值)。

已经确定了一个峰值,您需要将 bin 索引转换为频率 - 如果峰值位于 bin 索引 i,则相应的频率将为 i * Fs / N,其中Fs 是以 Hz 为单位的采样率,@ 987654329@ 是 FFT 大小。

对于实数到复数 FFT,您可以忽略第二个 N / 2 个输出 bin,因为它们只是前 N / 2 个 bin 的复共轭镜像

(请参阅this answer 以获得对上述内容的更全面解释。)

【讨论】:

感谢您的帮助(因为声誉低,我无法投票。我在 run() 函数结束时进行了编辑,问题是这两个数组的最大总和是index = 0。所以我得到 index * sampleRate / m_logN = 0 * 220500/17 项目下载链接更新为。 如果您尚未在 FFT 之前添加窗口函数,那么您通常会在 bin 0 (DC) 处获得较大的值,然后在大量低频 bin 中涂抹该值。添加窗口函数将解决此问题,但请注意,如果您的模拟硬件中有任何 DC 偏移,那么这仍会在 bin 0 处为您提供(较小的)峰值。 太棒了。我将在一天内检查它,因为我在过去 24 小时内都在开发。 看看这个教程,它解释了如何解决这些问题:blog.bjornroche.com/2012/07/… user1825608,我在看到您的评论后为您投票,但我想这意味着我现在不能投票,因为我想。干杯!

以上是关于FFT算法得到错误的声音频率值的主要内容,如果未能解决你的问题,请参考以下文章

如何正确 FFT 声音阵列?

FFT算法理解

Java - 使用 FFT 查找音频信号的频率和幅度

如何决定要使用多少点来做fft

声音 FFT 频率

优雅的FFT算法