Low latency audio output problems on iOS (aka How to beat AUAudioUnit sampleRate, maximumFramesToRender, and ioBufferDuration into submission)

Posted 2019-11-08 07:34:56

Question:

OK, I'm clearly missing some important piece here. I'm trying to do low-latency audio over the network, and my fundamental frame is 10 ms. I expected this to be no problem. My target phone is an iPhone X, playing through its speakers, so my hardware sample rate should be locked at 48000 Hz. Since I'm asking for 10 ms, a nice even divisor, a buffer should work out to 480, 960, 1920, or 3840, depending on whether you slice it by frames, samples, or bytes.
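For concreteness, here is that arithmetic spelled out as a quick sketch (the stereo and sample-format assumptions are mine, matching the Float32/2-channel configuration in the code below):

let sampleRate = 48_000.0
let framesPer10ms = Int(sampleRate * 0.010)   // 480 frames
let samplesPer10ms = framesPer10ms * 2        // 960 samples for stereo
let bytesAsInt16 = samplesPer10ms * 2         // 1920 bytes at 2 bytes/sample
let bytesAsFloat32 = samplesPer10ms * 4       // 3840 bytes at 4 bytes/sample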

Yet, for the life of me, I absolutely cannot get iOS to do anything I would consider sane. I get a buffer duration of 10.667 ms, which is ridiculous: iOS is going out of its way to hand me buffer sizes that aren't an even divisor of the sample rate. Worse, that frame is noticeably LONG, meaning I have to absorb not one but two packets of latency in order to fill it. I can't change maximumFramesToRender at all, and the system reports 0 as my sample rate even though it is quite obviously rendering at 48000 Hz.

I'm obviously missing something important here. What is it? Did I forget to connect or disconnect something to get a direct hardware mapping? (My format is 1, i.e. pcmFormatFloat32; I would expect pcmFormatInt16 or pcmFormatInt32 to map directly to the hardware, so something in the OS is probably getting in the way.) Pointers are appreciated, and I'm happy to go read more. Or is AUAudioUnit just half-baked and I need to fall back to older, more useful APIs? Or have I missed the plot entirely, and low-latency audio people use a completely different set of audio-management functions?

Thanks for the help; it's greatly appreciated.

Code output:

2019-11-07 23:28:29.782786-0800 latencytest[3770:50382] Ready to receive user events
2019-11-07 23:28:34.727478-0800 latencytest[3770:50382] Start button pressed
2019-11-07 23:28:34.727745-0800 latencytest[3770:50382] Launching auxiliary thread
2019-11-07 23:28:34.729278-0800 latencytest[3770:50445] Thread main started
2019-11-07 23:28:35.006005-0800 latencytest[3770:50445] Sample rate: 0
2019-11-07 23:28:35.016935-0800 latencytest[3770:50445] Buffer duration: 0.010667
2019-11-07 23:28:35.016970-0800 latencytest[3770:50445] Number of output busses: 2
2019-11-07 23:28:35.016989-0800 latencytest[3770:50445] Max frames: 4096
2019-11-07 23:28:35.017010-0800 latencytest[3770:50445] Can perform output: 1
2019-11-07 23:28:35.017023-0800 latencytest[3770:50445] Output Enabled: 1
2019-11-07 23:28:35.017743-0800 latencytest[3770:50445] Bus channels: 2
2019-11-07 23:28:35.017864-0800 latencytest[3770:50445] Bus format: 1
2019-11-07 23:28:35.017962-0800 latencytest[3770:50445] Bus rate: 0
2019-11-07 23:28:35.018039-0800 latencytest[3770:50445] Sleeping 0
2019-11-07 23:28:35.018056-0800 latencytest[3770:50445] Buffer count: 2 4096
2019-11-07 23:28:36.023220-0800 latencytest[3770:50445] Sleeping 1
2019-11-07 23:28:36.023400-0800 latencytest[3770:50445] Buffer count: 190 389120
2019-11-07 23:28:37.028610-0800 latencytest[3770:50445] Sleeping 2
2019-11-07 23:28:37.028790-0800 latencytest[3770:50445] Buffer count: 378 774144
2019-11-07 23:28:38.033983-0800 latencytest[3770:50445] Sleeping 3
2019-11-07 23:28:38.034142-0800 latencytest[3770:50445] Buffer count: 566 1159168
2019-11-07 23:28:39.039333-0800 latencytest[3770:50445] Sleeping 4
2019-11-07 23:28:39.039534-0800 latencytest[3770:50445] Buffer count: 756 1548288
2019-11-07 23:28:40.041787-0800 latencytest[3770:50445] Sleeping 5
2019-11-07 23:28:40.041943-0800 latencytest[3770:50445] Buffer count: 944 1933312
2019-11-07 23:28:41.042878-0800 latencytest[3770:50445] Sleeping 6
2019-11-07 23:28:41.043037-0800 latencytest[3770:50445] Buffer count: 1132 2318336
2019-11-07 23:28:42.048219-0800 latencytest[3770:50445] Sleeping 7
2019-11-07 23:28:42.048375-0800 latencytest[3770:50445] Buffer count: 1320 2703360
2019-11-07 23:28:43.053613-0800 latencytest[3770:50445] Sleeping 8
2019-11-07 23:28:43.053771-0800 latencytest[3770:50445] Buffer count: 1508 3088384
2019-11-07 23:28:44.058961-0800 latencytest[3770:50445] Sleeping 9
2019-11-07 23:28:44.059119-0800 latencytest[3770:50445] Buffer count: 1696 3473408
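Working backwards from those numbers (my arithmetic, not part of the original log):

0.010667 s × 48000 Hz ≈ 512 frames per callback (a power of two)
512 frames × 4 bytes (Float32) = 2048 bytes per buffer, × 2 buffers (stereo)
(190 − 2) buffers in the first second ÷ 2 buffers per callback ≈ 94 callbacks/s,
which matches 48000 / 512 = 93.75 callbacks per second.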

Actual code:

import UIKit

import os.log

import Foundation
import AudioToolbox
import AVFoundation

class AuxiliaryWork: Thread {
    let II_SAMPLE_RATE = 48000

    var iiStopRequested: Int32 = 0;  // Int32 is normally guaranteed to be atomic on most architectures

    var iiBufferFillCount: Int32 = 0;
    var iiBufferByteCount: Int32 = 0;

    func requestStop() {
        iiStopRequested = 1;
    }

    func myAVAudioSessionInterruptionNotificationHandler(notification: Notification) -> Void {
        os_log(OSLogType.info, "AVAudioSession Interrupted: %s", notification.debugDescription)
    }

    func myAudioUnitProvider(actionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>, timestamp: UnsafePointer<AudioTimeStamp>,
                             frameCount: AUAudioFrameCount, inputBusNumber: Int, inputData: UnsafeMutablePointer<AudioBufferList>) -> AUAudioUnitStatus {
        let ppInputData = UnsafeMutableAudioBufferListPointer(inputData)
        let iiNumBuffers = ppInputData.count

        if (iiNumBuffers > 0) {
            assert(iiNumBuffers == 2)

            for bbBuffer in ppInputData {
                assert(Int(bbBuffer.mDataByteSize) == 2048)  // FIXME: This should be 960 or 1920 ...

                iiBufferFillCount += 1
                iiBufferByteCount += Int32(bbBuffer.mDataByteSize)

                memset(bbBuffer.mData, 0, Int(bbBuffer.mDataByteSize))  // Just send silence
            }
        } else {
            os_log(OSLogType.error, "Zero buffers from system")
            assert(iiNumBuffers != 0)  // Force crash since os_log would cause an audio hiccup due to locks anyway
        }

        return noErr
    }

    override func main() {
        os_log(OSLogType.info, "Thread main started")

#if os(iOS)
        let kOutputUnitSubType = kAudioUnitSubType_RemoteIO
#else
        let kOutputUnitSubType = kAudioUnitSubType_HALOutput
#endif

        let audioSession = AVAudioSession.sharedInstance()  // FIXME: Causes the following message No Factory registered for id
        try! audioSession.setCategory(AVAudioSession.Category.playback, options: [])
        try! audioSession.setMode(AVAudioSession.Mode.measurement)

        try! audioSession.setPreferredSampleRate(48000.0)
        try! audioSession.setPreferredIOBufferDuration(0.010)

        NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: nil,
            queue: nil,
            using: myAVAudioSessionInterruptionNotificationHandler
        )

        let ioUnitDesc = AudioComponentDescription(
            componentType: kAudioUnitType_Output,
            componentSubType: kOutputUnitSubType,
            componentManufacturer: kAudioUnitManufacturer_Apple,
            componentFlags: 0,
            componentFlagsMask: 0)

        let auUnit = try! AUAudioUnit(componentDescription: ioUnitDesc,
                                      options: AudioComponentInstantiationOptions())

        auUnit.outputProvider = myAudioUnitProvider;
        auUnit.maximumFramesToRender = 256

        try! audioSession.setActive(true)

        try! auUnit.allocateRenderResources()  // Make sure audio unit has hardware resources--we could provide the buffers from the circular buffer if we want
        try! auUnit.startHardware()

        // Note: sampleRate is a Double, so logging it with %d prints 0;
        // %f would show the real value. The same applies to "Bus rate" below.
        os_log(OSLogType.info, "Sample rate: %d", audioSession.sampleRate);
        os_log(OSLogType.info, "Buffer duration: %f", audioSession.ioBufferDuration);

        os_log(OSLogType.info, "Number of output busses: %d", auUnit.outputBusses.count);
        os_log(OSLogType.info, "Max frames: %d", auUnit.maximumFramesToRender);

        os_log(OSLogType.info, "Can perform output: %d", auUnit.canPerformOutput)
        os_log(OSLogType.info, "Output Enabled: %d", auUnit.isOutputEnabled)
        //os_log(OSLogType.info, "Audio Format: %p", audioFormat)

        let bus0 = auUnit.outputBusses[0]
        os_log(OSLogType.info, "Bus channels: %d", bus0.format.channelCount)
        os_log(OSLogType.info, "Bus format: %d", bus0.format.commonFormat.rawValue)
        os_log(OSLogType.info, "Bus rate: %d", bus0.format.sampleRate)

        for ii in 0..<10 {
            if (iiStopRequested != 0) {
                os_log(OSLogType.info, "Manual stop requested");
                break;
            }

            os_log(OSLogType.info, "Sleeping %d", ii);
            os_log(OSLogType.info, "Buffer count: %d %d", iiBufferFillCount, iiBufferByteCount)
            Thread.sleep(forTimeInterval: 1.0);
        }

        auUnit.stopHardware()
    }
}


class FirstViewController: UIViewController {
    var thrAuxiliaryWork: AuxiliaryWork? = nil;

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view.
    }

    @IBAction func startButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Start button pressed");
        os_log(OSLogType.error, "Launching auxiliary thread");

        thrAuxiliaryWork = AuxiliaryWork();
        thrAuxiliaryWork?.start();
    }

    @IBAction func stopButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Stop button pressed");
        os_log(OSLogType.error, "Manually stopping auxiliary thread");
        thrAuxiliaryWork?.requestStop();
    }

    @IBAction func muteButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Mute button pressed");
    }

    @IBAction func unmuteButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Unmute button pressed");
    }
}

Answer 1:

You can't beat the iOS silicon hardware by assuming the API will do it for you. If you want to abstract away the hardware, you have to do the buffering yourself.

For the best (lowest) latency, your software has to adapt, possibly dynamically, to the actual hardware capabilities, which can differ between devices and between modes.

The hardware sample rate appears to be either 44.1 ksps (older iOS devices), 48 ksps (newer arm64 iOS devices), or an integer multiple thereof (other rates may show up when plugging in non-AirPod Bluetooth headsets or external ADCs). The actual hardware DMA (or equivalent) buffers always seem to be a power of two in size, possibly as small as 64 samples on the newest devices. However, various iOS power-saving modes will increase the buffer size (to a larger power of two) up to 4k samples, especially on older iOS devices. If you request a sample rate other than the hardware rate, the OS may resample the buffers to a size that is not a power of two, and that size can change from one Audio Unit callback to the next if the resampling ratio is not an exact integer.
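A minimal sketch of that kind of adaptation, assuming an AVAudioSession configured as in the question (variable names are mine, error handling elided):

import AVFoundation

// Ask for what we want, then adapt to what the hardware actually grants.
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playback, mode: .measurement, options: [])
try session.setPreferredSampleRate(48_000.0)     // a request, not a guarantee
try session.setPreferredIOBufferDuration(0.010)  // likewise
try session.setActive(true)

// Only after activation do these reflect what the hardware actually granted.
let hwRate = session.sampleRate                       // e.g. 48000.0
let hwDuration = session.ioBufferDuration             // e.g. 0.010667
let hwFrames = Int((hwRate * hwDuration).rounded())   // e.g. 512, a power of two
// Size any intermediate FIFO from hwFrames, not from the 480 frames you asked for.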

Audio Units are the lowest-level audio API publicly accessible on iOS devices. Everything else is built on top of them and therefore potentially incurs greater latency. For example, if you use the Audio Queue API with non-hardware buffer sizes, the OS internally uses power-of-two audio buffers to access the hardware, then chops them up or partially concatenates them to return or fill Audio Queue buffers of the non-hardware size. Slower and jittery.

Far from being half-baked, for a long time the iOS audio APIs were the only ones on phones and tablets usable for live low-latency music performance. But you get there by developing software matched to the hardware.

Discussion:

Is there a way to abstract away from the hardware buffer size (e.g. 512) so I can use a noise-suppression library that only supports a buffer size of 480?

A lock-free ring buffer or circular FIFO. Put data in. Poll and pull data out, in whatever amount, whenever there is enough.

Thanks. The problem is that the AUAudioUnit API wants audio data placed into the output buffer, otherwise it just produces audible pops. How would that work if, say, the buffer size is 256 and I have to wait for a second buffer before the ring buffer can be filled?

Exactly: always put the data into the ring buffer, wait until the ring buffer holds enough data for whatever you need to do, then consume just what you need. Leave the rest in the FIFO until later. And never do processing inside the audio callback; use another thread to pull data out of the ring buffer.

If that's unclear, ask how to use a real-time-safe ring buffer in a separate question. But it's probably a duplicate, so search first.
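To make that suggestion concrete, here is a minimal single-producer/single-consumer FIFO sketch (illustrative only; a real-time-safe version would use atomic head/tail indices rather than plain Ints, and would guard against overflow):

// Minimal SPSC ring buffer of Float32 samples. One thread writes, one reads.
final class RingBuffer {
    private var storage: [Float]
    private var head = 0   // next write index (used by the producer only)
    private var tail = 0   // next read index (used by the consumer only)

    init(capacity: Int) {
        storage = [Float](repeating: 0, count: capacity)
    }

    var count: Int { (head - tail + storage.count) % storage.count }

    // Producer side: e.g. the render callback depositing each 512-frame
    // hardware buffer. Capacity must comfortably exceed the worst backlog;
    // this sketch does not guard against overwriting unread data.
    func write(_ samples: [Float]) {
        for s in samples {
            storage[head] = s
            head = (head + 1) % storage.count
        }
    }

    // Consumer side: a worker thread draining fixed-size chunks, e.g. the
    // 480 frames a noise-suppression library insists on. Returns nil until
    // enough data has accumulated, leaving the remainder queued for later.
    func read(exactly n: Int) -> [Float]? {
        guard count >= n else { return nil }
        var out = [Float](repeating: 0, count: n)
        for i in 0..<n {
            out[i] = storage[tail]
            tail = (tail + 1) % storage.count
        }
        return out
    }
}

In this arrangement the render callback only ever calls write(), and a separate thread polls read(exactly: 480); with a 512-frame hardware buffer, the first 480-frame chunk is available after the very first callback, and the leftover 32 frames carry over to the next pass.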
