Low latency audio output problems on iOS (aka How to beat AUAudioUnit sampleRate, maximumFramesToRender, and ioBufferDuration into submission)
Posted: 2019-11-08 07:34:56
OK, I'm clearly missing some important piece of the picture here. I'm trying to do low-latency audio over the network, and my fundamental frame is 10 ms. I expected this to be no problem. My target phone is the iPhone X speaker, so my hardware sample rate should be locked at 48000 Hz. I'm asking for 10 ms, which is a nice even divisor: that should be 480, 960, 1920, or 3840 depending on how you want to slice frames/samples/bytes.
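For reference, the arithmetic behind those four numbers (a quick sketch, assuming a stereo stream):

let frames  = 48_000 * 10 / 1_000  // 480 frames per 10 ms at 48 kHz
let samples = frames * 2           // 960 samples across 2 channels
let bytes16 = samples * 2          // 1920 bytes as pcmFormatInt16
let bytes32 = samples * 4          // 3840 bytes as pcmFormatFloat32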
Yet, for the life of me, I absolutely cannot get iOS to do anything I would consider sane. I get a buffer duration of 10.667 ms, which is ridiculous: iOS is going out of its way to hand me buffer sizes that aren't an even divisor of the sample rate. Worse yet, the buffer is deliberately LONG, which means I have to absorb not one but two packets of latency in order to fill it. I can't change maximumFramesToRender at all, and the system reports 0 as my sample rate even though it is quite plainly rendering at 48000 Hz.
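That 10.667 ms figure is not arbitrary, though: it is exactly 512 frames at 48 kHz, the nearest power of two above the requested 480, which also matches the 2048-byte (512 frames times 4-byte Float32) per-channel buffers asserted in the code below:

let grantedDuration = 512.0 / 48_000.0  // = 0.010666..., the logged ioBufferDuration
let bytesPerChannel = 512 * 4           // 2048 bytes of Float32 per callback buffer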
I'm obviously missing something important; what is it? Did I forget to connect/disconnect something to get a direct hardware mapping? (My format is 1, which is pcmFormatFloat32; I would expect pcmFormatInt16 or pcmFormatInt32 to map directly to hardware, so something in the OS is probably getting in the way.) Pointers appreciated, and I'm happy to go read more. Or is AUAudioUnit just half-baked and I need to fall back to the older, more useful APIs? Or have I missed the plot completely, and the low-latency audio folks use a whole different set of audio management functions?
Thanks for the help; it's much appreciated.
Code output:
2019-11-07 23:28:29.782786-0800 latencytest[3770:50382] Ready to receive user events
2019-11-07 23:28:34.727478-0800 latencytest[3770:50382] Start button pressed
2019-11-07 23:28:34.727745-0800 latencytest[3770:50382] Launching auxiliary thread
2019-11-07 23:28:34.729278-0800 latencytest[3770:50445] Thread main started
2019-11-07 23:28:35.006005-0800 latencytest[3770:50445] Sample rate: 0
2019-11-07 23:28:35.016935-0800 latencytest[3770:50445] Buffer duration: 0.010667
2019-11-07 23:28:35.016970-0800 latencytest[3770:50445] Number of output busses: 2
2019-11-07 23:28:35.016989-0800 latencytest[3770:50445] Max frames: 4096
2019-11-07 23:28:35.017010-0800 latencytest[3770:50445] Can perform output: 1
2019-11-07 23:28:35.017023-0800 latencytest[3770:50445] Output Enabled: 1
2019-11-07 23:28:35.017743-0800 latencytest[3770:50445] Bus channels: 2
2019-11-07 23:28:35.017864-0800 latencytest[3770:50445] Bus format: 1
2019-11-07 23:28:35.017962-0800 latencytest[3770:50445] Bus rate: 0
2019-11-07 23:28:35.018039-0800 latencytest[3770:50445] Sleeping 0
2019-11-07 23:28:35.018056-0800 latencytest[3770:50445] Buffer count: 2 4096
2019-11-07 23:28:36.023220-0800 latencytest[3770:50445] Sleeping 1
2019-11-07 23:28:36.023400-0800 latencytest[3770:50445] Buffer count: 190 389120
2019-11-07 23:28:37.028610-0800 latencytest[3770:50445] Sleeping 2
2019-11-07 23:28:37.028790-0800 latencytest[3770:50445] Buffer count: 378 774144
2019-11-07 23:28:38.033983-0800 latencytest[3770:50445] Sleeping 3
2019-11-07 23:28:38.034142-0800 latencytest[3770:50445] Buffer count: 566 1159168
2019-11-07 23:28:39.039333-0800 latencytest[3770:50445] Sleeping 4
2019-11-07 23:28:39.039534-0800 latencytest[3770:50445] Buffer count: 756 1548288
2019-11-07 23:28:40.041787-0800 latencytest[3770:50445] Sleeping 5
2019-11-07 23:28:40.041943-0800 latencytest[3770:50445] Buffer count: 944 1933312
2019-11-07 23:28:41.042878-0800 latencytest[3770:50445] Sleeping 6
2019-11-07 23:28:41.043037-0800 latencytest[3770:50445] Buffer count: 1132 2318336
2019-11-07 23:28:42.048219-0800 latencytest[3770:50445] Sleeping 7
2019-11-07 23:28:42.048375-0800 latencytest[3770:50445] Buffer count: 1320 2703360
2019-11-07 23:28:43.053613-0800 latencytest[3770:50445] Sleeping 8
2019-11-07 23:28:43.053771-0800 latencytest[3770:50445] Buffer count: 1508 3088384
2019-11-07 23:28:44.058961-0800 latencytest[3770:50445] Sleeping 9
2019-11-07 23:28:44.059119-0800 latencytest[3770:50445] Buffer count: 1696 3473408
Actual code:
import UIKit
import os.log
import Foundation
import AudioToolbox
import AVFoundation

class AuxiliaryWork: Thread {
    let II_SAMPLE_RATE = 48000

    var iiStopRequested: Int32 = 0  // Int32 is normally guaranteed to be atomic on most architectures
    var iiBufferFillCount: Int32 = 0
    var iiBufferByteCount: Int32 = 0

    func requestStop() {
        iiStopRequested = 1
    }

    func myAVAudioSessionInterruptionNotificationHandler(notification: Notification) -> Void {
        os_log(OSLogType.info, "AVAudioSession Interrupted: %s", notification.debugDescription)
    }

    func myAudioUnitProvider(actionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>, timestamp: UnsafePointer<AudioTimeStamp>,
                             frameCount: AUAudioFrameCount, inputBusNumber: Int, inputData: UnsafeMutablePointer<AudioBufferList>) -> AUAudioUnitStatus {
        let ppInputData = UnsafeMutableAudioBufferListPointer(inputData)
        let iiNumBuffers = ppInputData.count

        if iiNumBuffers > 0 {
            assert(iiNumBuffers == 2)  // non-interleaved stereo: one buffer per channel
            for bbBuffer in ppInputData {
                // 2048 bytes = 512 frames x 4-byte Float32 per channel, i.e. the
                // hardware's power-of-two slice rather than the requested 480 frames
                assert(Int(bbBuffer.mDataByteSize) == 2048)  // FIXME: This should be 960 or 1920 ...
                iiBufferFillCount += 1
                iiBufferByteCount += Int32(bbBuffer.mDataByteSize)
                memset(bbBuffer.mData, 0, Int(bbBuffer.mDataByteSize))  // Just send silence
            }
        } else {
            os_log(OSLogType.error, "Zero buffers from system")
            assert(iiNumBuffers != 0)  // Force crash since os_log would cause an audio hiccup due to locks anyway
        }

        return noErr
    }
    override func main() {
        os_log(OSLogType.info, "Thread main started")

        #if os(iOS)
        let kOutputUnitSubType = kAudioUnitSubType_RemoteIO
        #else
        let kOutputUnitSubType = kAudioUnitSubType_HALOutput  // fixed: was misspelled kAudioUnitSubtype_HALOutput
        #endif

        let audioSession = AVAudioSession.sharedInstance()  // FIXME: Causes the following message: No Factory registered for id
        try! audioSession.setCategory(AVAudioSession.Category.playback, options: [])
        try! audioSession.setMode(AVAudioSession.Mode.measurement)
        try! audioSession.setPreferredSampleRate(48000.0)      // a request, not a guarantee
        try! audioSession.setPreferredIOBufferDuration(0.010)  // likewise; the hardware granted 0.010667

        NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: nil,
            queue: nil,
            using: myAVAudioSessionInterruptionNotificationHandler
        )

        let ioUnitDesc = AudioComponentDescription(
            componentType: kAudioUnitType_Output,
            componentSubType: kOutputUnitSubType,
            componentManufacturer: kAudioUnitManufacturer_Apple,
            componentFlags: 0,
            componentFlagsMask: 0)

        let auUnit = try! AUAudioUnit(componentDescription: ioUnitDesc,
                                      options: AudioComponentInstantiationOptions())

        auUnit.outputProvider = myAudioUnitProvider
        auUnit.maximumFramesToRender = 256  // the unit still reports 4096 in the log above

        try! audioSession.setActive(true)
        try! auUnit.allocateRenderResources()  // Make sure audio unit has hardware resources--we could provide the buffers from the circular buffer if we want
        try! auUnit.startHardware()

        // NOTE: sampleRate and ioBufferDuration are Doubles; the original code logged them
        // with "%d", which is why the output above shows "Sample rate: 0" and "Bus rate: 0".
        os_log(OSLogType.info, "Sample rate: %f", audioSession.sampleRate)
        os_log(OSLogType.info, "Buffer duration: %f", audioSession.ioBufferDuration)
        os_log(OSLogType.info, "Number of output busses: %d", auUnit.outputBusses.count)
        os_log(OSLogType.info, "Max frames: %d", auUnit.maximumFramesToRender)
        os_log(OSLogType.info, "Can perform output: %d", auUnit.canPerformOutput)
        os_log(OSLogType.info, "Output Enabled: %d", auUnit.isOutputEnabled)
        //os_log(OSLogType.info, "Audio Format: %p", audioFormat)

        let bus0 = auUnit.outputBusses[0]
        os_log(OSLogType.info, "Bus channels: %d", bus0.format.channelCount)
        os_log(OSLogType.info, "Bus format: %d", bus0.format.commonFormat.rawValue)
        os_log(OSLogType.info, "Bus rate: %f", bus0.format.sampleRate)

        for ii in 0..<10 {
            if iiStopRequested != 0 {
                os_log(OSLogType.info, "Manual stop requested")
                break
            }
            os_log(OSLogType.info, "Sleeping %d", ii)
            os_log(OSLogType.info, "Buffer count: %d %d", iiBufferFillCount, iiBufferByteCount)
            Thread.sleep(forTimeInterval: 1.0)
        }

        auUnit.stopHardware()
    }
}
class FirstViewController: UIViewController {
    var thrAuxiliaryWork: AuxiliaryWork? = nil

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view.
    }

    @IBAction func startButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Start button pressed")
        os_log(OSLogType.error, "Launching auxiliary thread")
        thrAuxiliaryWork = AuxiliaryWork()
        thrAuxiliaryWork?.start()
    }

    @IBAction func stopButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Stop button pressed")
        os_log(OSLogType.error, "Manually stopping auxiliary thread")
        thrAuxiliaryWork?.requestStop()
    }

    @IBAction func muteButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Mute button pressed")
    }

    @IBAction func unmuteButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Unmute button pressed")
    }
}
Answer 1:

You can't beat the iOS silicon hardware into submission by assuming the API will do it for you. If you want to abstract away the hardware, you have to do your own buffering.
For the best (lowest) latency, your software will have to adapt, potentially dynamically, to the actual hardware capabilities, which can differ between devices and between modes.
The hardware sample rate appears to be either 44.1 ksps (older iOS devices), 48 ksps (newer arm64 iOS devices), or an integer multiple thereof (other rates may be possible when plugging in non-AirPod Bluetooth headsets or external ADCs). The actual hardware DMA (or equivalent) buffers seem to always be a power of two in size, possibly down to 64 samples on the newest devices. However, various iOS power-saving modes will increase the buffer size (by powers of two) up to 4k samples, especially on older iOS devices. If you request a sample rate other than the hardware rate, the OS might resample the buffers to a size other than a power of two, and that size can change from one audio unit callback to the next if the resampling ratio isn't an exact integer.
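As a concrete sketch of that adaptation (these are standard AVAudioSession calls; the values in the comments are one possible outcome on one device, not a guarantee):

import AVFoundation

let session = AVAudioSession.sharedInstance()
try! session.setPreferredSampleRate(48_000.0)     // a request ...
try! session.setPreferredIOBufferDuration(0.010)  // ... not a promise
try! session.setActive(true)

// Read back what the hardware actually granted, after activation,
// and size the rest of the pipeline around that rather than the request.
let rate = session.sampleRate                             // e.g. 48000.0
let duration = session.ioBufferDuration                   // e.g. 0.010667 (512 frames)
let framesPerCallback = Int((rate * duration).rounded())  // e.g. 512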
Audio Units are the lowest level accessible through public API on iOS devices. Everything else is built on top of them and thus potentially incurs greater latency. For example, if you use the Audio Queue API with non-hardware buffer sizes, the OS internally uses power-of-two audio buffers to access the hardware, then chops them up or partially concatenates them to return or fetch Audio Queue buffers of non-hardware sizes. Slower and jittery.
The iOS APIs are far from half-baked; for a long time they were the only APIs usable on any phone or tablet for live low-latency music performance. But you get there by developing software matched to the hardware.
Comments:
Is there a way to abstract away from the hardware buffer size (e.g. 512) so that I can use a noise-suppression library that only supports a 480-sample buffer size?

A lock-free ring buffer, or circular FIFO. Put data in. Poll and pull data out, in any amount, whenever enough is available.

Thanks. The problem is that the AUAudioUnit API needs audio data placed into its output buffers or it just produces audio pops. How would that work when, say, the buffer size is 256 and I have to wait for a second buffer before the ring buffer holds enough to proceed?

Exactly that way: always put incoming data into the ring buffer, wait until the ring buffer holds enough data for whatever you need to do, then consume just the data you need and leave the rest in the FIFO for later. And never process things inside the audio callback; use another thread to take data out of the ring buffer.

If that's unclear, ask how to use a real-time-safe ring buffer in a separate question. But that question is probably a duplicate, so search first.
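A minimal sketch of the ring-buffer approach described in these comments, assuming a single-producer/single-consumer pattern (the class name and sizes are illustrative, and the plain Int indices would need real atomics, e.g. the swift-atomics package, for cross-thread memory ordering):

import Foundation

// Single-producer/single-consumer FIFO of Float samples.
// The network thread pushes decoded packets (e.g. 480 frames each);
// the render callback pops whatever frameCount the hardware asks for.
final class SampleFIFO {
    private var storage: [Float]
    private let capacity: Int
    private var head = 0  // advanced only by the consumer (audio callback)
    private var tail = 0  // advanced only by the producer (network thread)

    init(capacityFrames: Int) {
        capacity = capacityFrames
        storage = [Float](repeating: 0, count: capacityFrames)
    }

    var count: Int { (tail - head + capacity) % capacity }

    // Producer side: drop the packet if the FIFO is full rather than block.
    @discardableResult
    func push(_ samples: [Float]) -> Bool {
        guard samples.count < capacity - count else { return false }
        for s in samples {
            storage[tail] = s
            tail = (tail + 1) % capacity
        }
        return true
    }

    // Consumer side: fill exactly `frames` samples (256, 512, ...);
    // pad with silence on underrun instead of blocking the callback.
    func pop(into buffer: UnsafeMutablePointer<Float>, frames: Int) {
        let available = count
        for i in 0..<frames {
            if i < available {
                buffer[i] = storage[head]
                head = (head + 1) % capacity
            } else {
                buffer[i] = 0
            }
        }
    }
}

Neither side ever blocks: the producer drops data when full, and the consumer substitutes silence when the FIFO runs dry, trading a brief dropout for keeping the render thread real-time safe.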