How to correctly read decoded PCM samples on iOS using AVAssetReader -- currently incorrect decoding

Posted 2023-02-25


【Title】How to correctly read decoded PCM samples on iOS using AVAssetReader -- currently incorrect decoding 【Posted】2012-02-20 16:03:19 【Question】:

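I am currently working on an application as part of my Bachelor's degree in Computer Science. The application will correlate data from the iPhone hardware (accelerometer, GPS) with the music that is being played.

The project is still in its infancy; I have been working on it for only 2 months.

Where I need help right now is reading PCM samples from songs in the iTunes library and playing them back using an audio unit. The implementation I am trying to get working does the following: it chooses a random song from iTunes, reads samples from it when required, and stores them in a buffer, let's call it sampleBuffer. Later, on the consumer side, the audio unit (which has a mixer and a remoteIO output) has a callback in which I simply copy the required number of samples from sampleBuffer into the buffer specified in the callback.

What I then hear through the speakers is not what I expect. I can recognize that it is playing the song, but it seems to be decoded incorrectly and has a lot of noise! I attached an image showing roughly the first half second (24576 samples @ 44.1kHz), and it does not resemble normal-looking output.

Before I get into the listing: I have checked that the file is not corrupted, and I have written test cases for the buffer (so I know the buffer does not alter the samples). Although this might not be the best way to do it (some would argue for the audio queue route), I want to perform various manipulations on the samples, as well as change the song before it has finished, rearrange which song is played, and so on. There may also be some incorrect settings in the audio unit. However, the graph that displays the samples (and shows that they are decoded incorrectly) is taken straight from the buffer, so for now I only want to find out why reading from disk and decoding does not work correctly. Right now I simply want to get plain playback working.

I can't post images because I am new here, so here is the link to the image: http://i.stack.imgur.com/RHjlv.jpg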

Listing:

This is where I set up the audioReadSettings used for the AVAssetReaderAudioMixOutput:

// Set the read settings
    audioReadSettings = [[NSMutableDictionary alloc] init];
    [audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
                         forKey:AVFormatIDKey];
    [audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
    [audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];

The following listing is a method that receives an NSString with the song's persistant_id:

-(BOOL)setNextSongID:(NSString*)persistand_id {

    assert(persistand_id != nil);

    MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
    NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
    AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl 
                                                options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES] 
                                                                                    forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];

    NSError *assetError = nil;

    assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];

    if (assetError) {
        NSLog(@"error: %@", assetError);
        return NO;
    }

    CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
    [assetReader setTimeRange:timeRange];

    track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];

    assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
                                                                                audioSettings:audioReadSettings];

    if (![assetReader canAddOutput:assetReaderOutput]) {
        NSLog(@"cant add reader output... die!");
        return NO;
    }

    [assetReader addOutput:assetReaderOutput];
    [assetReader startReading];

    // just getting some basic information about the track to print
    NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
    for (unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
        if (asDesc) {
            // get data
            numChannels = asDesc->mChannelsPerFrame;
            sampleRate = asDesc->mSampleRate;
            asDesc->Print();
        }
    }

    [self copyEnoughSamplesToBufferForLength:24000];
    return YES;
}

The function -(void)copyEnoughSamplesToBufferForLength: is shown below:

-(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count {

    [w_lock lock];
    int stillToCopy = 0;
    if (sampleBuffer->numSamples() < samples_count) {
        stillToCopy = samples_count;
    }

    NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];

    CMSampleBufferRef sampleBufferRef;
    SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));

    int a = 0;

    while (stillToCopy > 0) {

        sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
        if (!sampleBufferRef) {
            // end of song or no more samples
            return;
        }

        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
        CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
        AudioBufferList audioBufferList;

        CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
                                                                NULL,
                                                                &audioBufferList,
                                                                sizeof(audioBufferList),
                                                                NULL,
                                                                NULL,
                                                                0,
                                                                &blockBuffer);

        int data_length = floorf(numSamplesInBuffer * 1.0f);

        int j = 0;

        for (int bufferCount=0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
            SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
            for (int i=0; i < numSamplesInBuffer; i++) {
                dataBuffer[j] = samples[i];
                j++;
            }
        }

        CFRelease(sampleBufferRef);
        sampleBuffer->putSamples(dataBuffer, j);
        stillToCopy = stillToCopy - data_length;
    }

    free(dataBuffer);
    [w_lock unlock];
    [apool release];
}

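At this point sampleBuffer contains incorrectly decoded samples. Can anyone help me understand why? It happens with different files in my iTunes library (mp3, aac, wav, etc.). Any help would be greatly appreciated. If you need any other listing of my code, or a description of what the output sounds like, I will attach it on request. I have spent the past week trying to debug this and have found no help online: everyone seems to be doing it the way I do, yet it seems only I have this problem.

Thanks a lot for any help!

Peter

【Comments】:

【Answer 1】: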

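I am also currently working on a project that involves extracting audio samples from the iTunes library into an AudioUnit.

The audio unit render callback is included for your reference. The input format is set to SInt16StereoStreamFormat.

I used Michael Tyson's circular buffer implementation, TPCircularBuffer, as the buffer storage. Very easy to use and understand!!! Thanks, Michael!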

- (void) loadBuffer:(NSURL *)assetURL_ {

    if (nil != self.iPodAssetReader) {
        [iTunesOperationQueue cancelAllOperations];

        [self cleanUpBuffer];
    }

    NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                    [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey, 
                                    [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                                    [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                    nil];

    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:assetURL_ options:nil];
    if (asset == nil) {
        NSLog(@"asset is not defined!");
        return;
    }

    NSLog(@"Total Asset Duration: %f", CMTimeGetSeconds(asset.duration));

    NSError *assetError = nil;
    self.iPodAssetReader = [AVAssetReader assetReaderWithAsset:asset error:&assetError];
    if (assetError) {
        NSLog (@"error: %@", assetError);
        return;
    }

    AVAssetReaderOutput *readerOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:asset.tracks audioSettings:outputSettings];

    if (! [iPodAssetReader canAddOutput: readerOutput]) {
        NSLog (@"can't add reader output... die!");
        return;
    }

    // add output reader to reader
    [iPodAssetReader addOutput: readerOutput];

    if (! [iPodAssetReader startReading]) {
        NSLog(@"Unable to start reading!");
        return;
    }

    // Init circular buffer
    TPCircularBufferInit(&playbackState.circularBuffer, kTotalBufferSize);

    __block NSBlockOperation * feediPodBufferOperation = [NSBlockOperation blockOperationWithBlock:^{
        while (![feediPodBufferOperation isCancelled] && iPodAssetReader.status != AVAssetReaderStatusCompleted) {
            if (iPodAssetReader.status == AVAssetReaderStatusReading) {
                // Check if the available buffer space is enough to hold at least one cycle of the sample data
                if (kTotalBufferSize - playbackState.circularBuffer.fillCount >= 32768) {
                    CMSampleBufferRef nextBuffer = [readerOutput copyNextSampleBuffer];

                    if (nextBuffer) {
                        AudioBufferList abl;
                        CMBlockBufferRef blockBuffer;
                        CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(nextBuffer, NULL, &abl, sizeof(abl), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);
                        UInt64 size = CMSampleBufferGetTotalSampleSize(nextBuffer);

                        int bytesCopied = TPCircularBufferProduceBytes(&playbackState.circularBuffer, abl.mBuffers[0].mData, size);

                        if (!playbackState.bufferIsReady && bytesCopied > 0) {
                            playbackState.bufferIsReady = YES;
                        }

                        CFRelease(nextBuffer);
                        CFRelease(blockBuffer);
                    }
                    else {
                        break;
                    }
                }
            }
        }
        NSLog(@"iPod Buffer Reading Finished");
    }];

    [iTunesOperationQueue addOperation:feediPodBufferOperation];
}
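The reader loop above sizes each copy in bytes (via CMSampleBufferGetTotalSampleSize) rather than in frames. The following is a minimal C sketch of that arithmetic, assuming interleaved 16-bit PCM as requested in outputSettings. The helper names are hypothetical and are not part of any Apple API. In interleaved audio one frame holds one value per channel, so a buffer of N frames contains N × channels 16-bit values; a copy loop that stops after N values would pick up only half of a stereo buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not an Apple API): size arithmetic for
   interleaved linear PCM, where one frame = one value per channel. */

/* Number of individual sample values stored for `frames` frames. */
static size_t interleaved_value_count(size_t frames, unsigned channels) {
    assert(channels > 0);
    return frames * channels;
}

/* Number of bytes occupied by `frames` frames of 16-bit audio. */
static size_t interleaved_byte_count(size_t frames, unsigned channels) {
    return interleaved_value_count(frames, channels) * sizeof(int16_t);
}

/* Copies every value of an interleaved buffer and returns how many values
   were copied. The count is clamped to the destination capacity. */
static size_t copy_interleaved(int16_t *dst, size_t dst_capacity,
                               const int16_t *src, size_t frames,
                               unsigned channels) {
    size_t n = interleaved_value_count(frames, channels);
    if (n > dst_capacity) {
        n = dst_capacity; /* never write past the destination */
    }
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
    return n;
}
```

Under these assumptions, 8192 stereo frames occupy 8192 × 2 × 2 = 32768 bytes, which is the free-space threshold the loop checks before pulling the next sample buffer.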


static OSStatus ipodRenderCallback (

                                     void                        *inRefCon,      // A pointer to a struct containing the complete audio data 
                                     //    to play, as well as state information such as the  
                                     //    first sample to play on this invocation of the callback.
                                     AudioUnitRenderActionFlags  *ioActionFlags, // Unused here. When generating audio, use ioActionFlags to indicate silence 
                                     //    between sounds; for silence, also memset the ioData buffers to 0.
                                     const AudioTimeStamp        *inTimeStamp,   // Unused here.
                                     UInt32                      inBusNumber,    // The mixer unit input bus that is requesting some new
                                     //        frames of audio data to play.
                                     UInt32                      inNumberFrames, // The number of frames of audio to provide to the buffer(s)
                                     //        pointed to by the ioData parameter.
                                     AudioBufferList             *ioData         // On output, the audio data to play. The callback's primary 
                                     //        responsibility is to fill the buffer(s) in the 
                                     //        AudioBufferList.
                                     ) {

    Audio* audioObject   = (Audio*)inRefCon;

    AudioSampleType *outSample          = (AudioSampleType *)ioData->mBuffers[0].mData;

    // Zero-out all the output samples first
    memset(outSample, 0, inNumberFrames * kUnitSize * 2);

    if ( audioObject.playingiPod && audioObject.bufferIsReady) {
        // Pull audio from circular buffer
        int32_t availableBytes;

        AudioSampleType *bufferTail     = TPCircularBufferTail(&audioObject.circularBuffer, &availableBytes);

        memcpy(outSample, bufferTail, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
        TPCircularBufferConsume(&audioObject.circularBuffer, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
        audioObject.currentSampleNum += MIN(availableBytes / (kUnitSize * 2), inNumberFrames);

        if (availableBytes <= inNumberFrames * kUnitSize * 2) {
            // Buffer is running out or playback is finished
            audioObject.bufferIsReady = NO;
            audioObject.playingiPod = NO;
            audioObject.currentSampleNum = 0;

            if ([[audioObject delegate] respondsToSelector:@selector(playbackDidFinish)]) {
                [[audioObject delegate] performSelector:@selector(playbackDidFinish)];
            }
        }
    }

    return noErr;
}
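The callback above follows a produce/consume pattern: the reader thread appends bytes, and the render side zeroes its output and then copies whatever is available. As a rough illustration of that pattern only, here is a small single-threaded byte ring buffer in plain C. It is a hypothetical stand-in, not TPCircularBuffer: the names Ring, ring_produce and ring_render are invented for this sketch, and it has none of the thread-safety a real audio path needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical single-threaded byte ring buffer for illustration only.
   It is NOT TPCircularBuffer and offers no thread-safety guarantees. */
#define RING_CAPACITY 64

typedef struct {
    uint8_t data[RING_CAPACITY];
    size_t  head;      /* next write position */
    size_t  tail;      /* next read position */
    size_t  fillCount; /* bytes currently stored */
} Ring;

static void ring_init(Ring *r) {
    memset(r, 0, sizeof(*r));
}

/* Producer side: stores up to len bytes and returns how many were stored. */
static size_t ring_produce(Ring *r, const void *src, size_t len) {
    size_t space = RING_CAPACITY - r->fillCount;
    if (len > space) {
        len = space;
    }
    const uint8_t *bytes = (const uint8_t *)src;
    for (size_t i = 0; i < len; i++) {
        r->data[r->head] = bytes[i];
        r->head = (r->head + 1) % RING_CAPACITY;
    }
    r->fillCount += len;
    return len;
}

/* Consumer side, shaped like a render callback: zero the output first,
   then copy whatever is available. Returns the bytes actually copied;
   the remainder of the output stays silent. */
static size_t ring_render(Ring *r, void *out, size_t wanted) {
    assert(out != NULL);
    memset(out, 0, wanted);
    size_t n = wanted < r->fillCount ? wanted : r->fillCount;
    uint8_t *bytes = (uint8_t *)out;
    for (size_t i = 0; i < n; i++) {
        bytes[i] = r->data[r->tail];
        r->tail = (r->tail + 1) % RING_CAPACITY;
    }
    r->fillCount -= n;
    return n;
}
```

Zeroing the output before copying means an underrun produces silence instead of stale data, which is the same choice the callback makes with its initial memset.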


- (void) setupSInt16StereoStreamFormat {

    // AudioSampleType is the sample type used for this input format.
    //    This obtains the byte size of the type for use in filling in the ASBD.
    size_t bytesPerSample = sizeof (AudioSampleType);

    // Fill the application audio format struct's fields to define a linear PCM, 
    //        stereo, interleaved stream at the graph sample rate.
    SInt16StereoStreamFormat.mFormatID          = kAudioFormatLinearPCM;
    SInt16StereoStreamFormat.mFormatFlags       = kAudioFormatFlagsCanonical;
    SInt16StereoStreamFormat.mBytesPerPacket    = 2 * bytesPerSample;   // *** kAudioFormatFlagsCanonical <- implicit interleaved data => (left sample + right sample) per Packet 
    SInt16StereoStreamFormat.mFramesPerPacket   = 1;
    SInt16StereoStreamFormat.mBytesPerFrame     = SInt16StereoStreamFormat.mBytesPerPacket * SInt16StereoStreamFormat.mFramesPerPacket;
    SInt16StereoStreamFormat.mChannelsPerFrame  = 2;                    // 2 indicates stereo
    SInt16StereoStreamFormat.mBitsPerChannel    = 8 * bytesPerSample;
    SInt16StereoStreamFormat.mSampleRate        = graphSampleRate;

    NSLog (@"The stereo stream format for the \"iPod\" mixer input bus:");
    [self printASBD: SInt16StereoStreamFormat];
}
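The size fields in the stream format above are all derived from two numbers: the channel count and the bytes per sample. A minimal C sketch of that derivation follows. PcmLayout is a hypothetical stand-in for the size-related fields of AudioStreamBasicDescription so that the example builds without Core Audio, and the 44100.0 rate used when calling it is an assumption, since the value of graphSampleRate is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size-related fields of
   AudioStreamBasicDescription, so the sketch builds without Core Audio. */
typedef struct {
    double   sampleRate;
    uint32_t bytesPerPacket;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
} PcmLayout;

/* Fills the layout for interleaved integer PCM. */
static PcmLayout make_interleaved_layout(double sampleRate,
                                         uint32_t channels,
                                         uint32_t bytesPerSample) {
    assert(channels > 0 && bytesPerSample > 0);
    PcmLayout l;
    l.sampleRate       = sampleRate;
    l.channelsPerFrame = channels;
    l.bitsPerChannel   = 8 * bytesPerSample;
    /* Interleaved: every channel's sample lives in the same frame. */
    l.bytesPerFrame    = channels * bytesPerSample;
    /* Uncompressed PCM: one frame per packet. */
    l.framesPerPacket  = 1;
    l.bytesPerPacket   = l.bytesPerFrame * l.framesPerPacket;
    return l;
}
```

Calling make_interleaved_layout(44100.0, 2, sizeof(int16_t)) gives 4 bytes per frame and 16 bits per channel, matching the values the method above assigns when AudioSampleType is 16 bits wide.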

【Comments】:

Thank you so much! Really helpful!

What is kUnitSize? And what is kTotalBufferSize?

@smartfaceweb: In my case, I used the following settings:

#define kUnitSize               sizeof(AudioSampleType)
#define kBufferUnit             655360
#define kTotalBufferSize        (kBufferUnit * kUnitSize)
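A minimal sketch of what these constants imply, assuming AudioSampleType is a 16-bit integer (as on iOS) and stereo audio at 44.1 kHz. The uppercase macro names, SampleType and buffer_seconds are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 16-bit sample type, standing in for AudioSampleType on iOS. */
typedef int16_t SampleType;

#define UNIT_SIZE         sizeof(SampleType)
#define BUFFER_UNIT       655360
/* Parenthesized so the macro expands safely inside larger expressions. */
#define TOTAL_BUFFER_SIZE (BUFFER_UNIT * UNIT_SIZE)

/* Seconds of interleaved audio that fit into `bytes` bytes. */
static double buffer_seconds(size_t bytes, unsigned channels, double sampleRate) {
    assert(channels > 0 && sampleRate > 0.0);
    double bytesPerSecond = sampleRate * channels * (double)UNIT_SIZE;
    return (double)bytes / bytesPerSecond;
}
```

With these assumptions the buffer is 1310720 bytes, and buffer_seconds(TOTAL_BUFFER_SIZE, 2, 44100.0) comes to a little under 7.5 seconds of audio.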

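@infiniteloop Can you tell us whether this code also works on iOS? From my limited research into audio units so far, the audio unit functionality on iOS seems considerably more restricted than its OS X counterpart.

You check that the circular buffer has at least 32768 bytes free before filling it, which is the same number that CMSampleBufferGetTotalSampleSize returns, but I wonder whether that size could differ for any reason. Does it have any special meaning?

【Answer 2】: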

This is probably a bit late, but you could try this library:

https://bitbucket.org/artgillespie/tslibraryimport

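After using it to save the audio to a file, you can process the data with a render callback like the one from MixerHost.

【Comments】:

【Answer 3】:

If I were you, I would either use kAudioUnitSubType_AudioFilePlayer to play the file and access its samples through the unit's render callback,

or

use ExtAudioFileRef to extract the samples straight into a buffer.

【Comments】:

AudioFilePlayer only lets me specify one file to play, and it cannot come from iTunes. ExtAudioFileRef also uses an audio session, which does not allow access from iTunes (or at least I could not get it to work). Has anyone implemented something similar who could help me? Please.

I'm afraid I don't have much experience with the iTunes library. Does this help? subfurther.com/blog/2010/12/13/…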
