How to correctly read decoded PCM samples on iOS using AVAssetReader -- currently incorrect decoding
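Posted: 2012-02-20 16:03:19
Question:
I am currently working on an application as part of my Bachelor's degree in Computer Science. The application will correlate data from the iPhone hardware (accelerometer, GPS) with the music that is being played.
The project is still in its infancy; I have only been working on it for 2 months.
Where I need help right now is reading PCM samples from songs in the iTunes library and playing them back with an audio unit. The implementation I am trying to get working does the following: it chooses a random song from iTunes, reads samples from it when required, and stores them in a buffer, let's call it sampleBuffer. Later, on the consumer side, the audio unit (which has a mixer and a remoteIO output) has a callback in which I simply copy the required number of samples from sampleBuffer into the buffer specified in the callback.
What I then hear through the speakers is not what I expect. I can recognize that it is playing the song, but it seems to be decoded incorrectly and it has a lot of noise! I attached an image showing roughly the first half second (24576 samples @ 44.1kHz), and it does not resemble normal-looking output.
Before I get into the listing: I have checked that the file is not corrupted, and I have written test cases for the buffer (so I know the buffer does not alter the samples). This might not be the best way to do it (some would argue for the audio queue route), but I want to perform various manipulations on the samples, change the song before it has finished, rearrange which song is played, and so on. There may also be some incorrect settings in the audio unit. However, the graph that displays the samples (and shows that they are decoded incorrectly) is taken straight from the buffer, so for now I only want to find out why reading from disk and decoding does not work correctly. Right now I simply want to get playback working.
I can't post images because I'm new to the site, so here is the link to the image: http://i.stack.imgur.com/RHjlv.jpg
Listing:
This is where I set up the audioReadSettings that will be used for the AVAssetReaderAudioMixOutput: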
// Set the read settings
audioReadSettings = [[NSMutableDictionary alloc] init];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
forKey:AVFormatIDKey];
[audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
[audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
The next listing is a method that receives an NSString holding the song's persistent ID:
-(BOOL)setNextSongID:(NSString*)persistand_id {
    assert(persistand_id != nil);
    MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
    NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
    AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl
                                                options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
                                                                                    forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];
    NSError *assetError = nil;
    assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];
    if (assetError) {
        NSLog(@"error: %@", assetError);
        return NO;
    }
    CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
    [assetReader setTimeRange:timeRange];
    track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
    assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
                                                                                audioSettings:audioReadSettings];
    if (![assetReader canAddOutput:assetReaderOutput]) {
        NSLog(@"cant add reader output... die!");
        return NO;
    }
    [assetReader addOutput:assetReaderOutput];
    [assetReader startReading];
    // just getting some basic information about the track to print
    NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
    for (unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
        if (asDesc) {
            // get data
            numChannels = asDesc->mChannelsPerFrame;
            sampleRate = asDesc->mSampleRate;
            asDesc->Print();
        }
    }
    [self copyEnoughSamplesToBufferForLength:24000];
    return YES;
}
The method -(void)copyEnoughSamplesToBufferForLength: is shown below:
-(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count {
    [w_lock lock];
    int stillToCopy = 0;
    if (sampleBuffer->numSamples() < samples_count) {
        stillToCopy = samples_count;
    }
    NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];
    CMSampleBufferRef sampleBufferRef;
    SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));
    int a = 0;
    while (stillToCopy > 0) {
        sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
        if (!sampleBufferRef) {
            // end of song or no more samples
            return;
        }
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
        CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
        AudioBufferList audioBufferList;
        CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
                                                                NULL,
                                                                &audioBufferList,
                                                                sizeof(audioBufferList),
                                                                NULL,
                                                                NULL,
                                                                0,
                                                                &blockBuffer);
        int data_length = floorf(numSamplesInBuffer * 1.0f);
        int j = 0;
        for (int bufferCount=0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
            SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
            for (int i=0; i < numSamplesInBuffer; i++) {
                dataBuffer[j] = samples[i];
                j++;
            }
        }
        CFRelease(sampleBufferRef);
        sampleBuffer->putSamples(dataBuffer, j);
        stillToCopy = stillToCopy - data_length;
    }
    free(dataBuffer);
    [w_lock unlock];
    [apool release];
}
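At this point sampleBuffer contains incorrectly decoded samples. Can anyone help me understand why? This happens with different files in my iTunes library (mp3, aac, wav, etc.). Any help would be greatly appreciated. If you need any other listing of my code, or a recording of what the output sounds like, I will attach it on request. I have been stuck on this for the past week trying to debug it and have found no help online. Everyone seems to be doing it the way I am, yet I seem to be the only one with this problem.
Thanks for any help at all!
Peter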
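Editorial note, offered as one plausible cause rather than a confirmed diagnosis: for linear PCM, CMSampleBufferGetNumSamples is generally understood to count frames, and in an interleaved stereo buffer each frame holds one SInt16 per channel. The inner loop in the listing above copies numSamplesInBuffer values from each buffer, which would cover only the first half of a stereo buffer, and a dataBuffer of 8192 SInt16 values would be too small for 8192 stereo frames. The plain C sketch below shows the arithmetic only; `values_in_buffer` and `copy_interleaved` are hypothetical names, and no Core Media calls are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 16-bit values in an interleaved PCM buffer of `byte_size` bytes.
 * Hypothetical helper: in the listing above, byte_size would correspond to
 * audioBufferList.mBuffers[n].mDataByteSize. */
static size_t values_in_buffer(size_t byte_size)
{
    return byte_size / sizeof(int16_t);
}

/* Copy every interleaved value (frames * channels of them), not just
 * `frames` values. Returns the number of values copied, or 0 if the
 * destination is too small to hold the whole buffer. */
static size_t copy_interleaved(int16_t *dst, size_t dst_capacity,
                               const int16_t *src, size_t frames,
                               size_t channels)
{
    assert(dst != NULL && src != NULL);
    size_t values = frames * channels;
    if (values > dst_capacity) {
        return 0;
    }
    memcpy(dst, src, values * sizeof(int16_t));
    return values;
}
```

Under these assumptions, 8192 stereo frames amount to 16384 values (32768 bytes), so the staging buffer would need at least 16384 elements, and the count of copied values is what should be handed to putSamples.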
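Answer 1: I am currently also working on a project that involves extracting audio samples from the iTunes library into an AudioUnit.
The audio unit render callback is included for your reference. The input format is set to SInt16StereoStreamFormat.
I used Michael Tyson's circular buffer implementation, TPCircularBuffer, as the buffer storage. Very easy to use and understand!!! Thanks, Michael!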
- (void) loadBuffer:(NSURL *)assetURL_
{
    if (nil != self.iPodAssetReader) {
        [iTunesOperationQueue cancelAllOperations];
        [self cleanUpBuffer];
    }
    NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                    [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
                                    [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                                    [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                    nil];
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:assetURL_ options:nil];
    if (asset == nil) {
        NSLog(@"asset is not defined!");
        return;
    }
    NSLog(@"Total Asset Duration: %f", CMTimeGetSeconds(asset.duration));
    NSError *assetError = nil;
    self.iPodAssetReader = [AVAssetReader assetReaderWithAsset:asset error:&assetError];
    if (assetError) {
        NSLog (@"error: %@", assetError);
        return;
    }
    AVAssetReaderOutput *readerOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:asset.tracks audioSettings:outputSettings];
    if (! [iPodAssetReader canAddOutput: readerOutput]) {
        NSLog (@"can't add reader output... die!");
        return;
    }
    // add output reader to reader
    [iPodAssetReader addOutput: readerOutput];
    if (! [iPodAssetReader startReading]) {
        NSLog(@"Unable to start reading!");
        return;
    }
    // Init circular buffer
    TPCircularBufferInit(&playbackState.circularBuffer, kTotalBufferSize);
    __block NSBlockOperation * feediPodBufferOperation = [NSBlockOperation blockOperationWithBlock:^{
        while (![feediPodBufferOperation isCancelled] && iPodAssetReader.status != AVAssetReaderStatusCompleted) {
            if (iPodAssetReader.status == AVAssetReaderStatusReading) {
                // Check if the available buffer space is enough to hold at least one cycle of the sample data
                if (kTotalBufferSize - playbackState.circularBuffer.fillCount >= 32768) {
                    CMSampleBufferRef nextBuffer = [readerOutput copyNextSampleBuffer];
                    if (nextBuffer) {
                        AudioBufferList abl;
                        CMBlockBufferRef blockBuffer;
                        CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(nextBuffer, NULL, &abl, sizeof(abl), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);
                        UInt64 size = CMSampleBufferGetTotalSampleSize(nextBuffer);
                        int bytesCopied = TPCircularBufferProduceBytes(&playbackState.circularBuffer, abl.mBuffers[0].mData, size);
                        if (!playbackState.bufferIsReady && bytesCopied > 0) {
                            playbackState.bufferIsReady = YES;
                        }
                        CFRelease(nextBuffer);
                        CFRelease(blockBuffer);
                    }
                    else {
                        break;
                    }
                }
            }
        }
        NSLog(@"iPod Buffer Reading Finished");
    }];
    [iTunesOperationQueue addOperation:feediPodBufferOperation];
}
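Editorial note: loadBuffer waits for 32768 free bytes before it reads the next chunk into the circular buffer. As a rough illustration of that producer/consumer pattern, here is a minimal byte ring buffer in plain C that checks free space against the actual length of each incoming chunk instead of a fixed constant. This is a single-threaded sketch with hypothetical names (`Ring`, `ring_produce`, `ring_consume`, `RING_CAPACITY`); it is not TPCircularBuffer's API and makes no attempt at the thread safety a real audio path needs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_CAPACITY 65536u  /* hypothetical size, in bytes */

typedef struct {
    uint8_t  data[RING_CAPACITY];
    uint32_t head;   /* next write position */
    uint32_t tail;   /* next read position */
    uint32_t fill;   /* bytes currently stored */
} Ring;

/* Append `len` bytes only if all of them fit; returns len, or 0 if full. */
static uint32_t ring_produce(Ring *r, const void *src, uint32_t len)
{
    assert(r != NULL && src != NULL);
    if (len > RING_CAPACITY - r->fill) {
        return 0;                               /* caller waits and retries */
    }
    uint32_t first = RING_CAPACITY - r->head;   /* room before wrap-around */
    if (first > len) first = len;
    memcpy(r->data + r->head, src, first);
    memcpy(r->data, (const uint8_t *)src + first, len - first);
    r->head = (r->head + len) % RING_CAPACITY;
    r->fill += len;
    return len;
}

/* Remove up to `len` bytes into dst; returns the bytes actually read. */
static uint32_t ring_consume(Ring *r, void *dst, uint32_t len)
{
    assert(r != NULL && dst != NULL);
    if (len > r->fill) len = r->fill;
    uint32_t first = RING_CAPACITY - r->tail;   /* bytes before wrap-around */
    if (first > len) first = len;
    memcpy(dst, r->data + r->tail, first);
    memcpy((uint8_t *)dst + first, r->data, len - first);
    r->tail = (r->tail + len) % RING_CAPACITY;
    r->fill -= len;
    return len;
}
```

A producer would call ring_produce with the byte size of each decoded chunk and retry later when it returns 0; the render side would call ring_consume with the number of bytes it needs for the current callback.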
static OSStatus ipodRenderCallback (
    void                        *inRefCon,      // A pointer to a struct containing the complete audio data
                                                // to play, as well as state information such as the
                                                // first sample to play on this invocation of the callback.
    AudioUnitRenderActionFlags  *ioActionFlags, // Unused here. When generating audio, use ioActionFlags to indicate silence
                                                // between sounds; for silence, also memset the ioData buffers to 0.
    const AudioTimeStamp        *inTimeStamp,   // Unused here.
    UInt32                      inBusNumber,    // The mixer unit input bus that is requesting some new
                                                // frames of audio data to play.
    UInt32                      inNumberFrames, // The number of frames of audio to provide to the buffer(s)
                                                // pointed to by the ioData parameter.
    AudioBufferList             *ioData         // On output, the audio data to play. The callback's primary
                                                // responsibility is to fill the buffer(s) in the
                                                // AudioBufferList.
)
{
    Audio* audioObject = (Audio*)inRefCon;
    AudioSampleType *outSample = (AudioSampleType *)ioData->mBuffers[0].mData;
    // Zero-out all the output samples first
    memset(outSample, 0, inNumberFrames * kUnitSize * 2);
    if ( audioObject.playingiPod && audioObject.bufferIsReady) {
        // Pull audio from circular buffer
        int32_t availableBytes;
        AudioSampleType *bufferTail = TPCircularBufferTail(&audioObject.circularBuffer, &availableBytes);
        memcpy(outSample, bufferTail, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
        TPCircularBufferConsume(&audioObject.circularBuffer, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
        audioObject.currentSampleNum += MIN(availableBytes / (kUnitSize * 2), inNumberFrames);
        if (availableBytes <= inNumberFrames * kUnitSize * 2) {
            // Buffer is running out or playback is finished
            audioObject.bufferIsReady = NO;
            audioObject.playingiPod = NO;
            audioObject.currentSampleNum = 0;
            if ([[audioObject delegate] respondsToSelector:@selector(playbackDidFinish)]) {
                [[audioObject delegate] performSelector:@selector(playbackDidFinish)];
            }
        }
    }
    return noErr;
}
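Editorial note: the callback above zeroes the output and then copies MIN(availableBytes, inNumberFrames * kUnitSize * 2) bytes, where the factor 2 is the stereo channel count. The plain C sketch below isolates that byte arithmetic for 16-bit interleaved stereo. `fill_output` is a hypothetical name, and rounding down to whole frames is an addition of this sketch, not something the answer's code does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill `out` with `frames` interleaved stereo 16-bit frames taken from
 * `src`, which holds `available_bytes` bytes. Whatever cannot be filled
 * stays silent (zero). Returns the number of bytes consumed from src. */
static uint32_t fill_output(int16_t *out, uint32_t frames,
                            const void *src, uint32_t available_bytes)
{
    assert(out != NULL);
    const uint32_t bytes_per_frame = 2 * sizeof(int16_t);  /* stereo, 16-bit */
    uint32_t wanted = frames * bytes_per_frame;

    memset(out, 0, wanted);                     /* start from silence */

    uint32_t n = available_bytes < wanted ? available_bytes : wanted;
    n -= n % bytes_per_frame;                   /* copy whole frames only */
    if (src == NULL || n == 0) {
        return 0;
    }
    memcpy(out, src, n);
    return n;
}
```

The return value is the amount the caller would then mark as consumed in its buffer; a short return signals an underrun for that callback.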
- (void) setupSInt16StereoStreamFormat
{
    // AudioSampleType is the sample type used for this input stream (16-bit signed
    // integer on iOS). This obtains its byte size for use in filling in the ASBD.
    size_t bytesPerSample = sizeof (AudioSampleType);
    // Fill the application audio format struct's fields to define a linear PCM,
    // stereo, interleaved stream at the hardware sample rate.
    SInt16StereoStreamFormat.mFormatID = kAudioFormatLinearPCM;
    SInt16StereoStreamFormat.mFormatFlags = kAudioFormatFlagsCanonical;
    SInt16StereoStreamFormat.mBytesPerPacket = 2 * bytesPerSample; // kAudioFormatFlagsCanonical implies interleaved data => (left sample + right sample) per packet
    SInt16StereoStreamFormat.mFramesPerPacket = 1;
    SInt16StereoStreamFormat.mBytesPerFrame = SInt16StereoStreamFormat.mBytesPerPacket * SInt16StereoStreamFormat.mFramesPerPacket;
    SInt16StereoStreamFormat.mChannelsPerFrame = 2; // 2 indicates stereo
    SInt16StereoStreamFormat.mBitsPerChannel = 8 * bytesPerSample;
    SInt16StereoStreamFormat.mSampleRate = graphSampleRate;
    NSLog (@"The stereo stream format for the \"iPod\" mixer input bus:");
    [self printASBD: SInt16StereoStreamFormat];
}
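Editorial note: the size fields set above all follow from two inputs, the channel count and the bytes per sample. The plain C sketch below spells out that relationship for interleaved linear PCM. `PcmLayout` and `interleaved_layout` are hypothetical stand-ins used for illustration; this is not the Core Audio AudioStreamBasicDescription struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size-related fields of
 * AudioStreamBasicDescription; not the Core Audio struct itself. */
typedef struct {
    uint32_t bytes_per_packet;
    uint32_t frames_per_packet;
    uint32_t bytes_per_frame;
    uint32_t channels_per_frame;
    uint32_t bits_per_channel;
} PcmLayout;

/* Layout of interleaved linear PCM: one frame holds one sample per channel,
 * and uncompressed audio has exactly one frame per packet. */
static PcmLayout interleaved_layout(uint32_t channels, uint32_t bytes_per_sample)
{
    assert(channels > 0 && bytes_per_sample > 0);
    PcmLayout l;
    l.channels_per_frame = channels;
    l.bits_per_channel   = 8 * bytes_per_sample;
    l.bytes_per_frame    = channels * bytes_per_sample;
    l.frames_per_packet  = 1;
    l.bytes_per_packet   = l.bytes_per_frame * l.frames_per_packet;
    return l;
}
```

For 16-bit stereo this gives 4 bytes per frame, so a chunk of 8192 frames would occupy 32768 bytes, the same figure loadBuffer uses as its free-space threshold.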
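Comments:
Thank you very much! Really helpful!
What is kUnitSize? And what is kTotalBufferSize?
@smartfaceweb: In my case, I used the following settings:
#define kUnitSize sizeof(AudioSampleType)
#define kBufferUnit 655360
#define kTotalBufferSize kBufferUnit * kUnitSize
@infiniteloop Can you tell us whether this code also works on iOS? From my limited research into audio units so far, the audio unit functionality on iOS seems much more limited than its OS X counterpart.
You check that the circular buffer has at least 32768 bytes available before filling it, which is the same number that CMSampleBufferGetTotalSampleSize returns, but I wonder whether that size could ever differ for any reason. Is there any special significance to that value?
Answer 2: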
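I guess this is a bit late, but you could try this library:
https://bitbucket.org/artgillespie/tslibraryimport
After using it to save the audio to a file, you can process the data with a render callback from MixerHost.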
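Answer 3: If I were you, I would use kAudioUnitSubType_AudioFilePlayer to play the file and access its samples through the unit's render callback.
Or
Use ExtAudioFileRef to extract the samples directly into a buffer.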
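Comments:
AudioFilePlayer only lets me specify one file to play, and it can't come from iTunes. ExtAudioFileRef also uses the audio session, which does not allow access to iTunes content (or at least I couldn't get it to work). Has anyone implemented something similar who could help me? Please.
I'm afraid I don't have much experience with the iTunes library. Does this help? subfurther.com/blog/2010/12/13/…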