Drawing waveform with AVAssetReader

Posted: 2011-06-29 07:12:40

Question:

I read a song from the iPod library using an asset URL (in the code it is called audioUrl). I can play it in several ways, I can cut it, I can do some processing with it, but... I really don't understand what I am supposed to do with this CMSampleBufferRef to get the data for drawing a waveform! I need information about the peak values. How can I get it this way (or maybe some other way)?

    AVAssetTrack * songTrack = [audioUrl.tracks objectAtIndex:0];
    AVAssetReaderTrackOutput * output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:nil];
    [reader addOutput:output];
    [output release];

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    while (reader.status == AVAssetReaderStatusReading) {

        AVAssetReaderTrackOutput * trackOutput =
        (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];

        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef) { /* what am I gonna do with this? */ }
    }

Please help me!

Comments:

Answer 1:

I was looking for something similar and decided to "roll my own". I realize this is an old post, but in case anyone else is looking for this, here is my solution. It is relatively quick and dirty, and it normalizes the image to "full scale". The images it creates are "wide", i.e. you need to put them in a UIScrollView or otherwise manage the display.

This is based on some of the answers given to this question.

Sample output

EDIT: I have added logarithmic versions of the averaging and rendering methods; see the end of this post for the alternative versions and comparison outputs. I personally prefer the original linear version, but decided to post the other in case someone can improve on the algorithm used.

You will need these imports:

#import <MediaPlayer/MediaPlayer.h>
#import <AVFoundation/AVFoundation.h>

First, a generic rendering method that takes a pointer to averaged sample data and returns a UIImage. Note that these samples are not playable audio samples.

-(UIImage *) audioImageGraph:(SInt16 *) samples
                normalizeMax:(SInt16) normalizeMax
                 sampleCount:(NSInteger) sampleCount
                channelCount:(NSInteger) channelCount
                 imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context,1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);

    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight * 3);
    float sampleAdjustmentFactor = (imageHeight / (float) channelCount) / (float) normalizeMax;

    for (NSInteger intSample = 0; intSample < sampleCount; intSample++) {
        SInt16 left = *samples++;
        float pixels = (float) left;
        pixels *= sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft - pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft + pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount == 2) {
            SInt16 right = *samples++;
            float pixels = (float) right;
            pixels *= sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}

Next, a method that takes an AVURLAsset and returns PNG image data:

- (NSData *) renderPNGAudioPictogramForAsset:(AVURLAsset *)songAsset {

    NSError * error = nil;
    AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
    AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
                                        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
                                        //     [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
                                        //     [NSNumber numberWithInt: 2],AVNumberOfChannelsKey,    /*Not Supported*/
                                        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
                                        nil];

    AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];

    [reader addOutput:output];
    [output release];

    UInt32 sampleRate,channelCount;

    NSArray* formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
        if (fmtDesc) {

            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;

            //    NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
        }
    }

    UInt32 bytesPerSample = 2 * channelCount;
    SInt16 normalizeMax = 0;

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    UInt64 totalBytes = 0;
    SInt64 totalLeft = 0;
    SInt64 totalRight = 0;
    NSInteger sampleTally = 0;

    NSInteger samplesPerPixel = sampleRate / 50;

    while (reader.status == AVAssetReaderStatusReading) {

        AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef) {
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;

            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData * data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

            SInt16 * samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount; i++) {

                SInt16 left = *samples++;
                totalLeft  += left;

                SInt16 right;
                if (channelCount==2) {
                    right = *samples++;
                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left  = totalLeft / sampleTally;

                    SInt16 fix = abs(left);
                    if (fix > normalizeMax) {
                        normalizeMax = fix;
                    }

                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount==2) {
                        right = totalRight / sampleTally;

                        SInt16 fix = abs(right);
                        if (fix > normalizeMax) {
                            normalizeMax = fix;
                        }

                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft   = 0;
                    totalRight  = 0;
                    sampleTally = 0;
                }
            }

            [wader drain];

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
    }

    NSData * finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown) {
        // Something went wrong. Return nil.
        return nil;
    }

    if (reader.status == AVAssetReaderStatusCompleted) {

        NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);

        UIImage *test = [self audioImageGraph:(SInt16 *) fullSongData.bytes
                                 normalizeMax:normalizeMax
                                  sampleCount:fullSongData.length / 4
                                 channelCount:2
                                  imageHeight:100];

        finalData = imageToData(test);
    }

    [fullSongData release];
    [reader release];

    return finalData;
}

Advanced option: Finally, if you want to be able to play the audio with AVAudioPlayer, you will need to cache it to your app's bundle cache folder. Since I was doing that anyway, I decided to cache the image data as well, and wrapped the whole thing into a UIImage category. You need to include this open source offering to extract the audio, and some code from here to handle some background threading features.

First, some defines, plus some generic class methods for handling path names and the like:

//#define imgExt @"jpg"
//#define imageToData(x) UIImageJPEGRepresentation(x,4)

#define imgExt @"png"
#define imageToData(x) UIImagePNGRepresentation(x)

+ (NSString *) assetCacheFolder {
    NSArray  *assetFolderRoot = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
    return [NSString stringWithFormat:@"%@/audio", [assetFolderRoot objectAtIndex:0]];
}

+ (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem*) item {
    NSString *assetFolder = [[self class] assetCacheFolder];
    NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
    NSString *assetPictogramFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,imgExt];
    return [NSString stringWithFormat:@"%@/%@", assetFolder, assetPictogramFilename];
}

+ (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem*) item {
    NSString *assetFolder = [[self class] assetCacheFolder];

    NSURL    * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
    NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];

    NSString *assetFileExt = [[[assetURL path] lastPathComponent] pathExtension];
    NSString *assetFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,assetFileExt];
    return [NSString stringWithFormat:@"%@/%@", assetFolder, assetFilename];
}

+ (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem*) item {
    NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
    return [NSURL fileURLWithPath:assetFilepath];
}

Now the init method that does the "business":

- (id) initWithMPMediaItem:(MPMediaItem*) item
           completionBlock:(void (^)(UIImage* delayedImagePreparation))completionBlock {

    NSFileManager *fman = [NSFileManager defaultManager];
    NSString *assetPictogramFilepath = [[self class] cachedAudioPictogramPathForMPMediaItem:item];

    if ([fman fileExistsAtPath:assetPictogramFilepath]) {

        NSLog(@"Returning cached waveform pictogram: %@",[assetPictogramFilepath lastPathComponent]);

        self = [self initWithContentsOfFile:assetPictogramFilepath];
        return self;
    }

    NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];

    NSURL *assetFileURL = [NSURL fileURLWithPath:assetFilepath];

    if ([fman fileExistsAtPath:assetFilepath]) {

        NSLog(@"scanning cached audio data to create UIImage file: %@",[assetFilepath lastPathComponent]);

        [assetFileURL retain];
        [assetPictogramFilepath retain];

        [NSThread MCSM_performBlockInBackground: ^{

            AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
            NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];

            [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

            [assetFileURL release];
            [assetPictogramFilepath release];

            if (completionBlock) {

                [waveFormData retain];
                [NSThread MCSM_performBlockOnMainThread:^{

                    UIImage *result = [UIImage imageWithData:waveFormData];

                    NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0f x %0.0f pixels)",waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);

                    completionBlock(result);

                    [waveFormData release];
                }];
            }
        }];

        return nil;

    } else {

        NSString *assetFolder = [[self class] assetCacheFolder];

        [fman createDirectoryAtPath:assetFolder withIntermediateDirectories:YES attributes:nil error:nil];

        NSLog(@"Preparing to import audio asset data %@",[assetFilepath lastPathComponent]);

        [assetPictogramFilepath retain];
        [assetFileURL retain];

        TSLibraryImport* import = [[TSLibraryImport alloc] init];
        NSURL    * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];

        [import importAsset:assetURL toURL:assetFileURL completionBlock:^(TSLibraryImport* import) {
            //check the status and error properties of
            //TSLibraryImport

            if (import.error) {

                NSLog (@"audio data import failed:%@",import.error);

            } else {
                NSLog (@"Creating waveform pictogram file: %@", [assetPictogramFilepath lastPathComponent]);
                AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
                NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];

                [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

                if (completionBlock) {
                    [waveFormData retain];
                    [NSThread MCSM_performBlockOnMainThread:^{

                        UIImage *result = [UIImage imageWithData:waveFormData];
                        NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0f x %0.0f pixels)",waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);

                        completionBlock(result);

                        [waveFormData release];
                    }];
                }
            }

            [assetPictogramFilepath release];
            [assetFileURL release];

        }];

        return nil;
    }
}

An example of invoking this:

-(void) importMediaItem {

    MPMediaItem* item = [self mediaItem];

    // since we will be needing this for playback, save the url to the cached audio.
    [url release];
    url = [[UIImage cachedAudioURLForMPMediaItem:item] retain];

    [waveFormImage release];

    waveFormImage = [[UIImage alloc] initWithMPMediaItem:item completionBlock:^(UIImage* delayedImagePreparation){

        waveFormImage = [delayedImagePreparation retain];
        [self displayWaveFormImage];
    }];

    if (waveFormImage) {
        [waveFormImage retain];
        [self displayWaveFormImage];
    }
}

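The displayWaveFormImage method is left to the reader. As a rough sketch only (it assumes a scrollView outlet, which is not part of the original code; adapt it to your own view hierarchy), the "wide" waveform image can simply be dropped into a UIImageView inside a UIScrollView:

-(void) displayWaveFormImage {
    // Sketch: `scrollView` is an assumed ivar/outlet; `waveFormImage` is the ivar set above.
    // The rendered image is usually much wider than the screen, so let the scroll view
    // handle horizontal panning instead of scaling the image down.
    UIImageView *imageView = [[UIImageView alloc] initWithImage:waveFormImage];
    [scrollView addSubview:imageView];
    scrollView.contentSize = imageView.frame.size;
    [imageView release];
}
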
Logarithmic version of the averaging and rendering methods

#define absX(x) (x<0?0-x:x)
#define minMaxX(x,mn,mx) (x<=mn?mn:(x>=mx?mx:x))
#define noiseFloor (-90.0)
#define decibel(amplitude) (20.0 * log10(absX(amplitude)/32767.0))
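
As a quick sanity check on the scaling: with 16-bit samples the decibel macro maps a full-scale sample to 0 dB and very quiet samples down toward the -90 dB noise floor, so the averaged values written out below always fall between noiseFloor and 0:

// decibel(32767) = 20 * log10(32767 / 32767.0) =   0.0 dB   (full-scale sample)
// decibel(  327) = 20 * log10(  327 / 32767.0) ≈ -40.0 dB
// decibel(    1) = 20 * log10(    1 / 32767.0) ≈ -90.3 dB   (clamped to noiseFloor by minMaxX)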

-(UIImage *) audioImageLogGraph:(Float32 *) samples
                   normalizeMax:(Float32) normalizeMax
                    sampleCount:(NSInteger) sampleCount
                   channelCount:(NSInteger) channelCount
                    imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context,1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);

    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight * 3);
    float sampleAdjustmentFactor = (imageHeight / (float) channelCount) / (normalizeMax - noiseFloor) / 2;

    for (NSInteger intSample = 0; intSample < sampleCount; intSample++) {
        Float32 left = *samples++;
        float pixels = (left - noiseFloor) * sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft - pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft + pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount == 2) {
            Float32 right = *samples++;
            float pixels = (right - noiseFloor) * sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}


- (NSData *) renderPNGAudioPictogramLogForAsset:(AVURLAsset *)songAsset {

    NSError * error = nil;
    AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
    AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
                                        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
                                        //     [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
                                        //     [NSNumber numberWithInt: 2],AVNumberOfChannelsKey,    /*Not Supported*/
                                        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
                                        nil];

    AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];

    [reader addOutput:output];
    [output release];

    UInt32 sampleRate,channelCount;

    NSArray* formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
        if (fmtDesc) {

            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;

            //    NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
        }
    }

    UInt32 bytesPerSample = 2 * channelCount;
    Float32 normalizeMax = noiseFloor;
    NSLog(@"normalizeMax = %f",normalizeMax);
    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    UInt64 totalBytes = 0;
    Float64 totalLeft = 0;
    Float64 totalRight = 0;
    Float32 sampleTally = 0;

    NSInteger samplesPerPixel = sampleRate / 50;

    while (reader.status == AVAssetReaderStatusReading) {

        AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef) {
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;

            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData * data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

            SInt16 * samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount; i++) {

                Float32 left = (Float32) *samples++;
                left = decibel(left);
                left = minMaxX(left,noiseFloor,0);
                totalLeft  += left;

                Float32 right;
                if (channelCount==2) {
                    right = (Float32) *samples++;
                    right = decibel(right);
                    right = minMaxX(right,noiseFloor,0);
                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left  = totalLeft / sampleTally;
                    if (left > normalizeMax) {
                        normalizeMax = left;
                    }

                    // NSLog(@"left average = %f, normalizeMax = %f",left,normalizeMax);

                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount==2) {
                        right = totalRight / sampleTally;

                        if (right > normalizeMax) {
                            normalizeMax = right;
                        }

                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft   = 0;
                    totalRight  = 0;
                    sampleTally = 0;
                }
            }

            [wader drain];

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
    }

    NSData * finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown) {
        // Something went wrong. Handle it.
    }

    if (reader.status == AVAssetReaderStatusCompleted) {
        // You're done. It worked.

        NSLog(@"rendering output graphics using normalizeMax %f",normalizeMax);

        UIImage *test = [self audioImageLogGraph:(Float32 *) fullSongData.bytes
                                    normalizeMax:normalizeMax
                                     sampleCount:fullSongData.length / (sizeof(Float32) * 2)
                                    channelCount:2
                                     imageHeight:100];

        finalData = imageToData(test);
    }

    [fullSongData release];
    [reader release];

    return finalData;
}

Comparison outputs

Linear plot of the start of "Warm It Up" by Acme Swing Company

Logarithmic plot of the start of "Warm It Up" by Acme Swing Company

Comments:

This is a very complete and useful answer. It's practically a borderline tutorial; you might consider putting it in a blog post or similar. I'd vote you up 10 points if I could.

Yes, you really should write a tutorial or blog post... and a sample project ;)

Don't know how fast it is, but it works! I had to add/modify a few things because you have hard-coded some assumptions, such as 2-channel audio. I also found the following handy for computing samplesPerPixel (the way you assign it seemed arbitrary? /50 ??): NSTimeInterval duration = (float)songAsset.duration.value/(float)songAsset.duration.timescale; NSLog(@"Recorded duration: %f sec.", duration);

I would also like to update this for retina graphics, perhaps turn the UIImage* method into a drawRect: method, and be able to add a highlighted region. Thanks for the great starting point; AVFoundation and the lower-level libraries are still a little intimidating for the relatively inexperienced.

Thanks. I used this as the starting point for a Cocoa control that adds a few other features, such as showing playback progress. Source: github.com/fulldecent/FDWaveformView
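
Building on the samplesPerPixel comment above, a rough sketch of deriving samplesPerPixel from the track duration instead of the fixed /50 might look like this (targetImageWidth is just an example value you would choose; sampleRate comes from the format description as in the answer):

// Sketch: size samplesPerPixel so the whole track fits a chosen image width.
NSTimeInterval duration = (float)songAsset.duration.value / (float)songAsset.duration.timescale;
CGFloat targetImageWidth = 1024.0; // assumed: desired waveform image width in points
NSInteger samplesPerPixel = MAX(1, (NSInteger)((duration * sampleRate) / targetImageWidth));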

Answer 2:

You should be able to get the audio buffer from your sampleBuffRef and then iterate through those values to build your waveform:

CMBlockBufferRef buffer = CMSampleBufferGetDataBuffer( sampleBufferRef );
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
AudioBufferList audioBufferList;

CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
                                                        NULL,
                                                        &audioBufferList,
                                                        sizeof(audioBufferList),
                                                        NULL,
                                                        NULL,
                                                        kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
                                                        &buffer
                                                        );

// this copies your audio out to a temp buffer but you should be able to iterate through this buffer instead
SInt32* readBuffer = (SInt32 *)malloc(numSamplesInBuffer * sizeof(SInt32));
memcpy( readBuffer, audioBufferList.mBuffers[0].mData, numSamplesInBuffer*sizeof(SInt32));
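
The snippet above stops at the memcpy. As a rough continuation, you could scan that temporary buffer for the peak of each sample buffer; this sketch assumes the samples really are SInt32 as the memcpy does (with the 16-bit PCM output settings from the first answer you would walk SInt16 values instead):

SInt32 peak = 0;
for (CMItemCount i = 0; i < numSamplesInBuffer; i++) {
    SInt32 amplitude = readBuffer[i];
    if (amplitude < 0) amplitude = -amplitude;   // absolute value
    if (amplitude > peak) peak = amplitude;      // track the loudest sample in this buffer
}
// `peak` can now drive the bar height for this slice of the waveform.

free(readBuffer);
CFRelease(buffer);  // the block buffer was retained by CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer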

Comments:

Answer 3:

Another approach, using Swift 5 and AVAudioFile:

///Gets the audio file from an URL, downsamples and draws into the sound layer.
func drawSoundWave(fromURL url:URL, fromPosition:Int64, totalSeconds:UInt32, samplesSecond:CGFloat) throws {

    print("\(logClassName) Drawing sound from \(url)")

    do {
        waveViewInfo.samplesSeconds = samplesSecond

        //Get audio file and format from URL
        let audioFile = try AVAudioFile(forReading: url)

        waveViewInfo.format = audioFile.processingFormat
        audioFile.framePosition = fromPosition * Int64(waveViewInfo.format.sampleRate)

        //Getting the buffer
        let frameCapacity:UInt32 = totalSeconds * UInt32(waveViewInfo.format.sampleRate)

        guard let audioPCMBuffer = AVAudioPCMBuffer(pcmFormat: waveViewInfo.format, frameCapacity: frameCapacity) else { throw AppError("Unable to get the AVAudioPCMBuffer") }
        try audioFile.read(into: audioPCMBuffer, frameCount: frameCapacity)
        let audioPCMBufferFloatValues:[Float] = Array(UnsafeBufferPointer(start: audioPCMBuffer.floatChannelData?.pointee,
                                                                          count: Int(audioPCMBuffer.frameLength)))

        waveViewInfo.points = []
        waveViewInfo.maxValue = 0
        for index in stride(from: 0, to: audioPCMBufferFloatValues.count, by: Int(audioFile.fileFormat.sampleRate) / Int(waveViewInfo.samplesSeconds)) {

            let aSample = CGFloat(audioPCMBufferFloatValues[index])
            waveViewInfo.points.append(aSample)
            let fix = abs(aSample)
            if fix > waveViewInfo.maxValue {
                waveViewInfo.maxValue = fix
            }
        }

        print("\(logClassName) Finished the points - Count = \(waveViewInfo.points.count) / Max = \(waveViewInfo.maxValue)")

        populateSoundImageView(with: waveViewInfo)

    }
    catch {

        throw error
    }
}



///Converts the sound wave in to a UIImage
func populateSoundImageView(with waveViewInfo:WaveViewInfo) {

    let imageSize:CGSize = CGSize(width: CGFloat(waveViewInfo.points.count),//CGFloat(waveViewInfo.points.count) * waveViewInfo.sampleSpace,
                                  height: frame.height)
    let drawingRect = CGRect(origin: .zero, size: imageSize)

    UIGraphicsBeginImageContextWithOptions(imageSize, false, 0)
    defer {
        UIGraphicsEndImageContext()
    }
    print("\(logClassName) Converting sound view in rect \(drawingRect)")

    guard let context:CGContext = UIGraphicsGetCurrentContext() else { return }

    context.setFillColor(waveViewInfo.backgroundColor.cgColor)
    context.setAlpha(1.0)
    context.fill(drawingRect)
    context.setLineWidth(1.0)
    //        context.setLineWidth(waveViewInfo.lineWidth)

    let sampleAdjustFactor = imageSize.height / waveViewInfo.maxValue
    for pointIndex in waveViewInfo.points.indices {

        let pixel = waveViewInfo.points[pointIndex] * sampleAdjustFactor

        context.move(to: CGPoint(x: CGFloat(pointIndex), y: middleY - pixel))
        context.addLine(to: CGPoint(x: CGFloat(pointIndex), y: middleY + pixel))

        context.setStrokeColor(waveViewInfo.strokeColor.cgColor)
        context.strokePath()
    }

    //        for pointIndex in waveViewInfo.points.indices {
    //
    //            let pixel = waveViewInfo.points[pointIndex] * sampleAdjustFactor
    //
    //            context.move(to: CGPoint(x: CGFloat(pointIndex) * waveViewInfo.sampleSpace, y: middleY - pixel))
    //            context.addLine(to: CGPoint(x: CGFloat(pointIndex) * waveViewInfo.sampleSpace, y: middleY + pixel))
    //
    //            context.setStrokeColor(waveViewInfo.strokeColor.cgColor)
    //            context.strokePath()
    //        }

    //        var xIncrement:CGFloat = 0
    //        for point in waveViewInfo.points {
    //
    //            let normalizedPoint = point * sampleAdjustFactor
    //
    //            context.move(to: CGPoint(x: xIncrement, y: middleY - normalizedPoint))
    //            context.addLine(to: CGPoint(x: xIncrement, y: middleX + normalizedPoint))
    //            context.setStrokeColor(waveViewInfo.strokeColor.cgColor)
    //            context.strokePath()
    //
    //            xIncrement += waveViewInfo.sampleSpace
    //        }

    guard let soundWaveImage = UIGraphicsGetImageFromCurrentImageContext() else { return }

    soundWaveImageView.image = soundWaveImage
    //        //In case of handling sample space in for
    //        updateWidthConstraintValue(soundWaveImage.size.width)
    updateWidthConstraintValue(soundWaveImage.size.width * waveViewInfo.sampleSpace)
}


Where:

class WaveViewInfo {

    var format:AVAudioFormat!
    var samplesSeconds:CGFloat = 50
    var lineWidth:CGFloat = 0.20
    var sampleSpace:CGFloat = 0.20

    var strokeColor:UIColor = .red
    var backgroundColor:UIColor = .clear

    var maxValue:CGFloat = 0
    var points:[CGFloat] = [CGFloat]()
}

At the moment it only draws one sound wave, but it can be extended. The nice part is that you can draw the audio track section by section.

Comments:

What about real-time audio streaming?

The approach would be different. Your best bet would be to fill a data buffer and draw from it, but that's up to you.

Answer 4:

A little refactoring of the above answers (using AVAudioFile):


import AVFoundation
import CoreGraphics
import Foundation
import UIKit

class WaveGenerator {
    private func readBuffer(_ audioUrl: URL) -> UnsafeBufferPointer<Float> {
        let file = try! AVAudioFile(forReading: audioUrl)

        let audioFormat = file.processingFormat
        let audioFrameCount = UInt32(file.length)
        guard let buffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)
        else { return UnsafeBufferPointer<Float>(start: nil, count: 0) }
        do {
            try file.read(into: buffer)
        } catch {
            print(error)
        }

//        let floatArray = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength)))
        let floatArray = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))

        return floatArray
    }

    private func generateWaveImage(
        _ samples: UnsafeBufferPointer<Float>,
        _ imageSize: CGSize,
        _ strokeColor: UIColor,
        _ backgroundColor: UIColor
    ) -> UIImage? {
        let drawingRect = CGRect(origin: .zero, size: imageSize)

        UIGraphicsBeginImageContextWithOptions(imageSize, false, 0)

        let middleY = imageSize.height / 2

        guard let context: CGContext = UIGraphicsGetCurrentContext() else { return nil }

        context.setFillColor(backgroundColor.cgColor)
        context.setAlpha(1.0)
        context.fill(drawingRect)
        context.setLineWidth(0.25)

        let max: CGFloat = CGFloat(samples.max() ?? 0)
        let heightNormalizationFactor = imageSize.height / max / 2
        let widthNormalizationFactor = imageSize.width / CGFloat(samples.count)
        for index in 0 ..< samples.count {
            let pixel = CGFloat(samples[index]) * heightNormalizationFactor

            let x = CGFloat(index) * widthNormalizationFactor

            context.move(to: CGPoint(x: x, y: middleY - pixel))
            context.addLine(to: CGPoint(x: x, y: middleY + pixel))

            context.setStrokeColor(strokeColor.cgColor)
            context.strokePath()
        }
        guard let soundWaveImage = UIGraphicsGetImageFromCurrentImageContext() else { return nil }

        UIGraphicsEndImageContext()
        return soundWaveImage
    }

    func generateWaveImage(from audioUrl: URL, in imageSize: CGSize) -> UIImage? {
        let samples = readBuffer(audioUrl)
        let img = generateWaveImage(samples, imageSize, UIColor.blue, UIColor.white)
        return img
    }
}

Usage:

let waveGenerator = WaveGenerator()
let url = Bundle.main.url(forResource: "TEST1.mp3", withExtension: "")!
let img = waveGenerator.generateWaveImage(from: url, in: CGSize(width: 600, height: 200))

Comments:
