How to convert 2 mono files into a single stereo file in iOS?

Posted: 2017-02-14 21:04:08

Question:

I'm trying to merge 2 local CAF files into a single file. The 2 CAF files are mono streams; ideally I'd like them to become one stereo file so I can have the microphone on one channel and the speaker on the other.

I originally started with AVAssetTrack and AVMutableCompositionTrack, but I couldn't solve the mixing problem: the merged file was a single mono stream with the two files interleaved. So I took the AVAudioEngine route.

As I understand it, I can pass my two files in as input nodes, attach them to a mixer, and have an output node that receives the stereo mix. The output file has a stereo layout, but no audio data seems to get written to it: I can open it in Audacity and see the stereo layout, but it is empty. Wrapping a dispatch semaphore around the installTapOnBus call didn't help much either. Any insight would be appreciated, as CoreAudio has been hard to wrap my head around.

// obtain path of microphone and speaker files
NSString *micPath = [[NSBundle mainBundle] pathForResource:@"microphone" ofType:@"caf"];
NSString *spkPath = [[NSBundle mainBundle] pathForResource:@"speaker" ofType:@"caf"];
NSURL *micURL = [NSURL fileURLWithPath:micPath];
NSURL *spkURL = [NSURL fileURLWithPath:spkPath];

// create engine
AVAudioEngine *engine = [[AVAudioEngine alloc] init];

AVAudioFormat *stereoFormat = [[AVAudioFormat alloc] initStandardFormatWithSampleRate:16000 channels:2];

AVAudioMixerNode *mainMixer = engine.mainMixerNode;

// create audio files
AVAudioFile *audioFile1 = [[AVAudioFile alloc] initForReading:micURL error:nil];
AVAudioFile *audioFile2 = [[AVAudioFile alloc] initForReading:spkURL error:nil];

// create player input nodes
AVAudioPlayerNode *apNode1 = [[AVAudioPlayerNode alloc] init];
AVAudioPlayerNode *apNode2 = [[AVAudioPlayerNode alloc] init];

// attach nodes to the engine
[engine attachNode:apNode1];
[engine attachNode:apNode2];

// connect player nodes to engine's main mixer
stereoFormat = [mainMixer outputFormatForBus:0];
[engine connect:apNode1 to:mainMixer fromBus:0 toBus:0 format:audioFile1.processingFormat];
[engine connect:apNode2 to:mainMixer fromBus:0 toBus:1 format:audioFile2.processingFormat];
[engine connect:mainMixer to:engine.outputNode format:stereoFormat];

// start the engine
NSError *error = nil;
if(![engine startAndReturnError:&error])
    NSLog(@"Engine failed to start.");


// create output file
NSString *mergedAudioFile = [[micPath stringByDeletingLastPathComponent] stringByAppendingPathComponent:@"merged.caf"];
[[NSFileManager defaultManager] removeItemAtPath:mergedAudioFile error:&error];
NSURL *mergedURL = [NSURL fileURLWithPath:mergedAudioFile];
AVAudioFile *outputFile = [[AVAudioFile alloc] initForWriting:mergedURL settings:[engine.inputNode inputFormatForBus:0].settings error:&error];

// write from buffer to output file
[mainMixer installTapOnBus:0 bufferSize:4096 format:[mainMixer outputFormatForBus:0] block:^(AVAudioPCMBuffer *buffer, AVAudioTime *when) {
    NSError *error;
    BOOL success;
    NSLog(@"Writing");
    if ((outputFile.length < audioFile1.length) || (outputFile.length < audioFile2.length)) {
        success = [outputFile writeFromBuffer:buffer error:&error];
        NSCAssert(success, @"error writing buffer data to file, %@", [error localizedDescription]);
        if (error) {
            NSLog(@"Error: %@", error);
        }
    }
    else {
        [mainMixer removeTapOnBus:0];
        NSLog(@"Done writing");
    }
}];
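One detail worth flagging in the listing above: the player nodes are never scheduled or started, so the tap has no samples to capture, and two mono inputs into a stereo mixer are both center-panned rather than split left/right. A minimal sketch of how those two gaps might be closed (an editorial sketch, not code from the original question; it reuses the variables declared above):

apNode1.pan = -1.0;    // pan the microphone player hard left
apNode2.pan =  1.0;    // pan the speaker player hard right

// schedule the files and start playback so the tap actually receives data
[apNode1 scheduleFile:audioFile1 atTime:nil completionHandler:nil];
[apNode2 scheduleFile:audioFile2 atTime:nil completionHandler:nil];
[apNode1 play];
[apNode2 play];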

Comments:

Do you have a strong reference to the AVAudioFile being written to?

@Dave, the output file doesn't exist until it gets written to. As for strong references, I set the audio file to write to mergedURL, i.e. the fileURLWithPath of mergedAudioFile. No other object/variable references outputFile, and I don't destroy it after the installTapOnBus call.

One drawback of this approach is that you have to wait out the duration of the files in order to render them into one. That said, if you insist on AVAudioEngine, you could try playing both files first, then installing the tap and writing the file once that step completes. But if I were doing it myself, I'd use the C API.

I don't actually want the files to play on the phone. I just want an outputFile that contains the stereo data, and to play it in Audacity if need be. Would wrapping a dispatch_semaphore around that call help? I'll give it another try. I know that if I were to use C, I'd have to manipulate the buffers myself. Though at the moment I'm not sure how to extract the buffers from the input audio files. I see I could make use of this answer - ***.com/questions/6292905/mono-to-stereo-conversion - to get my output buffer, but I'm worried about the header.

You should use ExtAudioFile to read and write the files.

Answer 1:

Doing this with ExtAudioFile involves three files and three buffers: two mono for reading and one stereo for writing. In a loop, each mono file reads a short segment of audio into its mono output buffer, which is then copied into the correct "half" of the stereo buffer. Once the stereo buffer is full of data, that buffer is written to the output file. Repeat until both mono files have finished reading (writing zeros if one mono file is longer than the other).

The most problematic area for me was getting the file formats right; core-audio demands very specific formats. Fortunately, AVAudioFormat exists to simplify the creation of some common formats.

Every audio file reader/writer has two formats: one that represents the format the data is stored in (file_format), and one that represents the reader/writer's input/output format (client_format). The reader/writer has a built-in format converter in case the formats differ.
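For instance (a hedged sketch; `reader` stands for any ExtAudioFileRef opened for reading, and `monoClientFormat` is the AVAudioFormat created in the full example below):

// Ask the reader for the stored (file) format...
AudioStreamBasicDescription fileFormat;
UInt32 propSize = sizeof(fileFormat);
ExtAudioFileGetProperty(reader, kExtAudioFileProperty_FileDataFormat, &propSize, &fileFormat);

// ...then impose the format we want to receive (client format).
// ExtAudioFile converts between the two on the fly.
ExtAudioFileSetProperty(reader, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), monoClientFormat.streamDescription);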

Here's a full example:

-(void)soTest {

    //This is what format the readers will output
    AVAudioFormat *monoClientFormat = [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100.0 channels:1 interleaved:0];

    //This is the format the writer will take as input
    AVAudioFormat *stereoClientFormat = [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100 channels:2 interleaved:0];

    //This is the format that will be written to storage.  It must be interleaved.
    AVAudioFormat *stereoFileFormat = [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100 channels:2 interleaved:1];

    NSURL *leftURL = [NSBundle.mainBundle URLForResource:@"left" withExtension:@"wav"];
    NSURL *rightURL = [NSBundle.mainBundle URLForResource:@"right" withExtension:@"wav"];

    NSString *stereoPath = [documentsDir() stringByAppendingPathComponent:@"stereo.wav"];
    NSURL *stereoURL = [NSURL fileURLWithPath:stereoPath];

    ExtAudioFileRef leftReader;
    ExtAudioFileRef rightReader;
    ExtAudioFileRef stereoWriter;

    OSStatus status = 0;

    //Create readers and writer
    status = ExtAudioFileOpenURL((__bridge CFURLRef)leftURL, &leftReader);
    if (status) printf("error %i", status); //All the ExtAudioFile functions return a non-zero status if there's an error. I'm only checking one here to demonstrate, but you should be checking all the ExtAudioFile function returns.
    ExtAudioFileOpenURL((__bridge CFURLRef)rightURL, &rightReader);
    //Here the file format is set to stereo interleaved.
    ExtAudioFileCreateWithURL((__bridge CFURLRef)stereoURL, kAudioFileCAFType, stereoFileFormat.streamDescription, nil, kAudioFileFlags_EraseFile, &stereoWriter);

    //Set client format for readers and writer
    ExtAudioFileSetProperty(leftReader, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), monoClientFormat.streamDescription);
    ExtAudioFileSetProperty(rightReader, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), monoClientFormat.streamDescription);
    ExtAudioFileSetProperty(stereoWriter, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), stereoClientFormat.streamDescription);

    int framesPerRead = 4096;
    int bufferSize = framesPerRead * sizeof(SInt16);

    //Allocate memory for the buffers
    AudioBufferList *leftBuffer = createBufferList(bufferSize, 1);
    AudioBufferList *rightBuffer = createBufferList(bufferSize, 1);
    AudioBufferList *stereoBuffer = createBufferList(bufferSize, 2);

    //ExtAudioFileRead takes an ioNumberFrames argument.  On input it's the number of frames you want, on output it's the number of frames you got.  0 means you're done.
    UInt32 leftFramesIO = framesPerRead;
    UInt32 rightFramesIO = framesPerRead;

    while (leftFramesIO || rightFramesIO) {
        if (leftFramesIO) {
            //If the last read was short, zero out the tail of the buffer past the frames that were filled
            int framesRemaining = framesPerRead - leftFramesIO;
            if (framesRemaining) {
                memset(((SInt16 *)leftBuffer->mBuffers[0].mData) + leftFramesIO, 0, sizeof(SInt16) * framesRemaining);
            }
            //Read into left buffer
            leftBuffer->mBuffers[0].mDataByteSize = leftFramesIO * sizeof(SInt16);
            ExtAudioFileRead(leftReader, &leftFramesIO, leftBuffer);
        }
        else {
            //set to zero if no more frames to read
            memset(leftBuffer->mBuffers[0].mData, 0, sizeof(SInt16) * framesPerRead);
        }

        if (rightFramesIO) {
            int framesRemaining = framesPerRead - rightFramesIO;
            if (framesRemaining) {
                memset(((SInt16 *)rightBuffer->mBuffers[0].mData) + rightFramesIO, 0, sizeof(SInt16) * framesRemaining);
            }
            rightBuffer->mBuffers[0].mDataByteSize = rightFramesIO * sizeof(SInt16);
            ExtAudioFileRead(rightReader, &rightFramesIO, rightBuffer);
        }
        else {
            memset(rightBuffer->mBuffers[0].mData, 0, sizeof(SInt16) * framesPerRead);
        }

        UInt32 stereoFrames = MAX(leftFramesIO, rightFramesIO);

        //copy left to stereoLeft and right to stereoRight
        memcpy(stereoBuffer->mBuffers[0].mData, leftBuffer->mBuffers[0].mData, sizeof(SInt16) * stereoFrames);
        memcpy(stereoBuffer->mBuffers[1].mData, rightBuffer->mBuffers[0].mData, sizeof(SInt16) * stereoFrames);

        //write to file
        stereoBuffer->mBuffers[0].mDataByteSize = stereoFrames * sizeof(SInt16);
        stereoBuffer->mBuffers[1].mDataByteSize = stereoFrames * sizeof(SInt16);
        ExtAudioFileWrite(stereoWriter, stereoFrames, stereoBuffer);
    }

    ExtAudioFileDispose(leftReader);
    ExtAudioFileDispose(rightReader);
    ExtAudioFileDispose(stereoWriter);

    freeBufferList(leftBuffer);
    freeBufferList(rightBuffer);
    freeBufferList(stereoBuffer);
}

AudioBufferList *createBufferList(int bufferSize, int numberBuffers) {
    assert(bufferSize > 0 && numberBuffers > 0);
    int bufferlistByteSize = sizeof(AudioBufferList);
    bufferlistByteSize += sizeof(AudioBuffer) * (numberBuffers - 1);
    AudioBufferList *bufferList = malloc(bufferlistByteSize);
    bufferList->mNumberBuffers = numberBuffers;
    for (int i = 0; i < numberBuffers; i++) {
        bufferList->mBuffers[i].mNumberChannels = 1;
        bufferList->mBuffers[i].mData = malloc(bufferSize);
    }
    return bufferList;
}

void freeBufferList(AudioBufferList *bufferList) {
    for (int i = 0; i < bufferList->mNumberBuffers; i++) {
        free(bufferList->mBuffers[i].mData);
    }
    free(bufferList);
}

NSString *documentsDir() {
    static NSString *path = NULL;
    if (!path) {
        path = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, 1).firstObject;
    }
    return path;
}
Comments:

I get a stereo file, with no output on either channel. The input mono files are of CAF type, but I wouldn't expect the format to deviate much.

Did you check all the ExtAudioFile return values?

Yes, and I noticed the problem lies in the creation of the EAF output file. The url I pass in has the extension ".caf", versus your ".wav". It gives me OSStatus error 1718449215, which refers to kAudioFormatUnsupportedDataFormatError.

Changing it to kAudioFormatLinearPCM didn't work either, even though that's the output format I'd specified before, when I was able to produce an interleaved stereo file from interleaved mono files.

It should work for both caf and wav. Make sure you're using the interleaved format (e.g. stereoFileFormat) with ExtAudioFileCreateWithURL. It will fail for non-interleaved.
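To make that last comment concrete, here is a minimal sketch of the distinction (an editorial sketch; `stereoURL` is the output URL from the answer, and per the discussion above the non-interleaved variant is what triggers OSStatus 1718449215, kAudioFormatUnsupportedDataFormatError):

// Accepted as a *file* format by ExtAudioFileCreateWithURL: interleaved
AVAudioFormat *interleaved = [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100 channels:2 interleaved:YES];

// Fine as a *client* format, but rejected as a file format: non-interleaved
AVAudioFormat *nonInterleaved = [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100 channels:2 interleaved:NO];

ExtAudioFileRef writer;
OSStatus status = ExtAudioFileCreateWithURL((__bridge CFURLRef)stereoURL, kAudioFileCAFType, interleaved.streamDescription, NULL, kAudioFileFlags_EraseFile, &writer);
// passing nonInterleaved.streamDescription here instead would fail with kAudioFormatUnsupportedDataFormatError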
