Playing voice from server stream of nsdata using AudioUnit IOS


【Title】: Playing voice from server stream of nsdata using AudioUnit IOS 【Posted】: 2014-08-11 15:21:33 【Question】:

I'm trying to build some kind of VoIP application in iOS. So far I've been able to successfully send the microphone data as a buffer from the microphone to the server using GCDAsyncSocket. Now I need to play back the data I receive, and that's where I'm confused. I've looked around online, but everything I find is about playing a remote audio file or streaming audio from a URL. What I actually receive is a steady stream of NSData, and I need to figure out how to use that NSData to fill the audio unit's buffer list. I'm new to C and finding it hard to work through. This is where I get the NSData from the server:

- (void)socket:(GCDAsyncSocket *)sender didReadData:(NSData *)data withTag:(long)tag {

    if (tag == 1) {
       //this is where I read password and stuff to authenticate
    }
    else {
        [self setUpAQOutput:data]; //this should somehow initialize AU and fill the buffer
    }
}
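
One idea I'm considering for setUpAQOutput: (just a sketch of the idea, not something I have working) is to append each received packet to a shared buffer and let the audio unit's render callback drain it later. The playbackBuffer and playbackLock properties here are hypothetical, I haven't written them yet:

- (void)setUpAQOutput:(NSData *)data {
    // Sketch only: playbackBuffer (NSMutableData) and playbackLock (NSLock)
    // are assumed properties that don't exist in my code yet.
    [self.playbackLock lock];
    [self.playbackBuffer appendData:data];
    [self.playbackLock unlock];
}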

In my AudioUnitProcessor, this is how I set up the audio unit using Stefan Popp's code:

  //
//  AudioProcessor.m
//  MicInput
//
//  Created by Stefan Popp on 21.09.11.

//

#import "AudioProcessor.h"
#import "PTTClient.h"
#pragma mark Recording callback

static OSStatus recordingCallback(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData) {

    // the data gets rendered here
    AudioBuffer buffer;

    // a variable where we check the status
    OSStatus status;

    /**
     This is the reference to the object who owns the callback.
     */
    AudioProcessor *audioProcessor = (AudioProcessor*) inRefCon;

    /**
     on this point we define the number of channels, which is mono
     for the iPhone. The number of frames is usually 512 or 1024.
     */
    buffer.mDataByteSize = inNumberFrames * 2; // sample size
    buffer.mNumberChannels = 1; // one channel
    buffer.mData = malloc( inNumberFrames * 2 ); // buffer size

    // we put our buffer into a bufferlist array for rendering
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0] = buffer;

    // render input and check for error
    status = AudioUnitRender([audioProcessor audioUnit], ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList);


    // process the bufferlist in the audio processor
    [audioProcessor processBuffer:&bufferList];

    // clean up the buffer
    free(bufferList.mBuffers[0].mData);

    return noErr;
}

#pragma mark Playback callback

static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {

    // does nothing
    return noErr;
}


#pragma mark objective-c class

@implementation AudioProcessor
@synthesize audioUnit, inAudioBuffer;

-(AudioProcessor*)init {
    self = [super init];
    if (self) {
        [self initializeAudio];
    }
    return self;
}

+ (OSStatus) playBytes:(NSArray*) byteArray {

    /**
     This is the reference to the object who owns the callback.
     */
  //  NSArray * byteArray = nil;
    AudioProcessor *audioProcessor = [[AudioProcessor alloc] init];

    // iterate over the incoming stream and copy to the output stream
    for (int i=0; i < [byteArray count]; i++) {
    //  AudioBuffer buffer = ioData->mBuffers[i];

        // find minimum size

        UInt32 size = [audioProcessor inAudioBuffer].mDataByteSize;

        // copy buffer to audio buffer which gets played after function return
        memcpy(byteArray[i], [audioProcessor inAudioBuffer].mData, size);

        // set data size
        //buffer.mDataByteSize = size;
    }
    return noErr;
}

-(void)initializeAudio {

    OSStatus status;

    // We define the audio component
    AudioComponentDescription desc;
    desc.componentType = kAudioUnitType_Output; // we want to output
    desc.componentSubType = kAudioUnitSubType_RemoteIO; // we want input and output
    desc.componentFlags = 0; // must be zero
    desc.componentFlagsMask = 0; // must be zero
    desc.componentManufacturer = kAudioUnitManufacturer_Apple; // select provider

    // find the AU component by description
    AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

    // create audio unit by component
    status = AudioComponentInstanceNew(inputComponent, &audioUnit);

        // define that we want record io on the input bus
    UInt32 flag = 1;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO, // use io
                                  kAudioUnitScope_Input, // scope to input
                                  kInputBus, // select input bus (1)
                                  &flag, // set flag
                                  sizeof(flag));
        // define that we want play on io on the output bus
    UInt32 stopFlag = 0; // stop flag 0 because we don't want to play audio back on the device
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO, // use io
                                  kAudioUnitScope_Output, // scope to output
                                  kOutputBus, // select output bus (0)
                                  &stopFlag, // set flag
                                  sizeof(stopFlag));

    /*
     We need to specify the format we want to work with.
     We use Linear PCM because it's uncompressed and we work on raw data.

     We want 16 bits, 2 bytes per packet/frame, at 44 kHz.
     */
    AudioStreamBasicDescription audioFormat;
    audioFormat.mSampleRate         = SAMPLE_RATE;
    audioFormat.mFormatID           = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags        = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
    audioFormat.mFramesPerPacket    = 1;
    audioFormat.mChannelsPerFrame   = 1;
    audioFormat.mBitsPerChannel     = 16;
    audioFormat.mBytesPerPacket     = audioFormat.mChannelsPerFrame * sizeof( SInt16);
    audioFormat.mBytesPerFrame      = audioFormat.mChannelsPerFrame * sizeof( SInt16);



    // set the format on the output stream
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &audioFormat,
                                  sizeof(audioFormat));


    // set the format on the input stream
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Input,
                                  kOutputBus,
                                  &audioFormat,
                                  sizeof(audioFormat));


    /**
     We need to define a callback structure which holds
     a pointer to the recordingCallback and a reference to
     the audio processor object
     */
    AURenderCallbackStruct callbackStruct;

    // set recording callback
    callbackStruct.inputProc = recordingCallback; // recordingCallback pointer
    callbackStruct.inputProcRefCon = self;

    // set input callback to recording callback on the input bus
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_SetInputCallback,
                                  kAudioUnitScope_Global,
                                  kInputBus,
                                  &callbackStruct,
                                  sizeof(callbackStruct));

     /*
     We do the same on the output stream to hear what is coming
     from the input stream
     */
    callbackStruct.inputProc = playbackCallback;
    callbackStruct.inputProcRefCon = self;

    // set playbackCallback as callback on our renderer for the output bus
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_SetRenderCallback,
                                  kAudioUnitScope_Global,
                                  kOutputBus,
                                  &callbackStruct,
                                  sizeof(callbackStruct));

    // reset flag to 0
    flag = 0;

    /*
     we need to tell the audio unit to allocate the render buffer,
     that we can directly write into it.
     */
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_ShouldAllocateBuffer,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &flag,
                                  sizeof(flag));


    /*
     we set the number of channels to mono and allocate our block size to
     1024 bytes.
     */
    inAudioBuffer.mNumberChannels = 1;
    inAudioBuffer.mDataByteSize = 512 * 2;
    inAudioBuffer.mData = malloc( 512 * 2 );

    // Initialize the Audio Unit and cross fingers =)
    status = AudioUnitInitialize(audioUnit);

    NSLog(@"Started");



#pragma mark control stream

-(void)start {
    // start the audio unit. You should hear something, hopefully :)
    OSStatus status = AudioOutputUnitStart(audioUnit);
}

-(void)stop {
    // stop the audio unit
    OSStatus status = AudioOutputUnitStop(audioUnit);
}


#pragma mark processing

-(void)processBuffer: (AudioBufferList*) audioBufferList {

    AudioBuffer sourceBuffer = audioBufferList->mBuffers[0];

    // we check here if the input data byte size has changed
    if (inAudioBuffer.mDataByteSize != sourceBuffer.mDataByteSize) {
        // clear old buffer
        free(inAudioBuffer.mData);
        // assign new byte size and allocate it on mData
        inAudioBuffer.mDataByteSize = sourceBuffer.mDataByteSize;
        inAudioBuffer.mData = malloc(sourceBuffer.mDataByteSize);
    }

    int currentBuffer = 0;
    int maxBuf = 800;

    NSMutableData *data=[[NSMutableData alloc] init];
    // CMBlockBufferRef blockBuffer;
    // CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(ref, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
    // NSLog(@"%@",blockBuffer);


    // audioBufferList->mBuffers[0].mData, audioBufferList->mBuffers[0].mDataByteSize

    for( int y=0; y<audioBufferList->mNumberBuffers; y++ )
    {
        if (currentBuffer < maxBuf) {
            AudioBuffer audioBuff = audioBufferList->mBuffers[y];
            Float32 *frame = (Float32*)audioBuff.mData;

            [data appendBytes:frame length:inAudioBuffer.mDataByteSize];
            currentBuffer += audioBuff.mDataByteSize;
        }
        else {
            break;
        }
    }

    [[PTTClient getDefaultInstance] setAudioBufferData: data]; // this is the call that sends the buffer data to the server

    // copy incoming audio data to the audio buffer (no need since we are not using playback)
    //memcpy(inAudioBuffer.mData, audioBufferList->mBuffers[0].mData, audioBufferList->mBuffers[0].mDataByteSize);
}





@end
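
One thing I noticed while posting this: I set stopFlag = 0 for kAudioOutputUnitProperty_EnableIO on the output bus, which (as far as I understand) means the RemoteIO unit never pulls from playbackCallback at all. If I want playback, I would presumably have to replace that stopFlag block inside initializeAudio with something that enables output on bus 0, roughly like this sketch:

    // Enable playback on the RemoteIO output bus (bus 0) instead of disabling it.
    UInt32 enableOutput = 1;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Output,
                                  kOutputBus, // output bus (0)
                                  &enableOutput,
                                  sizeof(enableOutput));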

Finally, this is the method that sends the audio data to the server:

-(void) setAudioBufferData: (NSData*) data {
    [gcdSocket writeData:data withTimeout:timeout tag:tag];
}

All of this works fine, and I can listen to the sound on my server, which runs in Java. Now I need to figure out how to adapt this audio unit to play back the NSData packets I keep receiving from the server (I've seen some examples that play a remote file, which is not what I need; I need to play voice). The source is not a file but a person speaking, so I'm a bit confused.
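
The direction I'm leaning towards (again just a sketch of the idea, not working code) is to make playbackCallback drain whatever buffer didReadData: fills, copy as many queued PCM bytes as fit into ioData, and pad the rest with silence. readPlaybackBytes:maxLength: here is a hypothetical helper that would pop bytes from that shared buffer:

static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {

    AudioProcessor *audioProcessor = (AudioProcessor *)inRefCon;

    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
        AudioBuffer *buffer = &ioData->mBuffers[i];

        // Hypothetical helper: copies up to mDataByteSize queued bytes into mData
        // and returns how many bytes were actually copied.
        UInt32 copied = [audioProcessor readPlaybackBytes:buffer->mData
                                                maxLength:buffer->mDataByteSize];

        // If the network hasn't delivered enough data yet, pad with silence
        // instead of playing whatever happens to be in the buffer.
        if (copied < buffer->mDataByteSize) {
            memset((char *)buffer->mData + copied, 0, buffer->mDataByteSize - copied);
        }
    }
    return noErr;
}

From what I've read, taking locks or calling Objective-C methods inside a render callback is not strictly real-time safe, so a lock-free ring buffer (e.g. TPCircularBuffer) would probably be the more robust choice, but the sketch shows the data flow I have in mind.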

【Comments on the question】:

This is not "C". "Objective-C" is quite different from "C" itself.

OK, can you suggest the best way to implement voice playback?

Did you implement this? If so, please share the solution; I have the same problem.

【Answer 1】:

1) The audio data can get big, so I would buffer it to a file. Not sure of a more elegant way, but hey... brute-force approaches in code that runs at the C++ level of logic scare me a bit...

Would using

[NSData writeToFile:atomically:];

...help? Maybe then use that file as an audio source to hand to one of the more convenient Core frameworks? (A rough sketch of the file-append idea is shown after this answer.)

2) The only other thing that comes to mind is some form of local socket, i.e. opening a connection to yourself and serving the stream as a "remote source".

Sorry, I wish I knew more so I could be of more help.
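
A rough sketch of what I mean by file buffering: append every packet from didReadData: to a temporary raw PCM file. The path and the bufferFileHandle property are made up for illustration:

- (void)appendPacketToFile:(NSData *)packet {
    // bufferFileHandle is a hypothetical NSFileHandle property.
    if (!self.bufferFileHandle) {
        NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"incoming.pcm"];
        [[NSFileManager defaultManager] createFileAtPath:path contents:nil attributes:nil];
        self.bufferFileHandle = [NSFileHandle fileHandleForWritingAtPath:path];
    }
    [self.bufferFileHandle seekToEndOfFile];
    [self.bufferFileHandle writeData:packet];
}

The raw PCM would still need to be wrapped in (or converted to) a proper container before handing it to a higher-level player.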

【Comments on the answer】:

So every time I get NSData, should I append it to the file? Could you suggest some of the more convenient Core Audio frameworks, with example code... I'm very new to this. I'm sure there must be a more elegant way than what I've been doing.

For pretty - Twilio, but it is paid per minute. From personal experience, playing sound with Core Audio at a background / VoIP level will be hairy. What is your intended audience? Can your Java server encode HLS? It's a hard problem... the current SDK is also difficult... hence there has to be something to make up for that.

@jplego I want to play sound coming from socket data, so could you please share any example code you used? I've done a lot of research but haven't found a good solution.
