How to implement speech-to-text via the Speech framework in Objective-C?

Posted: 2017-10-05 15:46:55

I want to do speech recognition in my Objective-C app using the iOS Speech framework.

I found some Swift examples, but haven't been able to find anything in Objective-C.

Is it possible to access this framework from Objective-C? If so, how?


Answer 1:

After spending enough time looking for Objective-C samples, even in the Apple documentation, and not finding anything decent, I figured it out myself.

Header file (.h)

/*!
 * Import the Speech framework, assign the Delegate and declare variables
 */

#import <Speech/Speech.h>

@interface ViewController : UIViewController <SFSpeechRecognizerDelegate> {
    SFSpeechRecognizer *speechRecognizer;
    SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
    SFSpeechRecognitionTask *recognitionTask;
    AVAudioEngine *audioEngine;
}

@end

Implementation file (.m)

- (void)viewDidLoad {
    [super viewDidLoad];

    // Initialize the Speech Recognizer with the locale; couldn't find a list of locales,
    // but the standard locale identifiers (e.g. en_US) work.
    speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:[[NSLocale alloc] initWithLocaleIdentifier:@"en_US"]];

    // Set speech recognizer delegate
    speechRecognizer.delegate = self;

    // Request the authorization to make sure the user is asked for permission so you can
    // get an authorized response, also remember to change the .plist file, check the repo's
    // readme file or this project's info.plist
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        switch (status) {
            case SFSpeechRecognizerAuthorizationStatusAuthorized:
                NSLog(@"Authorized");
                break;
            case SFSpeechRecognizerAuthorizationStatusDenied:
                NSLog(@"Denied");
                break;
            case SFSpeechRecognizerAuthorizationStatusNotDetermined:
                NSLog(@"Not Determined");
                break;
            case SFSpeechRecognizerAuthorizationStatusRestricted:
                NSLog(@"Restricted");
                break;
            default:
                break;
        }
    }];
}

/*!
 * @brief Starts listening and recognizing user input through the 
 * phone's microphone
 */

- (void)startListening {

    // Initialize the AVAudioEngine
    audioEngine = [[AVAudioEngine alloc] init];

    // Make sure there's not a recognition task already running
    if (recognitionTask) {
        [recognitionTask cancel];
        recognitionTask = nil;
    }

    // Starts an AVAudio Session
    NSError *error;
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryRecord error:&error];
    [audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error];

    // Starts a recognition process; in the block it logs the input, or stops the audio
    // process if there's an error or the result is final.
    recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    AVAudioInputNode *inputNode = audioEngine.inputNode;
    recognitionRequest.shouldReportPartialResults = YES;
    recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        BOOL isFinal = NO;
        if (result) {
            // Whatever you say in the microphone after pressing the button should be being logged
            // in the console.
            NSLog(@"RESULT:%@", result.bestTranscription.formattedString);
            isFinal = result.isFinal;
        }
        if (error != nil || isFinal) {
            [audioEngine stop];
            [inputNode removeTapOnBus:0];
            recognitionRequest = nil;
            recognitionTask = nil;
        }
    }];

    // Sets the recording format
    AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];
    [inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        [recognitionRequest appendAudioPCMBuffer:buffer];
    }];

    // Starts the audio engine, i.e. it starts listening.
    [audioEngine prepare];
    [audioEngine startAndReturnError:&error];
    NSLog(@"Say Something, I'm listening");
}

- (IBAction)microPhoneTapped:(id)sender {
    if (audioEngine.isRunning) {
        [audioEngine stop];
        [recognitionRequest endAudio];
    } else {
        [self startListening];
    }
}
Now, implement the SFSpeechRecognizerDelegate method to check whether the speech recognizer is available.

#pragma mark - SFSpeechRecognizerDelegate Delegate Methods

- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    NSLog(@"Availability:%d", available);
}

Notes and caveats

Remember to modify the .plist file to get the user's authorization for speech recognition and for using the microphone. Of course, the <string> values must be customized to your needs. You can do this by creating and editing the values in Info.plist, or by right-clicking the .plist file, choosing Open As -> Source Code, and pasting the following lines just before the </dict> tag.

<key>NSMicrophoneUsageDescription</key>  <string>This app uses your microphone to record what you say, so watch what you say!</string>

<key>NSSpeechRecognitionUsageDescription</key>  <string>This app uses speech recognition to transform your spoken words into text and then analyze them, so watch what you say!</string>

Also remember that in order to import the Speech framework into the project, you need iOS 10.0+.
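If the app's deployment target is lower than iOS 10 (an assumption, not part of the original answer), the call site can guard the Speech APIs with a runtime availability check, for example:

```objc
// Sketch: only start listening when the Speech framework is available.
// @available requires Xcode 9+; on older toolchains, check
// NSClassFromString(@"SFSpeechRecognizer") != nil instead.
if (@available(iOS 10.0, *)) {
    [self startListening];
} else {
    NSLog(@"Speech recognition requires iOS 10.0 or later");
}
```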

To run and test it, you only need a very basic UI: just create a UIButton and assign the microPhoneTapped action to it. When pressed, the app should start listening and log everything it hears through the microphone to the console (in the sample code NSLog is the only thing receiving the text). Pressing it again should stop the recording.
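If you prefer not to use a storyboard, a minimal sketch of that UI (the frame and title here are arbitrary assumptions) can be created in viewDidLoad:

```objc
// Create a button programmatically and wire it to the microPhoneTapped: action.
UIButton *micButton = [UIButton buttonWithType:UIButtonTypeSystem];
micButton.frame = CGRectMake(20.0, 100.0, 280.0, 44.0); // arbitrary placement
[micButton setTitle:@"Microphone" forState:UIControlStateNormal];
[micButton addTarget:self
              action:@selector(microPhoneTapped:)
    forControlEvents:UIControlEventTouchUpInside];
[self.view addSubview:micButton];
```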

I created a Github repo with a sample project, enjoy!

Comments:

With this code, the app sometimes closes during long audio recordings.
