iOS Speech Recognition

Author: 神SKY | Published 2019-07-04 10:33

Introduction

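Speech recognition is becoming more and more common in app development, and there is no shortage of third-party SDKs such as iFLYTEK and Baidu. This post briefly shows how to do it with the speech recognition that ships with iOS. The idea is simple: record audio, hand the audio data to the system speech recognition framework, and read the recognized text back out. Native speech recognition requires iOS 10.0 or later.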

Implementation

Add the configuration entries

In Info.plist, add the usage descriptions for speech recognition and for the microphone: Privacy - Speech Recognition Usage Description and Privacy - Microphone Usage Description.
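For readers who edit Info.plist as source rather than through the property list editor, the two display names correspond to the raw keys NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription. A minimal fragment might look like this; the description strings are placeholders and should be replaced with wording that fits your app:

```xml
<!-- Shown to the user when the app first asks for speech recognition access -->
<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to transcribe what you say.</string>
<!-- Shown to the user when the app first asks for microphone access -->
<key>NSMicrophoneUsageDescription</key>
<string>The microphone is used to record your speech.</string>
```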

Import the frameworks

#import <Speech/Speech.h>
#import <AVFoundation/AVFoundation.h>

Declare the properties

@property (strong, nonatomic) SFSpeechRecognizer *speechRecognizer;
@property (strong, nonatomic) SFSpeechRecognitionTask *recognitionTask;
@property (strong, nonatomic) SFSpeechAudioBufferRecognitionRequest *recognitionRequest;

@property (strong, nonatomic) AVAudioEngine *audioEngine;
@property (strong, nonatomic) AVAudioSession *audioSession;

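The first three are the iOS speech recognition classes, and the last two are audio classes. If you are interested in AVAudioEngine, take a look at Note 1 and Note 2; if you are interested in AVAudioSession, take a look here.
Next, initialize each of them lazily, as follows: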

- (SFSpeechRecognizer *)speechRecognizer {
    if (!_speechRecognizer) {
        NSLocale *local = [[NSLocale alloc]initWithLocaleIdentifier:@"zh_CN"];
        
        _speechRecognizer = [[SFSpeechRecognizer alloc]initWithLocale:local];
        _speechRecognizer.delegate = self;
    }
    return _speechRecognizer;
}

- (AVAudioEngine *)audioEngine {
    if (!_audioEngine) {
        _audioEngine = [[AVAudioEngine alloc]init];
    }
    return _audioEngine;
}

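- (AVAudioSession *)audioSession {
    if (!_audioSession) {
        _audioSession = [AVAudioSession sharedInstance];
        NSError *error = nil;
        // Configure the shared session for recording and log any step that fails.
        if (![_audioSession setCategory:AVAudioSessionCategoryRecord error:&error]) {
            NSLog(@"setCategory failed: %@", error);
        }
        if (![_audioSession setMode:AVAudioSessionModeMeasurement error:&error]) {
            NSLog(@"setMode failed: %@", error);
        }
        if (![_audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error]) {
            NSLog(@"setActive failed: %@", error);
        }
    }
    return _audioSession;
}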

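When initializing the recognizer I used zh_CN, which means the recognition language is Chinese. Replace it with another locale as needed. The following code lists the supported locales: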

for (NSLocale *temp in SFSpeechRecognizer.supportedLocales) {
    NSLog(@"Country code: %@, language code: %@, locale identifier: %@_%@", temp.countryCode, temp.languageCode, temp.languageCode, temp.countryCode);
}

The value printed after "locale identifier" is the one you can use.
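It can also help to guard against a locale that turns out not to be supported. The sketch below relies on initWithLocale: returning nil for an unsupported locale and on the recognizer's available property. The en_US identifier is only an example, and the log messages are illustrative:

```objc
#import <Speech/Speech.h>

// Example identifier only; substitute any value from supportedLocales.
NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US"];
SFSpeechRecognizer *recognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];

if (recognizer == nil) {
    // initWithLocale: returns nil when the locale is not supported.
    NSLog(@"Locale %@ is not supported", locale.localeIdentifier);
} else if (!recognizer.isAvailable) {
    // Supported, but recognition cannot be used right now (for example, no network).
    NSLog(@"Speech recognition for %@ is currently unavailable", locale.localeIdentifier);
}
```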

Request authorization

- (void)viewDidLoad {
    [super viewDidLoad];
   
    [self accessPermissions];
}

- (void)accessPermissions {
    // Speech recognition authorization
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        
        BOOL isEnabled = NO;
        NSString *str;
        switch (status) {
            case SFSpeechRecognizerAuthorizationStatusNotDetermined:
                isEnabled = NO;
                str = @"Recording Unavailable";
                NSLog(@"Not determined: the user has not made a choice yet");
                break;
            case SFSpeechRecognizerAuthorizationStatusDenied:
                isEnabled = NO;
                str = @"Recording Unavailable";
                NSLog(@"The user denied access to speech recognition");
                break;
            case SFSpeechRecognizerAuthorizationStatusRestricted:
                isEnabled = NO;
                str = @"Recording Unavailable";
                NSLog(@"Speech recognition is not supported on this device");
                break;
            case SFSpeechRecognizerAuthorizationStatusAuthorized:
                isEnabled = YES;
                str = @"Start Recording";
                NSLog(@"The user authorized speech recognition");
                break;
            default:
                break;
        }
        
        dispatch_async(dispatch_get_main_queue(), ^{
            self.recordButton.enabled = isEnabled;
            if (isEnabled) {
                [self.recordButton setTitle:str forState:UIControlStateNormal];
            }else {
                [self.recordButton setTitle:str forState:UIControlStateDisabled];
            }
        });
    }];
    
    // Microphone authorization
    [self.audioSession requestRecordPermission:^(BOOL granted) {
        dispatch_async(dispatch_get_main_queue(), ^{
            if (granted) {
                NSLog(@"Microphone access granted");
            } else {
                // Only disable here, so that a microphone grant cannot re-enable
                // the button after speech recognition has been denied.
                NSLog(@"Microphone access denied");
                self.recordButton.enabled = NO;
                [self.recordButton setTitle:@"Microphone Access Denied" forState:UIControlStateDisabled];
            }
        });
    }];
}

Speech recognition

Start recording

- (void)startRecording {
    if (_recognitionTask) {
        [_recognitionTask cancel];
        _recognitionTask = nil;
    }
    
    _recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc]init];
    AVAudioInputNode *inputNode = self.audioEngine.inputNode;
    _recognitionRequest.shouldReportPartialResults = YES;
    _recognitionTask = [self.speechRecognizer recognitionTaskWithRequest:_recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        BOOL isFinal = NO;
        if (result) {
            self.displayLabel.text = result.bestTranscription.formattedString;
            isFinal = result.isFinal;
        }
        
        if (error || isFinal) {
            [self.audioEngine stop];
            [inputNode removeTapOnBus:0];
            self.recognitionTask = nil;
            self.recognitionRequest = nil;
            self.recordButton.enabled = YES;
            [self.recordButton setTitle:@"Start Recording" forState:UIControlStateNormal];
        }
    }];
    
    AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];
    [inputNode removeTapOnBus:0];
    [inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        if (self.recognitionRequest) {
            [self.recognitionRequest appendAudioPCMBuffer:buffer];
        }
    }];
    
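    NSError *error = nil;
    [self.audioEngine prepare];
    if (![self.audioEngine startAndReturnError:&error]) {
        // Without a running engine no audio reaches the request, so bail out.
        NSLog(@"Failed to start the audio engine: %@", error);
        return;
    }
    self.displayLabel.text = @"Recording...";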
}

Stop recording

- (void)endRecording {
    [self.audioEngine stop];
    if (_recognitionRequest) {
        [_recognitionRequest endAudio];
    }
    
    if (_recognitionTask) {
        [_recognitionTask cancel];
        _recognitionTask = nil;
    }
    
    self.recordButton.enabled = NO;
    
    self.displayLabel.text = @"";
}
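One thing to be aware of: endRecording cancels the task right after calling endAudio, so a final result for the audio already captured may never reach the result handler. If you want to keep that last transcription, one possible variation is sketched below. The method name endRecordingKeepingResult is hypothetical; it only calls endAudio and leaves cleanup to the result handler from startRecording, which already resets the task and request when it sees a final result or an error.

```objc
// Hypothetical alternative to endRecording (not part of the original demo).
- (void)endRecordingKeepingResult {
    // Stop capturing audio and detach the tap that feeds the request.
    [self.audioEngine stop];
    [self.audioEngine.inputNode removeTapOnBus:0];

    // Tell the request that no more audio is coming. The task is not
    // cancelled, so the recognizer can still deliver a final result.
    [self.recognitionRequest endAudio];

    // Keep the button disabled until the result handler re-enables it.
    self.recordButton.enabled = NO;
}
```

This variation also leaves displayLabel untouched, so the recognized text stays on screen.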

The speech recognizer delegate

#pragma mark - SFSpeechRecognizerDelegate
- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    
    if (available) {
        NSLog(@"Start recording");
        [self.recordButton setTitle:@"Start Recording" forState:UIControlStateNormal];
    }else {
        NSLog(@"Speech recognition is unavailable");
        [self.recordButton setTitle:@"Speech Recognition Unavailable" forState:UIControlStateDisabled];
    }
    self.recordButton.enabled = available;
}
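The snippets in this post use recordButton and displayLabel and set the delegate to self, but do not show how those pieces are declared or how the button triggers recording. A minimal sketch of that wiring follows. The class name ViewController, the outlet types (UIButton and UILabel), and the action name recordButtonTapped: are all assumptions made for illustration, not taken from the original demo.

```objc
#import <UIKit/UIKit.h>
#import <Speech/Speech.h>

// Assumed class extension: adopts the delegate protocol and declares the
// two controls that the snippets refer to.
@interface ViewController () <SFSpeechRecognizerDelegate>
@property (weak, nonatomic) IBOutlet UIButton *recordButton;
@property (weak, nonatomic) IBOutlet UILabel *displayLabel;
@end

// Inside @implementation ViewController:
// hypothetical action connected to recordButton's touch-up-inside event.
- (IBAction)recordButtonTapped:(UIButton *)sender {
    if (self.audioEngine.isRunning) {
        // Currently recording: stop. endRecording disables the button.
        [self endRecording];
        [sender setTitle:@"Stopping" forState:UIControlStateDisabled];
    } else {
        // Not recording: start and let the same button act as "stop".
        [self startRecording];
        [sender setTitle:@"Stop Recording" forState:UIControlStateNormal];
    }
}
```

Toggling on audioEngine.isRunning keeps the button in step with the engine, with no separate recording flag to maintain.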

I hope this article is helpful. If you would like the demo, click here. (Note: speech recognition can only be debugged on a physical device.)
