I only recently discovered that the system already provides a speech-to-text API, so I tried it right away. It turns out to work quite well, and the speed is acceptable too. It is supported on iOS 10 and later, and requires both microphone and speech recognition permissions. Below is how to get it working (I haven't studied it in depth, but it is usable).
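Note that both usage-description keys must be present in the app's Info.plist, or the system will terminate the app the moment permission is requested:
NSMicrophoneUsageDescription: the reason the app needs the microphone
NSSpeechRecognitionUsageDescription: the reason the app needs speech recognition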
1. Import the System Framework
import Speech
2. The Main Classes
- SFSpeechRecognizer
The speech recognizer. langugeSimple is the locale identifier of the language to be recognized; see the previous post on speech synthesis for the supported identifiers.
SFSpeechRecognizer(locale: Locale(identifier: langugeSimple))
This returns a recognizer for the locale identifier you pass in; if the locale is not supported, it returns nil. See the official documentation for more details.
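For example, a quick way to probe locale support up front (the "en-US" identifier here is just an illustration):
import Speech

// The initializer returns nil when the locale is not supported
if let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")) {
    print("supported; currently available: \(recognizer.isAvailable)")
} else {
    print("en-US is not supported on this device")
}

// The full list of supported locales
print(SFSpeechRecognizer.supportedLocales().map { $0.identifier })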
Recognition results are obtained through the following method:
// Recognize speech utterance with a request
// If request.shouldReportPartialResults is true, result handler will be called
// repeatedly with partial results, then finally with a final result or an error.
open func recognitionTask(with request: SFSpeechRecognitionRequest, resultHandler: @escaping (SFSpeechRecognitionResult?, Error?) -> Void) -> SFSpeechRecognitionTask
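As the header comment says, with request.shouldReportPartialResults set to true the handler fires repeatedly before the final result. A minimal sketch of consuming it, assuming a recognizer and request set up as described below:
request.shouldReportPartialResults = true
// Keep a reference to the task so it can be canceled or finished later
let task = recognizer.recognitionTask(with: request) { (result, error) in
    if let result = result {
        // Each callback carries the best transcription so far
        print(result.bestTranscription.formattedString)
        if result.isFinal {
            print("final result received")
        }
    } else if let error = error {
        print("recognition error: \(error)")
    }
}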
- AVAudioEngine
Dedicated to processing the audio data. I won't go into the details here; this is the code as used:
lazy var audioEngine: AVAudioEngine = {
let audioEngine = AVAudioEngine()
audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: audioEngine.inputNode.outputFormat(forBus: 0)) { (buffer, audioTime) in
// Append an AVAudioPCMBuffer to the recognition request to collect the audio data
self.recognitionRequest.append(buffer)
}
return audioEngine
}()
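Note that a given bus can hold only one tap; installing it inside the lazy initializer, as above, guarantees that happens exactly once. If you restructure this to install a fresh tap per session, remove the previous one first when recording stops, along the lines of:
audioEngine.stop()
audioEngine.inputNode.removeTap(onBus: 0)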
- SFSpeechAudioBufferRecognitionRequest
The speech recognition request; it receives the audio buffers and hands them to the recognizer.
// Speech recognition request
lazy var recognitionRequest: SFSpeechAudioBufferRecognitionRequest = {
let recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
return recognitionRequest
}()
- SFSpeechRecognitionTask
The speech recognition task; recognition is started and stopped through it.
public enum SFSpeechRecognitionTaskState : Int {
case starting // Speech processing (potentially including recording) has not yet begun
case running // Speech processing (potentially including recording) is running
case finishing // No more audio is being recorded, but more recognition results may arrive
case canceling // No more recognition results will arrive, but recording may not have stopped yet
case completed // No more results will arrive, and recording is stopped.
}
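The distinction between finishing and canceling matters when stopping: finish() stops accepting audio but still delivers pending results, while cancel() discards them. A small sketch, assuming speechTask holds the current task:
if speechTask?.state == .running {
    speechTask?.finish() // stop listening, keep pending results
    // speechTask?.cancel() // or drop everything immediately
}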
3. The Complete Code
import UIKit
import Speech
enum LGSpeechType: Int {
case start
case stop
case finished
case authDenied
}
typealias LGSpeechBlock = (_ speechType: LGSpeechType, _ finalText: String?) -> Void
@available(iOS 10.0, *)
class LGSpeechManager: NSObject {
private var parentVc: UIViewController!
private var speechTask: SFSpeechRecognitionTask?
// Speech recognizer
private var speechRecognizer: SFSpeechRecognizer?
static let share = LGSpeechManager()
private var block: LGSpeechBlock?
// Speech recognition request
lazy var recognitionRequest: SFSpeechAudioBufferRecognitionRequest = {
let recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
return recognitionRequest
}()
lazy var audioEngine: AVAudioEngine = {
let audioEngine = AVAudioEngine()
audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: audioEngine.inputNode.outputFormat(forBus: 0)) { (buffer, audioTime) in
// Append an AVAudioPCMBuffer to the recognition request to collect the audio data
self.recognitionRequest.append(buffer)
}
return audioEngine
}()
func lg_startSpeech(speechVc: UIViewController, langugeSimple: String, speechBlock: @escaping LGSpeechBlock) {
parentVc = speechVc
block = speechBlock
lg_checkmicroPhoneAuthorization { (microStatus) in
if microStatus {
self.lg_checkRecognizerAuthorization(recongStatus: { (recStatus) in
if recStatus {
// Prepare the audio engine (allocates the resources it needs before starting)
self.audioEngine.prepare()
if self.speechTask?.state == .running { // a recognition task is already running
// Stop dictation
self.lg_stopDictating()
} else { // no task is running, so start one
self.speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: langugeSimple))
guard self.speechRecognizer != nil else {
self.showAlert("Sorry, speech input is not supported for the current locale")
return
}
self.lg_setCallBack(type: .start, text: nil)
// Start dictation
self.lg_startDictating()
}
} else {
self.showAlert("您已取消授权使用语音识别,如果需要使用语音识别功能,可以到设置中重新开启!")
self.lg_setCallBack(type: .authDenied, text: nil)
}
})
} else {
// Microphone permission denied
self.showAlert("You have declined microphone access. To use speech recognition, re-enable it in Settings.")
self.lg_setCallBack(type: .authDenied, text: nil)
}
}
}
}
@available(iOS 10.0, *)
extension LGSpeechManager: SFSpeechRecognitionTaskDelegate {
// Check speech recognition authorization
private func lg_checkRecognizerAuthorization(recongStatus: @escaping (_ resType: Bool) -> Void) {
let authorStatus = SFSpeechRecognizer.authorizationStatus()
if authorStatus == .authorized {
recongStatus(true)
} else if authorStatus == .notDetermined {
SFSpeechRecognizer.requestAuthorization { (status) in
if status == .authorized {
recongStatus(true)
} else {
recongStatus(false)
}
}
} else {
recongStatus(false)
}
}
// Check microphone authorization
private func lg_checkmicroPhoneAuthorization(authoStatus: @escaping (_ resultStatus: Bool) -> Void) {
let microPhoneStatus = AVCaptureDevice.authorizationStatus(for: .audio)
if microPhoneStatus == .authorized {
authoStatus(true)
} else if microPhoneStatus == .notDetermined {
AVCaptureDevice.requestAccess(for: .audio, completionHandler: {(res) in
if res {
authoStatus(true)
} else {
authoStatus(false)
}
})
} else {
authoStatus(false)
}
}
// Start dictating
private func lg_startDictating() {
do {
try audioEngine.start()
speechTask = speechRecognizer!.recognitionTask(with: recognitionRequest) { (speechResult, error) in
// Handle the recognition result
guard let speechResult = speechResult else {
return
}
self.lg_setCallBack(type: .finished, text: speechResult.bestTranscription.formattedString)
}
} catch {
print(error)
self.lg_setCallBack(type: .finished, text: nil)
}
}
// Stop the audio engine and end the speech recognition request
func lg_stopDictating() {
lg_setCallBack(type: .stop, text: nil)
audioEngine.stop()
recognitionRequest.endAudio()
speechTask?.cancel()
}
private func lg_setCallBack(type: LGSpeechType, text: String?) {
block?(type, text)
}
private func showAlert(_ message: String) {
let alertVC = UIAlertController(title: nil, message: message, preferredStyle: .alert)
let firstAction = UIAlertAction(title: "OK", style: .default, handler: nil)
alertVC.addAction(firstAction)
parentVc.present(alertVC, animated: true, completion: nil)
}
}
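Finally, a hypothetical call site, for example from a view controller (the "zh-CN" locale identifier is just an illustration):
LGSpeechManager.share.lg_startSpeech(speechVc: self, langugeSimple: "zh-CN") { (speechType, finalText) in
    switch speechType {
    case .finished:
        print("recognized text: \(finalText ?? "")")
    case .authDenied:
        print("permission was denied")
    default:
        break
    }
}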