Introduction to SpeechRecognition:
The SpeechRecognition package bundles interfaces to several speech recognition engines:
recognize_bing(): Microsoft Bing Speech
recognize_google(): Google Web Speech API
recognize_google_cloud(): Google Cloud Speech (requires installation of the google-cloud-speech package)
recognize_houndify(): Houndify by SoundHound
recognize_ibm(): IBM Speech to Text
recognize_sphinx(): CMU Sphinx (requires installing PocketSphinx)
recognize_wit(): Wit.ai
speechrecognition.recognize_sphinx():
Works offline; no network connection is required.
Module installation (pocketsphinx is required for recognize_sphinx):
pip install pocketsphinx
pip install speechrecognition
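After installing, it can help to confirm that both packages are visible to the interpreter you will actually run. The following is a minimal sketch using only the standard library; the helper name missing_modules is made up for illustration. Note that the import names differ from the pip names: speechrecognition is imported as speech_recognition.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # These are import names, not pip package names.
    missing = missing_modules(["speech_recognition", "pocketsphinx"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Both modules are importable.")
```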
Audio format:
[Image: audio format details]
Test code:
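# -*- coding: GBK -*-
import speech_recognition as sr  # load the package


def wav2txt(wavfilepath, str_language):
    """Transcribe an audio file offline with CMU Sphinx and print the text."""
    r = sr.Recognizer()
    # Read the whole file into an AudioData object
    with sr.AudioFile(wavfilepath) as src:
        audio = r.record(src)
    print(r.recognize_sphinx(audio, language=str_language))


filePath1 = r'16k.wav'
filePath2 = r'audio-file.flac'
# Only the English model is available by default; the Chinese model must be installed separately
wav2txt(filePath1, "zh-CN")
wav2txt(filePath2, "en-US")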
Test results:
[Image: recognition output]
Test files:
16k.wav: Chinese, hosted on GitHub
audio-file.flac: English, available for download
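Before feeding a WAV file to the recognizer, it may be worth a quick look at its header. The sketch below uses the standard wave module; the helper names describe_wav and looks_like_16k_mono are invented here. The 16 kHz, mono, 16-bit target is an assumption based on the file name 16k.wav and on the format CMU Sphinx models commonly use, not something stated in this post. The wave module reads PCM WAV only, so it does not apply to audio-file.flac.

```python
import wave


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic parameters."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": wav.getnframes() / wav.getframerate(),
        }


def looks_like_16k_mono(info):
    """True for 16 kHz, mono, 16-bit audio (an assumed target format)."""
    return (
        info["channels"] == 1
        and info["sample_rate"] == 16000
        and info["sample_width_bytes"] == 2
    )
```

With the test file in the working directory, print(describe_wav("16k.wav")) shows the values to compare.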
Chinese model installation:
Go to https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Mandarin/
Download cmusphinx-zh-cn-5.2.tar.gz
Directory structure after extraction:
+--cmusphinx-zh-cn-5.2
| +--README
| +--zh_cn.cd_cont_5000
| | +--feat.params
| | +--feature_transform
| | +--mdef
| | +--means
| | +--mixture_weights
| | +--noisedict
| | +--transition_matrices
| | +--variances
| +--zh_cn.dic
| +--zh_cn.lm.bin
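If you prefer to unpack the archive from Python, a rough sketch with the standard tarfile module follows. The function names extract_model and missing_entries are hypothetical, and the list of expected entries is simply copied from the directory listing above.

```python
import tarfile
from pathlib import Path

# Entries expected inside the extracted folder, taken from the listing above.
EXPECTED = ["zh_cn.cd_cont_5000", "zh_cn.dic", "zh_cn.lm.bin"]


def extract_model(archive, dest):
    """Extract a .tar.gz archive into dest and return dest as a Path."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions can reject unsafe archive members.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest


def missing_entries(model_dir, expected=EXPECTED):
    """List the expected entries that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).exists()]
```

For example, extract_model("cmusphinx-zh-cn-5.2.tar.gz", ".") followed by missing_entries("cmusphinx-zh-cn-5.2") should return an empty list if the layout matches.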
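Rename some files and folders:
cmusphinx-zh-cn-5.2 -> zh-CN
zh_cn.cd_cont_5000 -> acoustic-model
zh_cn.lm.bin -> language-model.lm.bin
zh_cn.dic -> pronounciation-dictionary.dict (keep this spelling; it is the file name the library looks for)
Directory structure after renaming: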
+--zh-CN
| +--acoustic-model
| | +--feat.params
| | +--feature_transform
| | +--mdef
| | +--means
| | +--mixture_weights
| | +--noisedict
| | +--transition_matrices
| | +--variances
| +--language-model.lm.bin
| +--pronounciation-dictionary.dict
| +--README
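The four renames can also be scripted. This is only a sketch with pathlib; rename_model is a made-up helper, and it assumes the extracted folder still has its original name and sits directly under the given parent directory.

```python
from pathlib import Path

# Old name -> new name inside the model folder, as listed above.
INNER_RENAMES = {
    "zh_cn.cd_cont_5000": "acoustic-model",
    "zh_cn.lm.bin": "language-model.lm.bin",
    "zh_cn.dic": "pronounciation-dictionary.dict",
}


def rename_model(parent, old_name="cmusphinx-zh-cn-5.2", new_name="zh-CN"):
    """Apply the renames inside parent/old_name, then rename the folder itself."""
    parent = Path(parent)
    model_dir = parent / old_name
    for old, new in INNER_RENAMES.items():
        (model_dir / old).rename(model_dir / new)
    target = parent / new_name
    model_dir.rename(target)
    return target
```

Calling rename_model(".") in the directory that contains cmusphinx-zh-cn-5.2 should leave a zh-CN folder with the layout shown above.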
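The model is ready, so where does it go? The error message tells you: pass any language name that does not exist so that the program fails, for example wav2txt(filePath1,"Where"):
[Image: error message showing the path where the language data was expected]
Put the zh-CN folder in the directory where the error message expected to find Where.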
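As an alternative to reading the path off the error message, the location can be guessed programmatically. The sketch below assumes the library keeps its Sphinx language data in a folder named pocketsphinx-data next to its own __init__.py; that is an assumption you should check against the path in your own error message. Both helper names are invented for this example.

```python
import importlib.util
import shutil
from pathlib import Path


def pocketsphinx_data_dir(package="speech_recognition"):
    """Return <package dir>/pocketsphinx-data, or None if the package is absent."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent / "pocketsphinx-data"


def install_language(model_dir, data_dir):
    """Copy a prepared language folder (for example zh-CN) into data_dir."""
    model_dir = Path(model_dir)
    target = Path(data_dir) / model_dir.name
    # copytree raises FileExistsError if the language folder is already there.
    shutil.copytree(model_dir, target)
    return target
```

A typical call would be install_language("zh-CN", pocketsphinx_data_dir()), after confirming that the returned directory matches the one in the error message.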