2020-02-20 Speech recognition with speechrecognition

Author: 菜菜笛 | Published 2020-02-20 00:24

    Introduction to speechrecognition:

    The speechrecognition package wraps several speech recognition engines and APIs behind one interface:
    recognize_bing(): Microsoft Bing Speech
    recognize_google(): Google Web Speech API
    recognize_google_cloud(): Google Cloud Speech (requires the google-cloud-speech package)
    recognize_houndify(): Houndify by SoundHound
    recognize_ibm(): IBM Speech to Text
    recognize_sphinx(): CMU Sphinx (requires PocketSphinx)
    recognize_wit(): Wit.ai

    speechrecognition.recognize_sphinx():

    Works offline; no network connection is needed.

    Installation (recognize_sphinx() needs pocketsphinx in addition to speechrecognition):

    pip install pocketsphinx 
    pip install speechrecognition
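
    After installing, a quick check that both packages can be imported may save some confusion later. The sketch below only uses the standard library; the helper name missing_modules is made up for this example, and the import names speech_recognition and pocketsphinx are the module names, not the pip names.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages are visible to this interpreter.
print(missing_modules(["speech_recognition", "pocketsphinx"]))
```

    If either name is printed, the pip install probably went into a different Python environment than the one running the script.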
    

    Audio format:

    [image not preserved in the source]
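
    Since the format details are only given as an image, here is a small sketch for checking what a WAV file actually contains, using Python's standard wave module. It assumes the target is 16 kHz, mono, 16-bit PCM, which is what the test file name 16k.wav suggests and what CMU Sphinx acoustic models commonly expect; that target is an assumption, not something stated in the text. The helper names describe_wav and write_silence are made up for this example, and the silent file is only a stand-in for a real recording.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return the basic PCM parameters of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def write_silence(path, seconds=1, rate=16000):
    """Write a mono 16-bit WAV of silence (a stand-in file for the demo)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)


demo_path = os.path.join(tempfile.mkdtemp(), "demo_16k.wav")
write_silence(demo_path)
info = describe_wav(demo_path)
print(info)
```

    Running describe_wav() on your own recording shows whether it needs to be resampled or downmixed before recognition.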

    Test code:

    import speech_recognition as sr  # load the package

    def wav2txt(wavfilepath, str_language):
        """Transcribe an audio file offline with CMU Sphinx and print the text."""
        r = sr.Recognizer()
        with sr.AudioFile(wavfilepath) as src:
            audio = r.record(src)  # read the whole file into memory
        print(r.recognize_sphinx(audio, language=str_language))

    filePath1 = r'16k.wav'
    filePath2 = r'audio-file.flac'
    # Only the English model ships by default; the Chinese model must be installed separately
    wav2txt(filePath1, "zh-CN")
    wav2txt(filePath2, "en-US")
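
    The test code lets any exception propagate. A slightly more defensive variant might catch the two exception types that the package defines, UnknownValueError (no usable result) and RequestError (for example, PocketSphinx or the language data is missing). This is only a sketch: the function name transcribe_offline and the sr_module parameter are made up for this example, and sr_module exists purely so the function can be exercised with a stand-in object when the real package is not installed.

```python
from typing import Optional


def transcribe_offline(path: str, language: str = "en-US", sr_module=None) -> Optional[str]:
    """Return the recognized text, or None if recognition fails."""
    if sr_module is None:
        # Imported lazily so the function can be defined without the package.
        import speech_recognition as sr_module

    recognizer = sr_module.Recognizer()
    with sr_module.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file

    try:
        return recognizer.recognize_sphinx(audio, language=language)
    except sr_module.UnknownValueError:
        print("Sphinx could not understand the audio")
    except sr_module.RequestError as exc:
        # Raised, for example, when the language data directory is missing.
        print(f"Sphinx error: {exc}")
    return None
```

    With the package installed, transcribe_offline("16k.wav", "zh-CN") would return the text instead of printing it, or None after reporting the problem.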
    

    Test result:

    [image not preserved in the source]

    Test files:

    16k.wav: Chinese (GitHub link in the original post)
    audio-file.flac: English (download link in the original post)

    Installing the Chinese model:

    Go to https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Mandarin/
    Download cmusphinx-zh-cn-5.2.tar.gz
    Directory structure after extracting:

    +--cmusphinx-zh-cn-5.2
    | +--README
    | +--zh_cn.cd_cont_5000
    | | +--feat.params
    | | +--feature_transform
    | | +--mdef
    | | +--means
    | | +--mixture_weights
    | | +--noisedict
    | | +--transition_matrices
    | | +--variances
    | +--zh_cn.dic
    | +--zh_cn.lm.bin

    Rename some of the files and folders:

    cmusphinx-zh-cn-5.2 -> zh-CN
    zh_cn.cd_cont_5000 -> acoustic-model
    zh_cn.lm.bin -> language-model.lm.bin
    zh_cn.dic -> pronounciation-dictionary.dict

    Directory structure after renaming:

    +--zh-CN
    | +--acoustic-model
    | | +--feat.params
    | | +--feature_transform
    | | +--mdef
    | | +--means
    | | +--mixture_weights
    | | +--noisedict
    | | +--transition_matrices
    | | +--variances
    | +--language-model.lm.bin
    | +--pronounciation-dictionary.dict
    | +--README
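
    The renaming above can also be scripted. The following is a minimal sketch using pathlib; the function name rename_model is made up for this example, and the demo builds an empty throwaway copy of the layout in a temporary folder rather than touching a real download. The spelling pronounciation-dictionary.dict is kept exactly as listed above.

```python
import tempfile
from pathlib import Path

# (old name, new name) pairs, taken from the rename list above
RENAMES = [
    ("zh_cn.cd_cont_5000", "acoustic-model"),
    ("zh_cn.lm.bin", "language-model.lm.bin"),
    ("zh_cn.dic", "pronounciation-dictionary.dict"),
]


def rename_model(extracted: Path, language: str = "zh-CN") -> Path:
    """Rename the extracted model in place and return the new top-level folder."""
    for old, new in RENAMES:
        (extracted / old).rename(extracted / new)
    target = extracted.with_name(language)
    extracted.rename(target)
    return target


# Demo on an empty stand-in for the extracted archive
root = Path(tempfile.mkdtemp())
src = root / "cmusphinx-zh-cn-5.2"
(src / "zh_cn.cd_cont_5000").mkdir(parents=True)
(src / "zh_cn.cd_cont_5000" / "mdef").touch()
for name in ("README", "zh_cn.dic", "zh_cn.lm.bin"):
    (src / name).touch()

model_dir = rename_model(src)
print(sorted(p.name for p in model_dir.iterdir()))
```

    For a real download, pass the path of the extracted cmusphinx-zh-cn-5.2 folder to rename_model() instead of the stand-in.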

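    The model is ready, but where does it go? The error message tells you: pass a language name that does not exist so that the program fails, for example wav2txt(filePath1,"Where"):


    [image not preserved in the source]

    Put the zh-CN folder in the directory where the error message says it looked for Where.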
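
    As a cross-check on the error message, the likely location can also be computed. This sketch rests on an assumption that is not stated in the text above: that the installed speech_recognition package keeps its Sphinx models in a pocketsphinx-data folder next to its own source files. The function name guess_sphinx_data_dir is made up for this example; if its result disagrees with the error message, trust the error message.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def guess_sphinx_data_dir(package: str = "speech_recognition") -> Optional[Path]:
    """Guess <package folder>/pocketsphinx-data, or None if the package is not found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or not a package with its own folder
    package_dir = Path(list(spec.submodule_search_locations)[0])
    # Assumption: language folders such as en-US and zh-CN live in this subfolder.
    return package_dir / "pocketsphinx-data"


print(guess_sphinx_data_dir())
```

    If the printed folder exists and already contains an en-US folder, that is a good sign that zh-CN belongs beside it.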

    All of this content comes from the original article, with thanks to its author; this post is only a record of my own learning process.
