Series navigation
This article: "iOS-PocketSphinx: Installing PocketSphinx"
"iOS-PocketSphinx: Building an SDK for iOS"
System environment
macOS 10.15.6
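CMUSphinx toolkit overview
- Sphinxbase: the support library that Pocketsphinx depends on; it mainly handles feature extraction from the speech signal.
- Pocketsphinx: a lightweight recognition library written in C; it performs the actual recognition.
- Sphinxtrain: the acoustic model training tools.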
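Installing PocketSphinx
PocketSphinx depends on another library, SphinxBase, which provides functionality shared by all CMUSphinx projects. To install Pocketsphinx you therefore need to install both Pocketsphinx and Sphinxbase.
Place the sphinxbase and pocketsphinx source directories side by side in the same parent directory.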
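Before building, it can help to confirm that both source trees are in place and that the build tools are on the PATH. The following is a minimal sketch, not part of the original steps: missing_tools and missing_trees are helper names introduced here, and the tool names in the usage comment follow the dependency list in the official tutorial, so the exact set needed on your system is an assumption.

```shell
# Print every named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Print every expected source directory that is absent under the parent ($1).
missing_trees() {
    parent=$1
    shift
    for tree in "$@"; do
        [ -d "$parent/$tree" ] || echo "$tree"
    done
}

# Example usage (assumed tool list; adjust to your setup):
#   missing_tools autoconf automake libtool bison swig make
#   missing_trees . sphinxbase pocketsphinx
```

Both helpers print nothing when everything is found, so empty output means the layout and tools look ready.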
First build and install SphinxBase (it must be installed before Pocketsphinx):
$ cd sphinxbase
$ ./autogen.sh
$ make
$ sudo make install
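Then build and install Pocketsphinx with the same steps, starting again from the parent directory that contains both source trees: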
$ cd pocketsphinx
$ ./autogen.sh
$ make
$ sudo make install
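The two build sequences above are identical apart from the directory name, so they can be wrapped in one helper. This is only a sketch: run, build_tree, and the DRY_RUN variable are names introduced here for illustration. Each build runs in a subshell, so the caller always stays in the parent directory and the order (sphinxbase first) is explicit.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical preview switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build and install one source tree; $1 is its directory name.
# The parentheses start a subshell, so the cd does not leak to the caller.
build_tree() {
    (
        run cd "$1" &&
        run ./autogen.sh &&
        run make &&
        run sudo make install
    )
}

# Example usage from the parent directory (sphinxbase must come first):
#   build_tree sphinxbase && build_tree pocketsphinx
```

Setting DRY_RUN=1 before the call prints the four commands without executing them, which is a cheap way to check the sequence before invoking sudo.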
Installing SphinxTrain
$ cd sphinxtrain
$ ./autogen.sh
$ make
$ sudo make install
Testing PocketSphinx
Run pocketsphinx_continuous -inmic yes to check whether it recognizes the words you speak into the microphone:
$ pocketsphinx_continuous -inmic yes
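The log shows Ready.... followed by Listening...
Say "hello" into the microphone. The terminal prints the recognized text "hello" as the last line of the output, so the test succeeded: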
INFO: continuous.c(275): Ready....
INFO: continuous.c(261): Listening...
INFO: cmn_live.c(120): Update from < 15.47 11.52 -23.59 5.01 -7.99 -10.39 3.40 -10.32 6.99 2.10 -3.12 3.31 -4.84 >
INFO: cmn_live.c(138): Update to < 17.67 16.51 -23.53 6.89 -6.99 -10.99 4.08 -9.21 8.38 2.91 -1.46 4.42 -5.29 >
INFO: ngram_search_fwdtree.c(1550): 3949 words recognized (43/fr)
INFO: ngram_search_fwdtree.c(1552): 347666 senones evaluated (3821/fr)
INFO: ngram_search_fwdtree.c(1556): 2459122 channels searched (27023/fr), 61062 1st, 148131 last
INFO: ngram_search_fwdtree.c(1559): 7585 words for which last channels evaluated (83/fr)
INFO: ngram_search_fwdtree.c(1561): 213126 candidate words for entering last phone (2342/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 1.12 CPU 1.227 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 2.46 wall 2.704 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 173 words
INFO: ngram_search_fwdflat.c(948): 2874 words recognized (32/fr)
INFO: ngram_search_fwdflat.c(950): 128628 senones evaluated (1413/fr)
INFO: ngram_search_fwdflat.c(952): 264648 channels searched (2908/fr)
INFO: ngram_search_fwdflat.c(954): 12085 words searched (132/fr)
INFO: ngram_search_fwdflat.c(957): 8934 word transitions (98/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.12 CPU 0.137 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.13 wall 0.138 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.57
INFO: ngram_search.c(1276): Eliminated 1 nodes before end node
INFO: ngram_search.c(1381): Lattice has 483 nodes, 4953 links
INFO: ps_lattice.c(1376): Bestpath score: -2866
INFO: ps_lattice.c(1380): Normalizer P(O) = alpha(</s>:57:89) = -216645
INFO: ps_lattice.c(1437): Joint P(O,S) = -247947 P(S|O) = -31302
INFO: ngram_search.c(872): bestpath 0.02 CPU 0.019 xRT
INFO: ngram_search.c(875): bestpath 0.02 wall 0.019 xRT
hello
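In a capture like the one above, the recognized text is easy to miss among the log lines. A minimal sketch for pulling it out of saved terminal output follows; hypotheses_only is a name introduced here, and the filter assumes that every log line starts with a level prefix such as INFO: (the WARN, ERROR, FATAL, and DEBUG prefixes are assumptions based on that pattern).

```shell
# Read captured output on stdin and print only the lines that are not
# log messages, i.e. the recognized text.
hypotheses_only() {
    grep -Ev '^(INFO|WARN|ERROR|FATAL|DEBUG): ' | grep -v '^[[:space:]]*$'
}

# Example usage with a saved capture (session.log is a hypothetical file name):
#   hypotheses_only < session.log
```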
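Testing recognition of an audio file
Change to the directory that contains the audio file, then use pocketsphinx_continuous to recognize 005.wav and write the result to the text file audio.result. The audio file must be mono and sampled at 16 kHz.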
$ cd /Users/.../pocketsphinx/test/data/cards
$ pocketsphinx_continuous -infile 005.wav > audio.result
Open audio.result; it contains:
eight of spades for up close seven of hearts
Recognition succeeded.
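Because the input must be mono at 16 kHz, it is worth checking a file before passing it to -infile. The sketch below reads the format fields straight from the WAV header; wav_format is a name introduced here, and it assumes the canonical RIFF layout in which the "fmt " chunk starts at byte offset 12. Files with extra chunks before "fmt " would need a real audio tool instead.

```shell
# Print "<channels> <sample_rate> <bits_per_sample>" for the WAV file in $1.
# Assumption: canonical header, so channels sit at byte 22, the sample
# rate at byte 24, and bits per sample at byte 34 (all little-endian).
wav_format() {
    # Dump header bytes 22..35 as unsigned decimals into $1..$14.
    set -- $(od -An -v -tu1 -j22 -N14 "$1")
    [ $# -ge 14 ] || return 1   # file too short to hold a full header
    channels=$(( $1 + $2 * 256 ))
    rate=$(( $3 + $4 * 256 + $5 * 65536 + $6 * 16777216 ))
    bits=$(( ${13} + ${14} * 256 ))
    echo "$channels $rate $bits"
}

# Example usage:
#   wav_format 005.wav
```

A file that meets the requirement stated above reports 1 channel and a rate of 16000. If a file reports something else, a converter such as ffmpeg can resample it, for example ffmpeg -i input.wav -ar 16000 -ac 1 output.wav (input.wav and output.wav are placeholder names).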
Installing other tools
"iOS-PocketSphinx: The Bumpy Road to Installing TensorFlow"
"iOS-PocketSphinx: Installing g2p-seq2seq"
References:
Building an application with PocketSphinx (official tutorial): https://cmusphinx.github.io/wiki/tutorialpocketsphinx/
CMUSphinx documentation: https://cmusphinx.github.io/wiki/