A dictionary should contain all the words you are interested in; otherwise the recognizer cannot recognize them. Having a word in the dictionary is not sufficient, though. The recognizer looks up each word in both the dictionary and the language model, so a word that is missing from the language model will not be recognized even if it is present in the dictionary.

There is no need to remove unused words from the dictionary unless you want to save memory. Extra words in the dictionary do not affect accuracy.
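Since a word has to be present in both places, it can help to check a vocabulary against the dictionary before running the recognizer. Below is a minimal sketch, assuming the dictionary uses the CMU-style layout of one "word PH1 PH2 ..." entry per line, with alternate pronunciations written as word(2). The function names and sample data are made up for illustration and are not part of PocketSphinx.

```python
def load_dictionary_words(lines):
    """Collect the words from CMU-style dictionary lines ("word PH1 PH2 ...")."""
    words = set()
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word = parts[0]
        # Alternate pronunciations are assumed to look like word(2), word(3), ...
        if word.endswith(")") and "(" in word:
            word = word[: word.index("(")]
        words.add(word.lower())
    return words


def missing_from_dictionary(vocabulary, dictionary_words):
    """Return the vocabulary words that have no dictionary entry."""
    return sorted(w for w in vocabulary if w.lower() not in dictionary_words)


# Hypothetical sample data; in practice read these from your own files.
sample_dict = ["hello HH AH L OW", "hello(2) HH EH L OW", "world W ER L D"]
sample_vocab = ["hello", "world", "sphinx"]

dictionary_words = load_dictionary_words(sample_dict)
missing = missing_from_dictionary(sample_vocab, dictionary_words)
print(missing)
```

Any word this reports needs a pronunciation added to the dictionary. The reverse check (dictionary words missing from the language model) is not needed, because extra dictionary words are harmless.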

Using existing models

A number of dictionaries cover the languages CMUSphinx supports: CMUDict for US English, plus separate dictionaries for French, German, Russian, Dutch, Italian, Spanish and Mandarin. (CMUDict itself is the CMU Pronouncing Dictionary and covers only US English; it does not include Chinese.)

Extending the dictionary with g2p-seq2seq

An English model (a 2-layer LSTM with 512 hidden units) is available for download on the CMUSphinx website. Unpack the model after downloading. It was trained on the CMU English dictionary, so it works only for English. For other languages you first need to bootstrap a dictionary as described below and then use the G2P tool to extend it.

Start with the English model to see how the G2P tool works. The simplest way is to run it interactively and type in words:

$ g2p-seq2seq --interactive --model_dir model_folder_path
...
> hello
...
HH EH L OW
...
>

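Here model_folder_path is the path to the English model downloaded above. As shown, after running the first command, typing hello returns the corresponding phonemes HH EH L OW.

To generate pronunciations for a list of English words with a trained model (the word list is a text file with one word per line), run: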

$ g2p-seq2seq --decode your_wordlist --model_dir model_folder_path --output decode_output_file_path
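Once the pronunciations are generated, they still have to be merged into the dictionary. Here is a rough sketch of that step. It assumes the decode output file holds one "word PH1 PH2 ..." entry per line (check your own output file, since I have not confirmed the exact layout for every version), and merge_pronunciations is a hypothetical helper, not something the G2P tool provides. Existing hand-checked entries win over generated ones.

```python
def parse_entries(lines):
    """Turn "word PH1 PH2 ..." lines into (word, phones) pairs."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # ignore blank or malformed lines
            entries.append((parts[0], " ".join(parts[1:])))
    return entries


def merge_pronunciations(existing_lines, generated_lines):
    """Add generated entries to a dictionary without overriding existing ones."""
    merged = dict(parse_entries(existing_lines))
    for word, phones in parse_entries(generated_lines):
        merged.setdefault(word, phones)  # keep the existing entry if present
    return [f"{word} {phones}" for word, phones in sorted(merged.items())]


# Hypothetical sample data standing in for the two files.
existing = ["hello HH AH L OW"]
generated = ["hello HH EH L OW", "sphinx S F IH NG K S"]

merged = merge_pronunciations(existing, generated)
for line in merged:
    print(line)
```

Keeping the existing entry on a conflict is a deliberate choice here: generated pronunciations are guesses, while bootstrapped entries were written by hand.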

To evaluate the word error rate of a trained model, run:

$ g2p-seq2seq --evaluate your_test_dictionary --model_dir model_folder_path
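To make the number easier to interpret, here is a small sketch of how a word error rate for G2P is commonly defined: the share of test words whose predicted pronunciation matches none of the reference pronunciations. This is my own illustration with made-up data; the tool's internal accounting may differ in its details.

```python
def word_error_rate(reference, predicted):
    """Fraction of reference words whose prediction matches no reference pronunciation.

    reference: dict mapping word -> set of acceptable pronunciations
    predicted: dict mapping word -> single predicted pronunciation
    """
    if not reference:
        raise ValueError("reference dictionary is empty")
    errors = sum(
        1
        for word, pronunciations in reference.items()
        if predicted.get(word) not in pronunciations  # a missing prediction counts as an error
    )
    return errors / len(reference)


# Hypothetical test dictionary and model output.
reference = {
    "hello": {"HH AH L OW", "HH EH L OW"},
    "world": {"W ER L D"},
    "sphinx": {"S F IH NG K S"},
    "model": {"M AA D AH L"},
}
predicted = {
    "hello": "HH EH L OW",
    "world": "W ER L D",
    "sphinx": "S P IH NG K S",  # wrong on purpose
    "model": "M AA D AH L",
}

wer = word_error_rate(reference, predicted)
print(f"WER: {wer:.2%}")
```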

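Bootstrapping a Chinese dictionary

A dictionary is usually bootstrapped with hand-written rules. You can look up the phoneme list for Chinese (Mandarin) on Baidu or Wikipedia and map words to phonemes.

Transcribing a few thousand of the most common words is enough to bootstrap the dictionary.

Once the dictionary is bootstrapped, you can extend it with the g2p-seq2seq tool to cover a larger vocabulary.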
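As an illustration of what a hand-written rule can look like, the sketch below splits toneless pinyin syllables into an initial and a final and writes them out as a dictionary entry. I have not put this into practice. The phone names are purely illustrative and must be replaced with the phone set of whatever acoustic model you use, and treating y and w as initials is a simplification.

```python
# Pinyin initials, two-letter ones first so "zh" is tried before "z".
INITIALS = (
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w",
)


def syllable_to_phones(syllable):
    """Split a toneless pinyin syllable into [initial, final], or [final] alone."""
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]  # no initial, e.g. "an"


def dictionary_entry(word, pinyin_syllables):
    """Build one "word PH1 PH2 ..." dictionary line from a word's pinyin."""
    phones = []
    for syllable in pinyin_syllables:
        phones.extend(syllable_to_phones(syllable))
    return f"{word} {' '.join(phones)}"


# Hypothetical hand-transcribed words: (word, pinyin syllables without tones).
entry_hello = dictionary_entry("你好", ["ni", "hao"])
entry_chinese = dictionary_entry("中文", ["zhong", "wen"])
print(entry_hello)
print(entry_chinese)
```

A few thousand entries produced this way and checked by hand would be the seed dictionary that the G2P tool is then trained on.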

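The official Chinese dictionary

The section on bootstrapping a Chinese dictionary is my reading and translation of the official documentation; I have not worked through it in full. Can't the dictionary shipped with the official Chinese language model simply be used as is? I also could not find the phoneme list on Wikipedia, and the official documentation gives no link to one.

Either way, I just use the dictionary from cmusphinx-zh-cn-5.2.tar.gz directly.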

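References

Building a phonetic dictionary (official documentation): https://cmusphinx.github.io/wiki/tutorialdict/#using-existing-dictionaries

G2P tool: https://github.com/cmusphinx/g2p-seq2seq (usage instructions for G2P are here; its command lines differ from those in the official documentation, and the commands in this article are the ones I verified myself)

Getting started with CMU Sphinx speech recognition: building a pinyin dictionary: https://www.jianshu.com/p/796de6301918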

Article title: iOS-PocketSphinx: Building a phonetic dictionary

Article link: https://www.haomeiwen.com/subject/dblbqltx.html
