fstaddselfloops:
Adds self-loops to states of an FST to propagate disambiguation symbols through it
They are added on each final state and each state with non-epsilon output symbols
on at least one arc out of the state. Useful in conjunction with predeterminize
Usage: fstaddselfloops in-disambig-list out-disambig-list [in.fst [out.fst] ]
E.g: fstaddselfloops in.list out.list < in.fst > withloops.fst
in.list and out.list are lists of integers, one per line, of the
same length.
For example: fstaddselfloops 'echo 276 |' 'echo 306221 |'
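The rule in the usage text (a self-loop on every final state and on every state with at least one non-epsilon output arc) can be sketched on an FST in OpenFst text format. This is only an illustration under stated assumptions: add_selfloops_text is a hypothetical helper, not part of Kaldi; it takes a single disambiguation pair instead of two lists; and it assumes arc lines of the form "src dst ilabel olabel [weight]", final-state lines of the form "state [weight]", and label 0 for epsilon.

```shell
# Hypothetical helper (not a Kaldi tool): reads a text-format FST on stdin,
# copies it through, then appends one self-loop "s s ISYM OSYM" for every
# state that is final or has an outgoing arc with a non-epsilon olabel.
add_selfloops_text() {
  awk -v isym="$1" -v osym="$2" '
    # Remember each qualifying state once, in order of first appearance.
    function mark(s) { if (!(s in need)) { need[s] = 1; order[++n] = s } }
    { print }                              # pass the original line through
    NF >= 4 && $4 != 0 { mark($1) }        # arc with non-epsilon output label
    NF == 1 || NF == 2 { mark($1) }        # final-state line
    END { for (i = 1; i <= n; i++) print order[i], order[i], isym, osym }'
}
```

For instance, piping the three lines "0 1 5 0", "1 2 0 7" and "2" through add_selfloops_text 276 306221 appends self-loops on states 1 and 2 but not on state 0, whose only arc has an epsilon output.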
prepare_lang.sh:
utils/make_lexicon_fst.pl --pron-probs $tmpdir/lexiconp_disambig.txt $sil_prob $silphone '#'$ndisambig | \
fstcompile --isymbols=$dir/phones.txt --osymbols=$dir/words.txt \
--keep_isymbols=false --keep_osymbols=false | \
fstaddselfloops "echo $phone_disambig_symbol |" "echo $word_disambig_symbol |" | \
fstarcsort --sort_type=olabel > $dir/L_disambig.fst || exit 1;
l.fst l_disambig.fst
fstarcsort
fstarcsort --sort_type=ilabel|olabel [in.fst [out.fst]]
fstarcsort --sort_type=olabel > $dir/L_disambig.fst || exit 1;
fstarcsort --sort_type=olabel G.fst |
fstrmepsilon
Removes epsilon transitions (arcs whose input and output are both epsilon).
fstrmepsilon | fstarcsort --sort_type=ilabel > ./data/langzh/G.fst
First, the script:
if [[ ! -s $lang/tmp/LG.fst || $lang/tmp/LG.fst -ot $lang/G.fst || \
$lang/tmp/LG.fst -ot $lang/L_disambig.fst ]]; then
fsttablecompose $lang/L_disambig.fst $lang/G.fst | fstdeterminizestar --use-log=true | \
fstminimizeencoded | fstpushspecial > $lang/tmp/LG.fst.$$ || exit 1;
mv $lang/tmp/LG.fst.$$ $lang/tmp/LG.fst
fstisstochastic $lang/tmp/LG.fst || echo "[info]: LG not stochastic."
fi
fsttablecompose
Usage: fsttablecompose (fst1-rxfilename|fst1-rspecifier) (fst2-rxfilename|fst2-rspecifier) [(out-rxfilename|out-rspecifier)]
fsttablecompose $lang/L_disambig.fst $lang/G.fst $lang/LG.fst
fstdeterminizestar
Ensures that, from each state, there is only one path for any given input symbol.
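The property described above can be checked on a text-format FST with a few lines of awk. This is a rough sketch, not what fstdeterminizestar does internally: is_input_deterministic_text is a hypothetical helper, it assumes arc lines of the form "src dst ilabel olabel [weight]", and it only looks for two arcs that leave the same state with the same input label (it does not look at epsilons).

```shell
# Hypothetical helper: exit status 0 if no state has two outgoing arcs
# with the same input label, 1 otherwise. Reads a text-format FST on stdin.
is_input_deterministic_text() {
  awk '
    NF >= 4 {                       # arc line: src dst ilabel olabel [weight]
      key = $1 SUBSEP $3            # (source state, input label)
      if (key in seen) bad = 1      # second arc with the same pair
      seen[key] = 1
    }
    END { exit (bad ? 1 : 0) }'
}
```

Running it on the text form of an FST before and after determinization (for example, output of fstprint) is one way to see the effect of the step.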
fstminimizeencoded
Minimizes FST after encoding [similar to fstminimize, but no weight-pushing]
Merges paths (combines equivalent states).
fstpushspecial
Concentrates the weights toward the front of the graph, which makes search easier.
fstisstochastic
Checks whether an FST is stochastic and exits with success if so.
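This is not a check for <s> or </s>: it tests whether the weights leaving each state (arcs plus the final weight) sum to one.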
fstisstochastic ./data/langzh/G.fst
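What "stochastic" means here can be sketched on a text-format FST: treating each weight w as a negative log probability, the probabilities exp(-w) of all arcs leaving a state, plus that state's final probability, should add up to one. The helper below is hypothetical and simplified. It assumes arc lines "src dst ilabel olabel [weight]" and final lines "state [weight]", treats a missing weight as 0, compares in the probability domain with a tolerance passed as an optional argument (0.01 is just the default chosen for this sketch), and ignores states that have neither arcs nor a final weight.

```shell
# Hypothetical helper: exit status 0 if every state's outgoing probability
# mass (arcs + final weight) is within DELTA of 1, otherwise 1.
is_stochastic_text() {
  awk -v delta="${1:-0.01}" '
    NF >= 4 { w = (NF >= 5) ? $5 : 0; total[$1] += exp(-w); next }  # arc
    NF >= 1 { w = (NF >= 2) ? $2 : 0; total[$1] += exp(-w) }        # final
    END {
      for (s in total) {
        d = total[s] - 1
        if (d < 0) d = -d
        if (d > delta) bad = 1
      }
      exit (bad ? 1 : 0)
    }'
}
```

A state with two arcs of weight 0.693147 (probability 0.5 each) passes; the same state with only one of those arcs fails, which is the kind of situation the "[info]: LG not stochastic." message in the script reports.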
fstproject
fstproject --project_output=true ./data/langzh/G.fst > ./data/langzh/G_prj.fst
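With --project_output=true the result is an acceptor over the output labels, so every arc's input label becomes a copy of its output label. A minimal sketch on the text format, assuming arc lines "src dst ilabel olabel [weight]" (project_output_text is a hypothetical name, not an OpenFst command):

```shell
# Hypothetical helper: copy the output label over the input label on every
# arc line; final-state lines are passed through unchanged.
project_output_text() {
  awk 'NF >= 4 { $3 = $4 } { print }'
}
```

For the G.fst example above this would turn an arc "0 1 5 9 0.5" into "0 1 9 9 0.5".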
fstreverse
Swaps the start and the end (reverses every path).
fstinvert
Swaps the input and output labels.
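On the text format, inversion amounts to swapping the third and fourth columns of each arc line. The sketch below assumes arc lines "src dst ilabel olabel [weight]"; invert_text is a hypothetical helper used only to show the effect.

```shell
# Hypothetical helper: swap ilabel and olabel on every arc line of a
# text-format FST; final-state lines are passed through unchanged.
invert_text() {
  awk 'NF >= 4 { t = $3; $3 = $4; $4 = t } { print }'
}
```

Applying it twice gives back the original labels, as expected of an inversion.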
fstreplace
Usage: fstreplace.exe root.fst rootlabel [rule1.fst label1 ...] [out.fst]
PROGRAM FLAGS:
--epsilon_on_replace: type = bool, default = false
Create an epsilon arc when recursing
--modules: type = bool, default = false
Use modules mode
fstreplace --epsilon_on_replace data\lang\tmp\main.fst.org -1 data\lang\tmp\test.fst.org 337 data\lang\tmp\pinyin.fst.org 259 data\lang\tmp\lm.fst
Final notes
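HCLG = asl(min(rds(det(H' o min(det(C o min(det(L o G))))))))
where o is composition, asl = add-self-loops, rds = remove-disambiguation-symbols, and H' is H without the self-loops.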
fstarcsort --sort_type=ilabel data\lang\tmp\CLG_3_1.fst data\lang\tmp\CLG_3_1.fst
make-h-transducer --disambig-syms-out=data\lang\graph/disambig_tid.int --transition-scale=1.000000 --print-args=false data\lang\tmp/ilabels_3_1 data\model/tree data\model/final.mdl data\lang\tmp/output
fsttablecompose --print-args=false data\lang\graph\Ha.fst data\lang\tmp\CLG_3_1.fst data\lang\tmp/output
fstdeterminizestar --print-args=false --use-log=true data\lang\tmp/input data\lang\tmp/output
fstrmsymbols --print-args=false data\lang\graph\disambig_tid.int data\lang\tmp/input data\lang\tmp/output
fstrmepslocal --print-args=false data\lang\tmp/input data\lang\tmp/output
fstminimizeencoded --print-args=false data\lang\tmp/input data\lang\tmp/output
fstisstochastic --print-args=false data\lang\graph\HCLGa.fst
add-self-loops --print-args=false --self-loop-scale=0.100000 --reorder=true data\model\final.mdl data\lang\graph\HCLGa.fst data\lang\graph\HCLG.fst