TIMIT Example Training

Author: 伊恩的道歉 | Published 2018-06-14 20:58

TIMIT S5 example:

First, copy the TIMIT directory from TIMIT.ISO into your home folder.

1. Change into the recipe directory and run the data preparation script:

zhangju@ubuntu:~$ cd kaldi-trunk/egs/timit/s5/

zhangju@ubuntu:~/kaldi-trunk/egs/timit/s5$

    sudo local/timit_data_prep.sh /home/zhangju/TIMIT

You should see output like this:

    Creating coretest set.

    MDAB0  MWBT0  FELC0  MTAS1  MWEW0  FPAS0  MJMP0  MLNT0  FPKT0  MLLL0  MTLS0  FJLM0  MBPM0  MKLT0  FNLP0  MCMJ0  MJDH0  FMGD0  MGRT0  MNJM0  FDHC0  MJLN0  MPAM0  FMLD0

    # of utterances in coretest set = 192

    Creating dev set.

    FAKS0  FDAC1  FJEM0  MGWT0  MJAR0  MMDB1  MMDM2  MPDF0  FCMH0  FKMS0  MBDG0  MBWM0  MCSH0  FADG0  FDMS0  FEDW0  MGJF0  MGLB0  MRTK0  MTAA0  MTDT0  MTHC0  MWJG0  FNMR0  FREW0  FSEM0  MBNS0  MMJR0  MDLS0  MDLF0  MDVC0  MERS0  FMAH0  FDRW0  MRCS0  MRJM4  FCAL1  MMWH0  FJSJ0  MAJC0  MJSW0  MREB0  FGJD0  FJMG0  MROA0  MTEB0  MJFC0  MRJR0  FMML0  MRWS1

    # of utterances in dev set = 400

    Finalizing test

    Finalizing dev

    timit_data_prep succeeded.

This creates a new data folder under /home/zhangju/kaldi-trunk/egs/timit/s5, containing a local subfolder and related files.
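A quick sanity check at this point (generic shell commands, not part of the recipe) is to look at what was generated:

ls data/local

find data -type f | wc -l    # count the files that were just created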

2. In the terminal, run:

local/timit_train_lms.sh data/local (downloads and processes the text used to build the language model)

local/timit_format_data.sh (prepares the FST-related files)

3. Create MFCC features for train:

    sudo steps/make_mfcc.sh data/train exp/make_mfcc/train mfccs 4

(this must be done for train, dev, and test; a loop form is sketched after the dev command below)

You should see:

    Succeeded creating MFCC features for train

    sudo steps/make_mfcc.sh data/test exp/make_mfcc/test mfccs 4

You should see:

    Succeeded creating MFCC features for test

    sudo steps/make_mfcc.sh data/dev exp/make_mfcc/dev mfccs 4

You should see:

    Succeeded creating MFCC features for dev
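The three make_mfcc.sh calls above can be collapsed into one loop; the S3 walkthrough later in this post uses exactly this form (sudo added here to match the commands above):

mfccdir=mfccs

for x in train test dev ; do

  sudo steps/make_mfcc.sh data/$x exp/make_mfcc/$x $mfccdir 4

done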

4. Train the monophone system:

sudo steps/train_mono.sh data/train data/lang exp/mono

The output:

    Computing cepstral mean and variance statistics

    Initializing monophone system.

    Compiling training graphs

    Pass 0

    Pass 1

    Aligning data

    Pass 2

    Aligning data

    Pass 3

    Aligning data

    Pass 4

    Aligning data

    Pass 5

    Aligning data

    Pass 6

    Aligning data

    Pass 7

    Aligning data

    Pass 8

    Aligning data

    Pass 9

    Aligning data

    Pass 10

    Aligning data

    Pass 11

    Pass 12

    Aligning data

    Pass 13

    Pass 14

    Pass 15

    Aligning data

    Pass 16

    Pass 17

    Pass 18

    Pass 19

    Pass 20

    Aligning data

    Pass 21

    Pass 22

    Pass 23

    Pass 24

    Pass 25

    Aligning data

    Pass 26

    Pass 27

    Pass 28

    Pass 29

This creates a new exp/mono folder.

scripts/mkgraph.sh --mono data/lang exp/mono exp/mono/graph (builds the decoding graph)

The output:

    fsttablecompose data/lang/L.fst data/lang/G.fst

    fstdeterminizestar --use-log=true

    fstminimizeencoded

    fstisstochastic data/lang/tmp/LG.fst

    -0.000244359 -0.0912761

    warning: LG not stochastic.

    fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang/tmp/disambig_phones.list --write-disambig-syms=data/lang/tmp/disambig_ilabels_1_0.list data/lang/tmp/ilabels_1_0

    fstisstochastic data/lang/tmp/CLG_1_0.fst

    -0.000244359 -0.0912761

    warning: CLG not stochastic.

    make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.list --transition-scale=1.0 data/lang/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl

    fstminimizeencoded

    fstdeterminizestar --use-log=true

    fsttablecompose exp/mono/graph/Ha.fst data/lang/tmp/CLG_1_0.fst

    fstrmsymbols exp/mono/graph/disambig_tid.list

    fstrmepslocal

    fstisstochastic exp/mono/graph/HCLGa.fst

    0.000331581 -0.091291

    HCLGa is not stochastic

    add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl
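In outline, this log shows mkgraph.sh assembling the decoding graph by FST composition: L∘G is composed, determinized, and minimized (fsttablecompose, fstdeterminizestar, fstminimizeencoded); phonetic context C is added (fstcomposecontext); the HMM transducer Ha is built (make-h-transducer) and composed on; disambiguation symbols are removed (fstrmsymbols, fstrmepslocal); and self-loops are added last. Schematically,

HCLG = asl( min( det( Ha ∘ min( det( C ∘ min( det( L ∘ G ) ) ) ) ) ) )

where ∘ denotes composition, det/min determinization and minimization, and asl the add-self-loops step.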

5. Decode the test sets (the dev and test folders under */s5/data):

for test in dev test ; do

  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test &

done

The terminal prints the background job IDs:

[1] 2307

[2] 2308
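Because the two decode jobs were launched in the background with &, it is safest to let them finish before scoring. A simple way to do this (not part of the original recipe) is the shell built-in wait:

wait    # blocks until all background jobs have exited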

6. Average the WER over the decode directories:

scripts/average_wer.sh exp/mono/decode_*/wer > exp/mono/wer

You should see:

[1]-  Done                  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

[2]+  Done                  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test
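For reference, the per-set WER figures being averaged here are the standard word error rate,

\mathrm{WER} = \frac{S + D + I}{N}

where S, D, and I count the substitutions, deletions, and insertions in the best alignment against the reference transcript, and N is the number of reference words.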

7. Obtain alignments from the monophone system (for train, dev, and test separately); the alignments are used to train other systems:

    steps/align_deltas.sh data/train data/lang exp/mono exp/mono_ali_train

The output:

    Computing cepstral mean and variance statistics

    Aligning all training data

    Done.
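Since step 7 calls for alignments of train, dev, and test, a loop form of the same command is (a sketch, using the exp/mono_ali_* naming that the S3 walkthrough below also uses):

for x in train dev test ; do

  steps/align_deltas.sh data/$x data/lang exp/mono exp/mono_ali_$x

done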

Method 2: edit the TIMIT path in run.sh, then simply run run.sh directly.

TIMIT S3 example

1. Data preparation. Run:

    local/timit_data_prep.sh  /home/zhangju/TIMIT

Terminal output:

Creating coretest set.

MDAB0  MWBT0  FELC0  MTAS1  MWEW0  FPAS0  MJMP0  MLNT0  FPKT0  MLLL0  MTLS0  FJLM0  MBPM0  MKLT0  FNLP0  MCMJ0  MJDH0  FMGD0  MGRT0  MNJM0  FDHC0  MJLN0  MPAM0  FMLD0  (these are speaker IDs; the leading M or F marks a male or female speaker)

# of utterances in coretest set = 192

    Creating dev set.

    FAKS0  FDAC1  FJEM0  MGWT0  MJAR0  MMDB1  MMDM2  MPDF0  FCMH0  FKMS0  MBDG0  MBWM0  MCSH0  FADG0  FDMS0  FEDW0  MGJF0  MGLB0  MRTK0  MTAA0  MTDT0  MTHC0  MWJG0  FNMR0  FREW0  FSEM0  MBNS0  MMJR0  MDLS0  MDLF0  MDVC0  MERS0  FMAH0  FDRW0  MRCS0  MRJM4  FCAL1  MMWH0  FJSJ0  MAJC0  MJSW0  MREB0  FGJD0  FJMG0  MROA0  MTEB0  MJFC0  MRJR0  FMML0  MRWS1

# of utterances in dev set = 400

Finalizing test

Finalizing dev

    timit_data_prep succeeded.

Run:

local/timit_train_lms.sh data/local

The terminal shows:

    Not installing the kaldi_lm toolkit since it is already there.

(The kaldi_lm toolkit contains:

compute_perplexity — computes the perplexity (used to evaluate a language model; lower is better; see the formula after this list)

discount_ngrams — smooths the n-gram model (reserves probability mass for word combinations that can occur in practice but are unseen in the n-gram counts)

get_raw_ngrams — extracts raw n-gram counts

get_word_map.pl — builds the word mapping table

interpolate_ngrams — interpolates (adjusts) the n-gram model

finalize_arpa.pl — finalizes the ARPA output (ARPA is a standard LM file format); called from interpolate_ngrams

map_words_in_arpa.pl — maps the words in an ARPA-format file

merge_ngrams — merges n-gram models

merge_ngrams_online — merges n-gram models online

optimize_alpha.pl — optimizes the alpha parameter

prune_lm.sh — prunes low-frequency entries from the LM

prune_ngrams — prunes low-frequency n-grams

scale_configs.pl

train_lm.sh — trains the language model

uniq_to_ngrams)
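The perplexity that compute_perplexity reports (and that appears throughout the log below) is the standard quantity

\mathrm{PPL} = \exp\Big(-\frac{1}{N}\sum_{i=1}^{N}\ln P(w_i \mid h_i)\Big)

over the N words of the evaluation text, where h_i is the n-gram history of word w_i; a lower value means the model predicts the text better.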

Creating phones file, and monophone lexicon (mapping phones to itself).

Creating biphone model

Training biphone language model in folder data/local/lm

Creating directory data/local/lm/biphone

Getting raw N-gram counts

    Iteration 1/7 of optimizing discounting parameters

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.900000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.900000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=0.800000, tau=1.100000 phi=2.000000

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.675000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.675000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=0.800000, tau=0.825000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=1.215000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=1.215000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=0.800000, tau=1.485000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    Perplexity over 11412.000000 words is 17.013357

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.460842

    real  0m0.021s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.016472

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.464985

    real  0m0.020s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.021475

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.471402

    real  0m0.025s

    user  0m0.012s

    sys 0m0.000s

    optimize_alpha.pl: alpha=-2.1628504673 is too negative, limiting it to -0.5

    Projected perplexity change from setting alpha=-0.5 is 17.016472->17.0106241428571, reduction of 0.00584785714286085

    Alpha value on iter 1 is -0.5

    Iteration 2/7 of optimizing discounting parameters

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=0.600000, tau=0.550000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=0.800000, tau=0.550000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.080000, tau=0.550000 phi=2.000000

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.018s

    user  0m0.004s

    sys 0m0.008s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.022s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.019s

    user  0m0.008s

    sys 0m0.004s

    optimize_alpha.pl: objective function is not convex; returning alpha=0.7

    Projected perplexity change from setting alpha=0.7 is 17.011355->17.011355, reduction of 0

    Alpha value on iter 2 is 0.7

    Iteration 3/7 of optimizing discounting parameters

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.412500 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.550000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.742500 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.020s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.019s

    user  0m0.008s

    sys 0m0.004s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.021s

    user  0m0.012s

    sys 0m0.000s

    optimize_alpha.pl: objective function is not convex; returning alpha=0.7

    Projected perplexity change from setting alpha=0.7 is 17.011355->17.011355, reduction of 0

    Alpha value on iter 3 is 0.7

    Iteration 4/7 of optimizing discounting parameters

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=1.750000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.000000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.350000

    interpolate_ngrams: 60 words in wordslist

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.018s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.018s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.023s

    user  0m0.012s

    sys 0m0.000s

    optimize_alpha.pl: objective function is not convex; returning alpha=0.7

    Projected perplexity change from setting alpha=0.7 is 17.011355->17.011355, reduction of 0

    Alpha value on iter 4 is 0.7

    Iteration 5/7 of optimizing discounting parameters

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.450000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    interpolate_ngrams: 60 words in wordslist

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.810000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    Perplexity over 11412.000000 words is 17.008195

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.454326

    real  0m0.019s

    user  0m0.008s

    sys 0m0.004s

    Perplexity over 11412.000000 words is 17.011355

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880

    real  0m0.019s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.018212

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.465417

    real  0m0.021s

    user  0m0.012s

    sys 0m0.000s

    optimize_alpha.pl: alpha=-0.670499383475985 is too negative, limiting it to -0.5

    Projected perplexity change from setting alpha=-0.5 is 17.011355->17.0064832142857, reduction of 0.00487178571427904

    Alpha value on iter 5 is -0.5

    Iteration 6/7 of optimizing discounting parameters

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.337500 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.607500 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    Perplexity over 11412.000000 words is 17.008198

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.454134

    real  0m0.019s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.006972

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452861

    real  0m0.020s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.006526

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452349

    real  0m0.022s

    user  0m0.012s

    sys 0m0.000s

    Projected perplexity change from setting alpha=0.280321158690507 is 17.006972->17.0064966287094, reduction of 0.000475371290633575

    Alpha value on iter 6 is 0.280321158690507

    Iteration 7/7 of optimizing discounting parameters

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=1.750000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    interpolate_ngrams: 60 words in wordslist

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=2.350000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=2.000000

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    interpolate_ngrams: 60 words in wordslist

    interpolate_ngrams: 60 words in wordslist

    Perplexity over 11412.000000 words is 17.006845

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452750

    real  0m0.019s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.006575

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452414

    real  0m0.021s

    user  0m0.012s

    sys 0m0.000s

    Perplexity over 11412.000000 words is 17.006336

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452127

    real  0m0.022s

    user  0m0.012s

    sys 0m0.000s

    Projected perplexity change from setting alpha=0.690827338145686 is 17.006575->17.0062591109755, reduction of 0.000315889024498972

    Alpha value on iter 7 is 0.690827338145686

    Final config is:

    D=0.4 tau=0.45 phi=2.0

    D=0.3 tau=0.576144521410728 phi=2.69082733814569

    D=1.36 tau=0.935 phi=2.7

    Discounting N-grams.

    discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000

    discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=2.690827

    discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000

    Computing final perplexity

    Building ARPA LM (perplexity computation is in background)

    interpolate_ngrams: 60 words in wordslist

    interpolate_ngrams: 60 words in wordslist

    Perplexity over 11412.000000 words is 17.006029

    Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.451754

    17.006029

Run:

local/timit_format_data.sh

Terminal output:

    Creating L.fst

    Done creating L.fst

    Creating L_disambig.fst

    Done creating L_disambig.fst

    Creating G.fst

    arpa2fst -

    \data\

    Processing 1-grams

    Processing 2-grams

    Connected 0 states without outgoing arcs.

    remove_oovs.pl: removed 0 lines.

    G.fst created. How stochastic is it ?

    fstisstochastic data/lang_test/G.fst

    0 -0.0900995

    fsttablecompose data/lang_test/L_disambig.fst data/lang_test/G.fst

    How stochastic is LG.fst.

    fstisstochastic data/lang_test/G.fst

    0 -0.0900995

    fstisstochastic

    fsttablecompose data/lang/L.fst data/lang_test/G.fst

    0 -0.0900994

    How stochastic is LG_disambig.fst.

    fsttablecompose data/lang_test/L_disambig.fst data/lang_test/G.fst

    fstisstochastic

    0 -0.0900994

    First few lines of lexicon FST:

0  1  <eps>  <eps>  0.356674939

0  1  sil  <eps>  1.20397282

    1  2  aa  AA  1.20397282

    1  1  aa  AA  0.356674939

    1  1  ae  AE  0.356674939

    1  2  ae  AE  1.20397282

    1  1  ah  AH  0.356674939

    1  2  ah  AH  1.20397282

    1  1  ao  AO  0.356674939

    1  2  ao  AO  1.20397282

    timit_format_data succeeded.
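A note on the lexicon FST weights printed above: they are negative natural-log probabilities. For example, 0.356674939 ≈ -ln 0.7 and 1.20397282 ≈ -ln 0.3, which is consistent with each state giving probability 0.7 to continuing without silence and 0.3 to taking the optional-silence arc.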

Run:

mfccdir=mfccs

    for test in train test dev ; do

    >  steps/make_mfcc.sh data/$test exp/make_mfcc/$test $mfccdir 4

    > done

Terminal output:

    Succeeded creating MFCC features for train

    Succeeded creating MFCC features for test

    Succeeded creating MFCC features for dev

2. Train the monophone system. In the terminal, run:

steps/train_mono.sh data/train data/lang exp/mono

Terminal output:

    Computing cepstral mean and variance statistics

    Initializing monophone system.

    Compiling training graphs

    Pass 0

    Pass 1

    Aligning data

    Pass 2

    Aligning data

    Pass 3

    Aligning data

    Pass 4

    Aligning data

    Pass 5

    Aligning data

    Pass 6

    Aligning data

    Pass 7

    Aligning data

    Pass 8

    Aligning data

    Pass 9

    Aligning data

    Pass 10

    Aligning data

    Pass 11

    Pass 12

    Aligning data

    Pass 13

    Pass 14

    Pass 15

    Aligning data

    Pass 16

    Pass 17

    Pass 18

    Pass 19

    Pass 20

    Aligning data

    Pass 21

    Pass 22

    Pass 23

    Pass 24

    Pass 25

    Aligning data

    Pass 26

    Pass 27

    Pass 28

    Pass 29

scripts/mkgraph.sh --mono data/lang_test exp/mono exp/mono/graph (builds the decoding graph)

Terminal output:

    fsttablecompose data/lang_test/L_disambig.fst data/lang_test/G.fst

    fstminimizeencoded

    fstdeterminizestar --use-log=true

    fstisstochastic data/lang_test/tmp/LG.fst

    0 -0.0901494

    warning: LG not stochastic.

    fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test/tmp/disambig_phones.list --write-disambig-syms=data/lang_test/tmp/disambig_ilabels_1_0.list data/lang_test/tmp/ilabels_1_0

    fstisstochastic data/lang_test/tmp/CLG_1_0.fst

    0 -0.0901494

    warning: CLG not stochastic.

    make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.list --transition-scale=1.0 data/lang_test/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl

    fsttablecompose exp/mono/graph/Ha.fst data/lang_test/tmp/CLG_1_0.fst

    fstdeterminizestar --use-log=true

    fstminimizeencoded

    fstrmsymbols exp/mono/graph/disambig_tid.list

    fstrmepslocal

    fstisstochastic exp/mono/graph/HCLGa.fst

    0 -0.0901494

    HCLGa is not stochastic

    add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl

3. Decode the test sets. Run:

    for test in dev test ; do

      steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test &

    done

Terminal output:

    [1] 16368

    [2] 16369

3.1 Compute the result. Run:

    scripts/average_wer.sh exp/mono/decode_*/wer > exp/mono/wer

Terminal output:

[1]-  Done                  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

[2]+  Done                  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

4. Obtain alignments from the monophone system

The alignments are created in order to train other systems, such as ANN-HMM hybrids.

Run:

    steps/align_deltas.sh data/train data/lang exp/mono exp/mono_ali_train

Terminal output:

    Computing cepstral mean and variance statistics

    Aligning all training data

    Done.

Similarly, for the dev set:

steps/align_deltas.sh data/dev data/lang exp/mono exp/mono_ali_dev

Method 2: after editing the corresponding TIMIT path, simply run run.sh.

TIMIT S4 example

This recipe builds a phone recognizer.

WORKDIR=/home/zhangju/ss4 (pick any path with enough free space as your WORKDIR)

    mkdir -p $WORKDIR

    cp -r conf local utils steps path.sh $WORKDIR

    cd $WORKDIR

. path.sh (in this file, change the KALDIROOT environment variable to point at your own Kaldi installation, e.g. KALDIROOT=/home/mayuan/kaldi-trunk; I made the edit with nano. See the one-liner below.)
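One way to make that edit non-interactively (a sketch; substitute the path of your own Kaldi checkout):

sed -i 's|^KALDIROOT=.*|KALDIROOT=/home/zhangju/kaldi-trunk|' path.sh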

    local/timit_data_prep.sh --config-dir=$PWD/conf --corpus-dir=/home/zhangju/TIMIT --work-dir=$WORKDIR
