A Brief Introduction to hmmlearn


Author: esctrionsit | Published: 2020-05-10 16:18

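    The hidden Markov model (HMM) was first described in the second half of the 1960s, in a series of statistical papers by Leonard E. Baum and other authors. Its earliest applications were in speech recognition.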

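    In the second half of the 1980s, HMMs began to be applied to the analysis of biological sequences, DNA sequences in particular. They have since become an indispensable technique in bioinformatics.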

    This article draws on:
    [1] Learning Hidden Markov Models (HMM) with hmmlearn
    [2] The official documentation

    hmmlearn

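    hmmlearn was once part of the scikit-learn project and is now a standalone Python package that can be installed directly with pip. It implements unsupervised hidden Markov models. The official documentation is at https://hmmlearn.readthedocs.io/en/stable/. A supervised counterpart is seqlearn.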

    pip3 install hmmlearn
    

    hmmlearn provides three models:

    Name                 Description                                                 Observations
    hmm.GaussianHMM      Hidden Markov Model with Gaussian emissions.                Continuous
    hmm.GMMHMM           Hidden Markov Model with Gaussian mixture emissions.        Continuous
    hmm.MultinomialHMM   Hidden Markov Model with multinomial (discrete) emissions.  Discrete

    MultinomialHMM

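    The class signature is shown below. Note that more recent hmmlearn releases provide this symbol-sequence behaviour as hmm.CategoricalHMM and give MultinomialHMM different semantics; the examples here follow the API as it was when this article was written.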

    class hmmlearn.hmm.MultinomialHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0,
    algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste')
    

    The most commonly used parameters (this list may be extended later) are:

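    • n_components: (int) Number of hidden states.
    • n_iter: (int, optional) Maximum number of training (EM) iterations.
    • tol: (float, optional) Convergence threshold. EM stops if the gain in log-likelihood is below this value.
    • verbose: (bool, optional) When True, a convergence report is printed to standard error for every iteration: the log probability (score) and its change from the previous iteration.
    • init_params: (string, optional) Controls which parameters are initialized before training: 's' for startprob, 't' for transmat, 'e' for emissionprob. An empty string "" means none are initialized automatically, so training starts entirely from the values supplied by the user.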
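
    To make the interplay of n_iter and tol concrete, here is a simplified sketch of the stopping rule described above. It is not hmmlearn's own code: the function name and the list of per-iteration log-likelihoods are hypothetical and serve only as an illustration.

```python
def em_stopping_iteration(log_likelihoods, n_iter=10, tol=0.01):
    """Return the 1-based iteration at which training would stop.

    log_likelihoods is a hypothetical list holding the log-likelihood
    measured at each EM iteration.
    """
    previous = None
    for iteration, current in enumerate(log_likelihoods, start=1):
        # Stop once the gain over the previous iteration is below tol.
        if previous is not None and current - previous < tol:
            return iteration
        # Stop when the iteration budget is used up.
        if iteration >= n_iter:
            return iteration
        previous = current
    return len(log_likelihoods)


trace = [-30.0, -25.0, -24.2, -24.195, -24.194]
print(em_stopping_iteration(trace, n_iter=10, tol=0.01))  # 4: the gain there is only 0.005
print(em_stopping_iteration(trace, n_iter=3, tol=0.01))   # 3: the iteration budget runs out first
```

    A smaller tol lets training run longer before it is considered converged, while n_iter caps the cost regardless of convergence.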

    Definition and usage:

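    import numpy as np
    from hmmlearn import hmm
    
    # Hidden states: the box a ball is drawn from
    states = ["box 1", "box 2", "box3"]
    n_states = len(states)
    
    # Observable symbols: the colour of the drawn ball
    observations = ["red", "white"]
    n_observations = len(observations)
    
    # Initial distribution over the hidden states
    start_probability = np.array([0.2, 0.4, 0.4])
    
    # transition_probability[i, j] = P(next state j | current state i)
    transition_probability = np.array([
      [0.5, 0.2, 0.3],
      [0.3, 0.5, 0.2],
      [0.2, 0.3, 0.5]
    ])
    
    # emission_probability[i, k] = P(observation k | state i)
    emission_probability = np.array([
      [0.5, 0.5],
      [0.4, 0.6],
      [0.7, 0.3]
    ])
    
    # Set the parameters directly instead of learning them with fit()
    model = hmm.MultinomialHMM(n_components=n_states, n_iter=20, tol=0.001)
    model.startprob_ = start_probability
    model.transmat_ = transition_probability
    model.emissionprob_ = emission_probability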
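
    When parameters are set by hand like this, it helps to check that every distribution sums to 1 and to see what kind of data the model generates. The following is a minimal NumPy-only sketch; the helper names is_stochastic and sample_sequence are made up for illustration and are not part of hmmlearn.

```python
import numpy as np

start_probability = np.array([0.2, 0.4, 0.4])
transition_probability = np.array([
    [0.5, 0.2, 0.3],
    [0.3, 0.5, 0.2],
    [0.2, 0.3, 0.5],
])
emission_probability = np.array([
    [0.5, 0.5],
    [0.4, 0.6],
    [0.7, 0.3],
])


def is_stochastic(p):
    """True if p is non-negative and sums to 1 along its last axis."""
    p = np.asarray(p, dtype=float)
    return bool(np.all(p >= 0) and np.allclose(p.sum(axis=-1), 1.0))


def sample_sequence(start, trans, emit, length, seed=0):
    """Draw hidden states and observations by following the HMM step by step."""
    rng = np.random.default_rng(seed)
    states, symbols = [], []
    state = rng.choice(len(start), p=start)
    for _ in range(length):
        states.append(int(state))
        # Emit a symbol from the current state's emission row.
        symbols.append(int(rng.choice(emit.shape[1], p=emit[state])))
        # Move to the next hidden state.
        state = rng.choice(trans.shape[0], p=trans[state])
    return states, symbols


assert is_stochastic(start_probability)
assert is_stochastic(transition_probability)
assert is_stochastic(emission_probability)

hidden, observed = sample_sequence(
    start_probability, transition_probability, emission_probability, length=5
)
print(hidden, observed)
```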
    

    Predicting hidden states with the Viterbi algorithm

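    decode returns the log probability of the most likely state sequence together with the sequence itself. The value is reportedly the natural logarithm, ln(prob); the documentation only says "the log probability".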

    seen = np.array([[0, 1, 0]]).T  # one observation per row: red, white, red
    logprob, box = model.decode(seen, algorithm="viterbi")
    print("The ball picked:", ", ".join(observations[i] for i in seen.ravel()))
    print("The hidden box:", ", ".join(states[i] for i in box))
    

    Output:

    The ball picked: red, white, red
    The hidden box: box3, box3, box3
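
    To show what Viterbi decoding computes, here is a NumPy-only sketch that works in log space with the same parameters as above. It is an independent illustration rather than hmmlearn's implementation, and the function name viterbi is chosen here for the example.

```python
import numpy as np

start = np.array([0.2, 0.4, 0.4])
trans = np.array([[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
emit = np.array([[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]])


def viterbi(obs, start, trans, emit):
    """Return (log probability, most likely state path) for obs."""
    n_steps, n_states = len(obs), len(start)
    log_delta = np.empty((n_steps, n_states))
    backpointer = np.zeros((n_steps, n_states), dtype=int)

    log_delta[0] = np.log(start) + np.log(emit[:, obs[0]])
    for t in range(1, n_steps):
        # scores[i, j]: best log probability of reaching state j via state i
        scores = log_delta[t - 1][:, None] + np.log(trans)
        backpointer[t] = scores.argmax(axis=0)
        log_delta[t] = scores.max(axis=0) + np.log(emit[:, obs[t]])

    # Trace the best path backwards from the best final state.
    path = [int(log_delta[-1].argmax())]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(backpointer[t, path[-1]]))
    path.reverse()
    return float(log_delta[-1].max()), path


log_prob, path = viterbi([0, 1, 0], start, trans, emit)
print(path)              # [2, 2, 2], i.e. box3 three times
print(np.exp(log_prob))  # about 0.0147
```

    The path agrees with the decode output above: the third box is the most likely source of all three draws.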
    

    Computing the probability of an observation sequence

    print(model.score(seen))
    

    Output:

    -2.03854530992
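
    This value can be reproduced by hand with the forward algorithm, which sums over all hidden state paths instead of keeping only the best one. The sketch below uses NumPy only and rescales at every step to avoid underflow on long sequences; the function name forward_log_likelihood is made up for this example.

```python
import numpy as np

start = np.array([0.2, 0.4, 0.4])
trans = np.array([[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
emit = np.array([[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]])


def forward_log_likelihood(obs, start, trans, emit):
    """Natural-log likelihood of obs via the scaled forward algorithm."""
    alpha = start * emit[:, obs[0]]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by the emission.
            alpha = (alpha @ trans) * emit[:, obs[t]]
        scale = alpha.sum()
        log_likelihood += np.log(scale)
        alpha = alpha / scale
    return float(log_likelihood)


print(forward_log_likelihood([0, 1, 0], start, trans, emit))  # about -2.0385
```

    The result matches the score above and corresponds to a probability of about 0.13, which is consistent with the reported value being a natural logarithm.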
    

    Training and data preparation

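    import numpy as np
    from hmmlearn import hmm
    
    states = ["box 1", "box 2", "box3"]
    n_states = len(states)
    
    observations = ["red", "white"]
    n_observations = len(observations)
    model = hmm.MultinomialHMM(n_components=n_states, n_iter=20, tol=0.01)
    
    # Three independent observation sequences, one symbol per row
    D1 = [[1], [0], [0], [0], [1], [1], [1]]
    D2 = [[1], [0], [0], [0], [1], [1], [1], [0], [1], [1]]
    D3 = [[1], [0], [0]]
    
    # fit() takes all sequences stacked into one array, plus the length of each one;
    # without lengths the data would be treated as a single long sequence
    X = np.concatenate([D1, D2, D3])
    lengths = [len(D1), len(D2), len(D3)]
    
    model.fit(X, lengths=lengths)
    print(model.startprob_)
    print(model.transmat_)
    print(model.emissionprob_)
    print(model.score(X, lengths=lengths))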
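
    The data layout used for training, one stacked column array plus a list of sequence lengths, can be built and undone with a pair of small helpers. This is a NumPy-only sketch; pack_sequences and unpack_sequences are hypothetical names and not part of hmmlearn.

```python
import numpy as np


def pack_sequences(sequences):
    """Stack 1-D symbol sequences into one column array plus their lengths."""
    lengths = [len(seq) for seq in sequences]
    stacked = np.concatenate([np.asarray(seq, dtype=int) for seq in sequences])
    return stacked.reshape(-1, 1), lengths


def unpack_sequences(X, lengths):
    """Split a stacked column array back into the original sequences."""
    boundaries = np.cumsum(lengths)[:-1]
    return [part.ravel().tolist() for part in np.split(X, boundaries)]


D1 = [1, 0, 0, 0, 1, 1, 1]
D2 = [1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
D3 = [1, 0, 0]

X, lengths = pack_sequences([D1, D2, D3])
print(X.shape, lengths)  # (20, 1) [7, 10, 3]
assert unpack_sequences(X, lengths) == [D1, D2, D3]
```

    The resulting X and lengths are in the form that fit(X, lengths=lengths) expects.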
    
