KL Divergence

Author: 小维帮倒忙zzzzz | Published 2017-10-26 10:46

KL divergence, also known as relative entropy, measures how one probability distribution diverges from a second, reference distribution.

For discrete probability distributions P and Q:

D_{KL}(P \| Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}

For a continuous random variable with densities p and q:

D_{KL}(P \| Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx
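
As a quick worked example (the two distributions here are chosen purely for illustration and are not from the original post), let P = (0.5, 0.5) and Q = (0.9, 0.1); using natural logarithms,

D_{KL}(P \| Q) = 0.5 \ln\frac{0.5}{0.9} + 0.5 \ln\frac{0.5}{0.1} \approx 0.5(-0.588) + 0.5(1.609) \approx 0.51 \text{ nats}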

The name suggests a kind of distance, but KL divergence is not a distance in the usual sense. A distance metric is normally expected to satisfy three properties:
1. Non-negativity: a distance is a magnitude, so it is never negative.
2. Symmetry: the distance from A to B equals the distance from B to A.
3. Triangle inequality: the direct distance between two points is no greater than the sum of the distances via a third point.
KL divergence only satisfies the first property, non-negativity; it is neither symmetric nor does it satisfy the triangle inequality.
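
To see the asymmetry concretely, swap the roles of the illustrative P and Q from the example above:

D_{KL}(Q \| P) = 0.9 \ln\frac{0.9}{0.5} + 0.1 \ln\frac{0.1}{0.5} \approx 0.529 - 0.161 \approx 0.37 \text{ nats}, which differs from D_{KL}(P \| Q) \approx 0.51.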

# KL divergence (and any other such measure) expects the input distributions to sum to 1
1. With NumPy:

import numpy as np

def KL(a, b):
    a = np.array(a, dtype=float)  # np.float is deprecated; use the built-in float
    b = np.array(b, dtype=float)
    # skip terms where a == 0, since 0 * log(0/q) is taken to be 0
    return np.sum(np.where(a != 0, a * np.log(a / b), 0))

# To guard against division by zero, use np.log(a / (b + np.spacing(1)));
# np.spacing(1) is the machine epsilon (a tiny positive number), not infinity.

2. With SciPy: scipy.stats.entropy(pk, qk=None, base=None)
When qk is not None, it computes the KL divergence D(pk || qk).
It automatically normalizes pk and qk so that each sums to 1.
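
A quick usage check with the illustrative distributions from above (assumes SciPy is installed; the example values are not from the original post):

from scipy.stats import entropy

p = [0.5, 0.5]   # illustrative distributions, not from the original post
q = [0.9, 0.1]

# with two arguments, entropy(pk, qk) returns D(pk || qk) in nats by default
print(entropy(p, q))   # ~0.5108
print(entropy(q, p))   # ~0.3681  (asymmetric)

# The hand-rolled KL() above gives the same value: KL(p, q) ~ 0.5108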

Applications:
Text similarity: first count word frequencies for each text, then compute the KL divergence between the resulting distributions (a minimal sketch follows below).
User profiling.
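
A minimal sketch of the text-similarity idea, assuming whitespace tokenization and add-one smoothing over a shared vocabulary (these choices are my own illustration, not from the original post):

from collections import Counter
import numpy as np
from scipy.stats import entropy

def word_distribution(text, vocab):
    # word-frequency distribution over a fixed vocabulary, with add-one smoothing
    counts = Counter(text.lower().split())
    freqs = np.array([counts[w] + 1 for w in vocab], dtype=float)
    return freqs / freqs.sum()

doc_a = "the cat sat on the mat"
doc_b = "the dog sat on the log"

vocab = sorted(set(doc_a.lower().split()) | set(doc_b.lower().split()))
p = word_distribution(doc_a, vocab)
q = word_distribution(doc_b, vocab)

# smaller KL divergence means the word distributions are more similar
print(entropy(p, q))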

