
Bayesian Decision Theory

Author: cccshuang | Published 2019-01-25 16:06


Bayes' Theorem

P(A|B) = \frac{P(B|A)P(A)}{P(B)}
prior: P(\omega)
likelihood: P(x|\omega)
posterior: P(\omega_i|x) = \frac{P(x|\omega_i)P(\omega_i)}{P(x)} = \frac{P(x|\omega_i)P(\omega_i)}{\sum_{j=1}^k P(x|\omega_j)P(\omega_j)}
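
As a quick concrete check of the posterior formula, here is a minimal Python sketch; the priors and likelihoods are made-up illustrative numbers, not values from the text:

```python
# Posterior via Bayes' theorem:
# P(w_i|x) = P(x|w_i) P(w_i) / sum_j P(x|w_j) P(w_j)
priors = [0.7, 0.3]        # P(w_1), P(w_2) -- illustrative values
likelihoods = [0.2, 0.6]   # P(x|w_1), P(x|w_2) for one observed x

evidence = sum(l * p for l, p in zip(likelihoods, priors))            # P(x)
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]  # P(w_i|x)
print(posteriors)          # [0.4375, 0.5625] -> decide w_2
```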

Optimal Bayes Decision Rule: minimize the probability of error.
    if P(\omega_1|x) > P(\omega_2|x) then True state of nature = \omega_1;
    if P(\omega_1|x) < P(\omega_2|x) then True state of nature = \omega_2.

Proof: For a particular x,
        P(error|x) = P(\omega_1|x) if we decide \omega_2;
        P(error|x) = P(\omega_2|x) if we decide \omega_1.
Bayes Decision Rule: Decide \omega_1 if P(\omega_1|x) > P(\omega_2|x); otherwise decide \omega_2.
Therefore: P(error|x) = min[P(\omega_1|x),P(\omega_2|x)].
The unconditional error P(error) is obtained by averaging P(error|x) over all x: P(error) = \int P(error|x)p(x)dx, and no other rule can make this integral smaller, since the Bayes rule minimizes the integrand at every x.
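
A small numerical sketch of this result, assuming two classes with Gaussian class-conditional densities (all parameters are illustrative):

```python
import numpy as np

# Two classes with Gaussian class-conditionals (illustrative parameters):
# p(x|w_1) = N(0,1), p(x|w_2) = N(2,1), equal priors
priors = np.array([0.5, 0.5])
means = np.array([0.0, 2.0])

def gauss_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

xs = np.linspace(-6.0, 8.0, 2001)
px_given_w = np.vstack([gauss_pdf(xs, m) for m in means])   # likelihoods p(x|w_i)
px = (priors[:, None] * px_given_w).sum(axis=0)             # evidence p(x)
posteriors = priors[:, None] * px_given_w / px              # posteriors P(w_i|x)

# P(error|x) = min[P(w_1|x), P(w_2|x)]; integrate it against p(x)
p_error = np.sum(posteriors.min(axis=0) * px) * (xs[1] - xs[0])
print(p_error)   # ~0.159, the Bayes error for these two Gaussians
```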

Bayesian Decision Theory

c states of nature: \{\omega_1,\omega_2,\cdots,\omega_c\}
a possible actions: \{\alpha_1,\alpha_2,\cdots,\alpha_a\}
the loss for taking action \alpha_i when the true state of nature is \omega_j: \lambda(\alpha_i|\omega_j)
R(\alpha_i|x) = \sum_{j=1}^{c}\lambda(\alpha_i|\omega_j)P(\omega_j|x)
Select the action for which the conditional risk R(\alpha_i|x) is minimum.
Overall risk: R = \int R(\alpha(x)|x)p(x)dx, where \alpha(x) is the decision rule. Choosing, at every x, the action that minimizes R(\alpha_i|x) minimizes the overall risk; the resulting minimum R^* is the Bayes risk.
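
A minimal sketch of picking the minimum-conditional-risk action; the loss matrix and posterior vector are hypothetical:

```python
import numpy as np

# Hypothetical loss matrix: loss[i][j] = lambda(alpha_i | w_j)
loss = np.array([[0.0, 2.0],
                 [1.0, 0.0]])
posterior = np.array([0.4, 0.6])   # P(w_1|x), P(w_2|x) for some x

cond_risk = loss @ posterior       # R(alpha_i|x) = sum_j lambda(alpha_i|w_j) P(w_j|x)
best_action = np.argmin(cond_risk)
print(cond_risk, best_action)      # [1.2 0.4] -> take action alpha_2 (index 1)
```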

  • Example 1:
    action \alpha_1: deciding \omega_1
    action \alpha_2: deciding \omega_2
    \lambda_{ij} = \lambda(\alpha_i|\omega_j)
    R(\alpha_1|x) = \lambda_{11}P(\omega_1|x) + \lambda_{12}P(\omega_2|x)
    R(\alpha_2|x) = \lambda_{21}P(\omega_1|x) + \lambda_{22}P(\omega_2|x)
    if R(\alpha_1|x) < R(\alpha_2|x), i.e. (\lambda_{21}-\lambda_{11})P(\omega_1|x) > (\lambda_{12}-\lambda_{22})P(\omega_2|x), action \alpha_1 is taken: decide \omega_1.
  • Example 2:
    Suppose \lambda\left(\alpha_{i} | \omega_{j}\right)=\left\{\begin{array}{ll}{0} & {i=j} \\ {1} & {i \neq j}\end{array}\right.
    Conditional risk
    R\left(\alpha_{i} | x\right)=\sum_{j=1}^{c} \lambda\left(\alpha_{i} | \omega_{j}\right) P\left(\omega_{j} | x\right) =\sum_{j \neq i} P\left(\omega_{j} | x\right)=1-P\left(\omega_{i} | x\right)
    Minimizing the risk \longrightarrow Maximizing the posterior P(\omega_i|x).
    So we have the discriminant functions (a maximal discriminant corresponds to minimal risk):
    g_{i}(x)=-R\left(\alpha_{i} | x\right)
    \Longleftrightarrow
    g_{i}(x)=P\left(\omega_{i} | x\right)
    g_{i}(x)=P(x | \omega_{i}) P\left(\omega_{i}\right)
    g_{i}(x)=\ln P(x | \omega_{i})+\ln P\left(\omega_{i}\right)
    Set of discriminant functions: g_{i}(x), i=1, \cdots, c
    Classifier assigns a feature vector x to class \omega_i if: g_{i}(x)>g_{j}(x), \quad \forall j \neq i
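
Under zero-one loss the minimum-risk rule is just maximum-posterior classification; a minimal sketch using the log form g_i(x) = \ln P(x|\omega_i) + \ln P(\omega_i) with 1-D Gaussian class-conditionals (all parameters are illustrative):

```python
import numpy as np

# Three classes, 1-D Gaussian class-conditionals (illustrative parameters)
priors = np.array([0.5, 0.3, 0.2])
means = np.array([0.0, 2.0, 4.0])
sigma = 1.0

def classify(x):
    # g_i(x) = ln p(x|w_i) + ln P(w_i); assign x to the largest g_i
    log_likelihood = -0.5 * ((x - means) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))
    g = log_likelihood + np.log(priors)
    return int(np.argmax(g))

print([classify(x) for x in (-1.0, 1.5, 5.0)])   # [0, 1, 2]
```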

Binary classification \longrightarrow Multi-class classification

  • One vs. One
    For N classes, train \frac{N(N-1)}{2} pairwise classifiers; the final class is decided by voting over their predictions.
  • One vs. Rest
    Train N classifiers, each separating one class from all the rest; predict the class whose classifier outputs positive (if several do, take the most confident one).
  • ECOC (Error‐Correcting Output Codes)
    Each class is assigned a codeword (a row of a coding matrix), and one binary classifier is trained per column. At prediction time, the code formed by the classifiers' outputs is compared with each row, and the class whose codeword is at the smallest distance (e.g. Hamming distance) is the result, as in the matrix below (a decoding sketch follows it).
     f1  f2  f3
c1   -1   1  -1
c2    1  -1  -1
c3   -1   1   1
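
A minimal decoding sketch for the matrix above; the classifier outputs are a hypothetical example:

```python
import numpy as np

# Coding matrix: rows = classes c1..c3, columns = classifiers f1..f3
codes = np.array([[-1,  1, -1],
                  [ 1, -1, -1],
                  [-1,  1,  1]])

outputs = np.array([-1, 1, 1])             # hypothetical predictions of f1, f2, f3
hamming = (codes != outputs).sum(axis=1)   # distance to each class's codeword
print(hamming, hamming.argmin())           # [1 3 0] -> class c3 (index 2)
```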
