Data Mining: Naive Bayes Classifier


Author: Cache_wood | Published 2022-04-10 16:28


    A probabilistic framework for solving classification problems

    Conditional Probability:
    P(C|A) = \frac{P(A,C)}{P(A)}
    P(A|C) = \frac{P(A,C)}{P(C)}
    Bayes theorem:
    P(C|A) = \frac{P(A|C)P(C)}{P(A)}
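    For example (with made-up numbers): if P(C) = 0.1, P(A|C) = 0.8 and P(A) = 0.4, then
    P(C|A) = \frac{0.8\times 0.1}{0.4} = 0.2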
    Consider each attribute and class label as random variables

    Given a record with attributes (A_1,A_2,…,A_n)

    • Goal is to predict class C
    • Specifically, we want to find the value of C that maximizes P(C|A_1,A_2,…,A_n)

    Approach:

    • compute the posterior probability P(C|A_1,A_2,…,A_n) for all values of C using the Bayes theorem
      P(C|A_1A_2…A_n) = \frac{P(A_1A_2…A_n|C)P(C)}{P(A_1A_2…A_n)}

    • Choose value of C that maximizes
      P(C|A_1,A_2,…,A_n)

    • Equivalent to choosing value of C that maximizes
      P(A_1,A_2,…,A_n|C)P(C)
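    Since the denominator P(A_1,A_2,…,A_n) is the same for every class, it can be dropped when comparing classes, giving the decision rule
    C^* = \arg\max_C P(A_1,A_2,…,A_n|C)P(C)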

    Naive Bayes Classifier

    Assume independence among attributes A_i when class is given:

    • P(A_1,A_2,…,A_n|C_j) = P(A_1|C_j)P(A_2|C_j)…P(A_n|C_j)
    • Can estimate P(A_i|C_j) for all A_i and C_j.
    • New point is classified to C_j if P(C_j)\Pi P(A_i|C_j) is maximal.
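    A minimal sketch of this decision rule for categorical attributes, using a tiny made-up training table (the attribute values, labels, and record shapes below are purely illustrative):

```python
from collections import Counter, defaultdict

# Toy training data: each record is (attribute tuple, class label).
# Attributes and labels here are invented purely for illustration.
train = [
    (("sunny", "hot"),  "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
    (("sunny", "cool"), "yes"),
]

# Estimate P(C_j) and P(A_i | C_j) from frequency counts.
class_counts = Counter(label for _, label in train)
cond_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
for attrs, label in train:
    for i, value in enumerate(attrs):
        cond_counts[(i, label)][value] += 1

def classify(attrs):
    best_class, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / len(train)                          # P(C_j)
        for i, value in enumerate(attrs):
            score *= cond_counts[(i, c)][value] / n_c     # P(A_i | C_j)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(classify(("rainy", "mild")))   # -> "yes" for this toy data
```

    Note that an unseen attribute value makes one of the conditional counts zero, which zeroes out the whole product; this is exactly the problem the smoothed estimates below address.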

    How to Estimate Probabilities from Data

    For continuous attributes:

    • Discretize the range into bins
      • one ordinal attribute per bin
      • violates independence assumption
    • Two-way split: (A<v) or (A>v)
      choose only one of the two splits as new attribute
    • Probability density estimation
      • Assume attribute follows a normal distribution
      • Use data to estimate the parameters of the distribution (e.g., mean and standard deviation)
      • Once probability distribution is known, can use it to estimate the conditional probability P(A_i|c)

    Normal distribution: P(A_i|c_j) = \frac{1}{\sqrt{2\pi\sigma_{ij}^2}}e^{-\frac{(A_i-\mu_{ij})^2}{2\sigma_{ij}^2}}

    One distribution is estimated for each (A_i,c_j) pair.
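    A minimal sketch of the Gaussian estimate for one (A_i,c_j) pair, assuming the sample values below (which are made up):

```python
import math

def gaussian_prob(x, values):
    """Estimate P(A_i = x | c_j) from the attribute values observed in class c_j,
    assuming the attribute follows a normal distribution."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / (len(values) - 1)  # sample variance
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# E.g., a continuous attribute observed for one class (made-up numbers):
values_for_class = [125.0, 100.0, 70.0, 120.0]
print(gaussian_prob(110.0, values_for_class))
```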

    If one of the conditional probabilities is zero, then the entire expression becomes zero.

    Probability estimation:

    c: number of classes, p: prior probability, m: parameter
    Original: P(A_i|C) = \frac{N_{ic}}{N_c}
    Laplace: P(A_i|C) = \frac{N_{ic}+1}{N_c+c}
    m-estimate: P(A_i|C) = \frac{N_{ic}+mp}{N_c+m}
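    A minimal sketch of the three estimators side by side, using illustrative counts:

```python
def original(n_ic, n_c):
    return n_ic / n_c

def laplace(n_ic, n_c, c):
    # c: number of classes (as defined above)
    return (n_ic + 1) / (n_c + c)

def m_estimate(n_ic, n_c, m, p):
    # p: prior probability, m: parameter weighting that prior
    return (n_ic + m * p) / (n_c + m)

# With N_ic = 0 the original estimate is 0 and would wipe out the whole product;
# the smoothed versions stay positive (illustrative numbers):
print(original(0, 3), laplace(0, 3, 3), m_estimate(0, 3, 3, 1 / 3))
```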

    Naive Bayes (Summary)

    Robust to isolated noise points.

    Handles missing values by ignoring the instance during probability estimate calculations.

    Robust to irrelevant attributes

    Independence assumption may not hold for some attributes

    • Use other techniques such as Bayesian Belief Networks (BBN)
