ML - hw1

Author: 谢小帅 | Published 2018-12-06 23:51 | Read 7 times

    1. Machine Learning Problems

    (a) 1. BF,2. C,3. AD,4. G,5. AE,6. A,7. BF,8. AE,9. BG

    (b) False. A large amount of data can train a model that performs very well on the data source itself, i.e. the training data, but what matters more is how the model performs on test data, that is, its ability to generalize.

    • Maximizing performance on the whole dataset may result in severe overfitting.
    • Using all the data also costs more computation and time.
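
    The gap between training performance and generalization can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not part of the homework: the toy parity data, the `train_test_split` and `accuracy` helpers, and the lookup-table `memorizer` model.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, examples):
    """Fraction of (x, y) pairs for which model(x) == y."""
    return sum(model(x) == y for x, y in examples) / len(examples)

# Hypothetical toy data: the label is the parity of x, flipped for multiples of 5.
data = [(x, (x % 2) ^ int(x % 5 == 0)) for x in range(100)]
train, test = train_test_split(data)

# A model that memorizes the training set and guesses 0 for anything unseen.
memory = dict(train)

def memorizer(x):
    return memory.get(x, 0)

train_acc = accuracy(memorizer, train)  # perfect by construction
test_acc = accuracy(memorizer, test)    # only as good as the blind guess
print(train_acc, test_acc)
```

    The memorizer scores perfectly on the data it was fitted to, yet that score says nothing about unseen points, which is why the held-out accuracy is the number to report.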

    2. Bayes Decision Rule

    [Figures: maximum likelihood decision rule; optimal Bayes decision rule]
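
    As a minimal sketch of how the two rules differ: the maximum likelihood rule compares only P(x | class), while the optimal Bayes rule also weights each likelihood by the class prior. The class names `w1`/`w2` and all probabilities below are hypothetical values chosen for illustration, not the homework's data.

```python
def ml_decision(likelihoods):
    """Maximum likelihood rule: pick the class with the largest P(x | class)."""
    return max(likelihoods, key=likelihoods.get)

def bayes_decision(likelihoods, priors):
    """Bayes rule: pick the class with the largest P(x | class) * P(class)."""
    return max(likelihoods, key=lambda c: likelihoods[c] * priors[c])

# Hypothetical likelihoods of one observation x under two classes.
likelihoods = {"w1": 0.6, "w2": 0.3}
# Hypothetical priors: class w2 is four times as common as w1.
priors = {"w1": 0.2, "w2": 0.8}

ml_choice = ml_decision(likelihoods)             # w1, since 0.6 > 0.3
bayes_choice = bayes_decision(likelihoods, priors)  # w2, since 0.12 < 0.24
print(ml_choice, bayes_choice)
```

    The two rules agree whenever the priors are equal; they diverge only when one class is common enough to outweigh a smaller likelihood.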

    3. Gaussian Discriminant Analysis and MLE

    c. [Figure: Gaussian Boundaries]

    4. Text Classification with Naive Bayes

    a. 10 words

    ooking      9453
    voip        9494
    computron   13613
    nbsp        30033
    meds        37568
    pills       38176
    cialis      45153
    sex         56930
    php         65398
    viagra      75526
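
    The post does not say how these ten words were chosen. One common way to find the words most indicative of spam is to rank them by the smoothed ratio P(word | spam) / P(word | ham). The sketch below assumes that approach; the function name `top_spam_words`, the add-one smoothing, and the tiny corpus are all hypothetical.

```python
from collections import Counter

def top_spam_words(spam_docs, ham_docs, k=10):
    """Rank words by the add-one smoothed ratio P(word | spam) / P(word | ham)."""
    spam_counts = Counter(w for doc in spam_docs for w in doc)
    ham_counts = Counter(w for doc in ham_docs for w in doc)
    vocab = set(spam_counts) | set(ham_counts)
    # Add-one (Laplace) smoothing: every vocabulary word gets one extra count.
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)

    def ratio(word):
        p_spam = (spam_counts[word] + 1) / spam_total
        p_ham = (ham_counts[word] + 1) / ham_total
        return p_spam / p_ham

    return sorted(vocab, key=ratio, reverse=True)[:k]

# Hypothetical tokenized emails.
spam_docs = [["cheap", "pills", "pills"], ["pills", "offer"]]
ham_docs = [["meeting", "offer"], ["meeting", "notes"]]

top = top_spam_words(spam_docs, ham_docs, k=2)
print(top)  # ['pills', 'cheap']
```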
    

    b. accuracy = 0.9857315598548972

    c. False. When the ratio of spam to ham is 1:99, a spam filter can easily identify ham emails but may also classify spam emails as ham. A filter that labels every email as ham would already reach 99% accuracy while catching no spam at all.
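
    The effect of the 1:99 class imbalance can be checked with a toy example. The inbox and the "always ham" filter below are made up for illustration.

```python
# Hypothetical inbox with a 1:99 spam-to-ham ratio (1 = spam, 0 = ham).
labels = [1] * 1 + [0] * 99

# A degenerate "filter" that labels every email as ham.
predictions = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
spam_caught = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)

print(accuracy)     # 0.99
print(spam_caught)  # 0
```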

    d.
    precision = 0.9750223015165032
    recall = 0.9724199288256228
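
    For reference, precision and recall can be computed from the counts of true positives, false positives, and false negatives. This is a minimal sketch with made-up labels; the helper `precision_recall` is hypothetical and is not the code that produced the values above.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels: 1 = spam, 0 = ham.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# Here tp = 2, fp = 1, fn = 2, so precision = 2/3 and recall = 1/2.
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```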

    e. For a spam filter, precision matters more. A false positive sends a legitimate email to the spam folder, which costs more than letting an occasional spam through.

    To identify drugs and bombs at an airport, I think recall is more important, because every bomb must be found to ensure safety.

