data analysis

Author: 光_武 | Published: 2018-02-09 00:03

    missing values

    1. visualizing missing values.


    2. selecting the imputed data set (the picture is very important).


    3. filling just a few missing values with a particular model



    4. predicting values based on other variables
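
The missing-value steps above can be sketched in a few lines. This is a minimal illustration only, not the author's workflow: the column names `fare` and `age`, the toy rows, and the helper names are all hypothetical, the text bars stand in for a missingness plot, and a one-variable least-squares line stands in for whatever model is used to predict values from other variables.

```python
def missing_counts(rows, columns):
    """Count missing (None) entries per column."""
    return {c: sum(1 for r in rows if r.get(c) is None) for c in columns}


def print_missing_bars(rows, columns):
    """Text stand-in for a missingness plot: one bar per column."""
    counts = missing_counts(rows, columns)
    for c in columns:
        print(f"{c:<6} {'#' * counts[c]} ({counts[c]}/{len(rows)})")


def impute_by_regression(rows, target, predictor):
    """Fill missing `target` values from `predictor`.

    Fits a least-squares line on the rows where both values are present,
    then predicts `target` wherever it is missing. Returns new rows and
    leaves the input untouched.
    """
    complete = [(r[predictor], r[target]) for r in rows
                if r.get(predictor) is not None and r.get(target) is not None]
    n = len(complete)
    mean_x = sum(x for x, _ in complete) / n
    mean_y = sum(y for _, y in complete) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in complete)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in complete)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    filled = []
    for r in rows:
        new_row = dict(r)
        if new_row.get(target) is None and new_row.get(predictor) is not None:
            new_row[target] = intercept + slope * new_row[predictor]
        filled.append(new_row)
    return filled


# Hypothetical toy data: "fare" is complete, "age" has two gaps.
rows = [
    {"fare": 10.0, "age": 20.0},
    {"fare": 20.0, "age": 30.0},
    {"fare": 30.0, "age": None},
    {"fare": 40.0, "age": 50.0},
    {"fare": 50.0, "age": None},
]

print_missing_bars(rows, ["fare", "age"])
filled = impute_by_regression(rows, "age", "fare")
print_missing_bars(filled, ["fare", "age"])
```

After imputing, it is worth comparing the distribution of the filled column with the original one before accepting the result.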


    model testing

    1. OOB error

    The black line shows the overall error rate, which falls below 20%. The red and green lines show the error rates for ‘died’ and ‘survived’ respectively.
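
To make the out-of-bag (OOB) idea concrete, here is a rough sketch that assumes the model is a bagged ensemble (the note does not name it). Each learner is trained on a bootstrap sample, and every observation is scored only by the learners that did not see it. Decision stumps stand in for full trees, and the data, `fit_stump`, and `oob_error` are hypothetical. The function returns the overall error and one error rate per class, which correspond to the three lines described above.

```python
import random


def fit_stump(X, y):
    """Pick the one-feature threshold split with the fewest training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(1 for row, label in zip(X, y)
                             if (left if row[f] <= t else right) != label)
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right


def oob_error(X, y, n_trees=50, seed=0):
    """Return (overall OOB error, {class: OOB error}) for bagged stumps."""
    rng = random.Random(seed)
    n = len(X)
    votes = [[0, 0] for _ in range(n)]  # per-sample votes for class 0 / class 1
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # only out-of-bag samples get a vote
                votes[i][stump(X[i])] += 1

    # Majority vote per sample, skipping samples that were never out of bag.
    scored = [(i, 0 if v[0] >= v[1] else 1)
              for i, v in enumerate(votes) if sum(v) > 0]
    overall = sum(1 for i, pred in scored if pred != y[i]) / len(scored)
    per_class = {}
    for c in (0, 1):
        members = [(i, pred) for i, pred in scored if y[i] == c]
        per_class[c] = (sum(1 for _, pred in members if pred != c) / len(members)
                        if members else None)
    return overall, per_class


# Hypothetical one-feature data: class 0 for small values, class 1 for large.
X = [[float(v)] for v in range(20)]
y = [0] * 10 + [1] * 10

overall, per_class = oob_error(X, y)
print("overall OOB error:", overall)
print("per-class OOB error:", per_class)
```

Because every prediction comes from learners that never saw the observation, the OOB error works as a built-in validation estimate without a separate hold-out set.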


    2. RMSE



      Lasso model
      The most popular algorithm

      Then we can compare them by RMSE as follows
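
RMSE is the square root of the mean squared difference between observed and predicted values, so a lower value means a better fit. A minimal sketch of the comparison, using invented numbers and the placeholder names `model_a` and `model_b` rather than the fitted models from this note:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical hold-out targets and predictions from two models.
actual = [3.0, 5.0, 7.0, 9.0]
predictions = {
    "model_a": [2.0, 5.0, 8.0, 9.0],
    "model_b": [5.0, 3.0, 9.0, 7.0],
}

scores = {name: rmse(actual, preds) for name, preds in predictions.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.4f}")
print("lowest RMSE:", best)
```

The comparison is only fair when every model is scored on the same held-out observations.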


    variable importance

    variable selection

    1. Boruta Feature Importance Analysis
    2. Plotting all data
    3. Explore the correlation
    4. Scatter plots for variables that are highly correlated (same link as item 3).



    5. Some useful plots
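
Steps 3 and 4 can be sketched without any plotting library: compute the Pearson correlation for every pair of variables and keep the pairs above a cutoff as candidates for scatter plots. The column names, the data, and the 0.8 threshold below are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def high_correlation_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold, strongest first."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return sorted(pairs, key=lambda p: -abs(p[2]))


# Hypothetical columns: "y" tracks "x" closely, "z" does not.
columns = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.1, 3.9, 6.2, 8.0, 9.8],
    "z": [5.0, 1.0, 4.0, 2.0, 3.0],
}

pairs = high_correlation_pairs(columns)
for a, b, r in pairs:
    print(f"{a} vs {b}: r = {r:.3f}  (worth a scatter plot)")
```

Highly correlated predictors carry overlapping information, so a scatter plot of each flagged pair helps decide whether to keep both.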

