data analysis

Author: 光_武 | Published: 2018-02-09 00:03 | Views: 12

missing values

  1. Visualize the missing values.


  2. Select which imputed data set to use (the picture is very important here).


  3. When only a few values are missing, fill them in with a particular model.



  4. Predict the missing values from the other variables.

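Steps 1 and 4 above can be sketched in a few lines. This is only a minimal illustration using the Python standard library, not the workflow from the original pictures: the column names `age` and `fare`, the toy rows, and the helper functions are all hypothetical, and a single-predictor least-squares line stands in for whatever model is actually used.

```python
from statistics import fmean

# Hypothetical toy data: None marks a missing value.
rows = [
    {"age": 22.0, "fare": 7.0},
    {"age": 38.0, "fare": 15.0},
    {"age": None, "fare": 11.0},
    {"age": 30.0, "fare": 11.0},
    {"age": 46.0, "fare": 19.0},
]

def missing_counts(rows):
    """Count missing (None) entries per column."""
    counts = {}
    for row in rows:
        for col, value in row.items():
            counts[col] = counts.get(col, 0) + (value is None)
    return counts

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def impute_from(rows, target, predictor):
    """Fill missing `target` values with a line fitted on the complete rows."""
    complete = [r for r in rows
                if r[target] is not None and r[predictor] is not None]
    slope, intercept = fit_line([r[predictor] for r in complete],
                                [r[target] for r in complete])
    filled = []
    for r in rows:
        r = dict(r)  # copy, so the original rows stay untouched
        if r[target] is None and r[predictor] is not None:
            r[target] = slope * r[predictor] + intercept
        filled.append(r)
    return filled

counts = missing_counts(rows)
imputed = impute_from(rows, target="age", predictor="fare")
print(counts)
print(imputed[2])
```

The per-column counts are the numbers a missing-value plot would display; the imputation step fits only on complete rows so that the missing entries never influence their own prediction.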

model testing

  1. OOB error

The black line shows the overall out-of-bag (OOB) error rate, which falls below 20%. The red and green lines show the error rates for the ‘died’ and ‘survived’ classes respectively.
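To show where an overall error and two per-class errors come from, here is a rough sketch of the out-of-bag idea. It is an assumption-laden toy, not the model behind the plot: it bags one-feature decision stumps rather than growing a real random forest, the data are synthetic, and the labels 0 and 1 merely stand in for ‘died’ and ‘survived’.

```python
import random

def fit_stump(sample):
    """Choose the threshold with the fewest training errors; predict 1 when x > threshold."""
    best = None
    for threshold in sorted({x for x, _ in sample}):
        errors = sum((x > threshold) != bool(y) for x, y in sample)
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return best[1]

def oob_error(data, n_trees=50, seed=0):
    """Bag stumps and score each point only with stumps that never saw it."""
    rng = random.Random(seed)
    n = len(data)
    votes = [[0, 0] for _ in range(n)]  # votes[i][c]: OOB votes for class c
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        threshold = fit_stump([data[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # point i is out of bag for this stump
                x, _ = data[i]
                votes[i][int(x > threshold)] += 1
    wrong = {"overall": 0, 0: 0, 1: 0}
    total = {"overall": 0, 0: 0, 1: 0}
    for (x, y), (v0, v1) in zip(data, votes):
        if v0 + v1 == 0:
            continue  # never out of bag, so no OOB prediction exists
        pred = int(v1 > v0)
        for key in ("overall", y):
            total[key] += 1
            wrong[key] += pred != y
    return {k: wrong[k] / total[k] for k in total if total[k]}

# Synthetic one-feature data: class 0 centred at 0, class 1 centred at 3.
gen = random.Random(1)
data = ([(gen.gauss(0.0, 1.0), 0) for _ in range(30)]
        + [(gen.gauss(3.0, 1.0), 1) for _ in range(30)])
rates = oob_error(data)
print(rates)
```

The `"overall"` entry plays the role of the black line, and the entries keyed `0` and `1` play the roles of the two class-specific lines.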


  2. RMSE



    Lasso model
    The most popular algorithm

    Then we can compare them by RMSE as follows:
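A minimal sketch of such a comparison: RMSE is the square root of the mean squared difference between the held-out targets and the predictions, and the model with the lower value fits better. The target values, the predictions, and the name `other_model` are hypothetical placeholders, not results from the original post.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("length mismatch")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# Hypothetical held-out targets and predictions from two fitted models.
y_true = [3.0, 5.0, 7.0, 9.0]
predictions = {
    "lasso": [2.0, 5.0, 8.0, 9.0],
    "other_model": [3.0, 4.0, 7.0, 11.0],
}

scores = {name: rmse(y_true, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)  # lower RMSE is better
print(scores, best)
```

Because squaring penalises large misses more heavily, `other_model` scores worse here even though it is exactly right on two of the four points.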


variable importance

variable selection

  1. Boruta Feature Importance Analysis
  2. Plotting all data
  3. Explore the correlation
  4. Draw scatter plots for the variables that are highly correlated (same link as item 3).



  5. Some useful plots
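Steps 3 and 4 of the list above amount to computing pairwise correlations and keeping the pairs worth plotting. The sketch below assumes Pearson correlation and a cut-off of 0.8; both are illustrative choices, and the column names and values are made up.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def high_correlation_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold, strongest first."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return sorted(pairs, key=lambda p: -abs(p[2]))

# Hypothetical columns: y tracks x closely, z does not.
columns = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.1, 5.9, 8.2, 9.8],
    "z": [5.0, 1.0, 4.0, 2.0, 3.0],
}

pairs = high_correlation_pairs(columns)
print(pairs)
```

Each returned pair is a candidate for a scatter plot; using the absolute value keeps strong negative relationships as well as positive ones.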



Article link: https://www.haomeiwen.com/subject/keixtftx.html