DS Interview Question--Missing V

Author: Vivian有好多美好的故事 | Published: 2017-06-28 13:03

Q: During analysis, how do you treat missing values?

A: 

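First, we need to know the pattern of the missing data:

1. Missing completely at random (MCAR): the probability that a value is missing is unrelated to any variable, observed or unobserved. This is the easiest case to handle.
2. Missing at random (MAR): the missingness depends only on observed variables, not on the missing values themselves.
3. Missing not at random (MNAR): the missingness depends on the unobserved values themselves, even after accounting for the observed variables.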
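To make the three patterns concrete, here is a small simulation sketch. The variables (`age`, `income`), the missingness probabilities, and the function name `simulate` are all made up for illustration. It compares the true mean of a variable with the mean of the values that survive each missingness mechanism.

```python
import random
import statistics

def simulate(mechanism, n=10_000, seed=0):
    """Return (true_mean, observed_mean) of income under one missingness mechanism."""
    rng = random.Random(seed)
    ages = [rng.uniform(20, 60) for _ in range(n)]
    # Hypothetical relationship: income rises with age, plus noise.
    incomes = [1000 * a + rng.gauss(0, 5000) for a in ages]
    observed = []
    for age, income in zip(ages, incomes):
        if mechanism == "MCAR":
            p_missing = 0.3                              # same for every record
        elif mechanism == "MAR":
            p_missing = 0.6 if age > 40 else 0.1         # depends on observed age only
        elif mechanism == "MNAR":
            p_missing = 0.6 if income > 40_000 else 0.1  # depends on the hidden value
        else:
            raise ValueError(mechanism)
        if rng.random() >= p_missing:
            observed.append(income)
    return statistics.mean(incomes), statistics.mean(observed)

for mechanism in ("MCAR", "MAR", "MNAR"):
    true_mean, observed_mean = simulate(mechanism)
    print(mechanism, round(true_mean), round(observed_mean))
```

In this toy setup the observed mean stays close to the true mean only under MCAR. Under MAR the naive mean is also off, but the bias can in principle be corrected because the variable driving the missingness (age) is observed. Under MNAR it cannot be corrected from the observed data alone.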

And then we can choose different methods to deal with missing values:

Deletion: If we have enough observations and the data are missing completely at random, we can delete the observations with missing values without introducing bias, at the cost of a smaller sample.
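A minimal sketch of deletion (often called listwise deletion), assuming records are stored as dictionaries and missing values are represented by `None`. The helper name `drop_incomplete` and the sample rows are hypothetical. With pandas, `DataFrame.dropna()` plays the same role.

```python
def drop_incomplete(rows):
    """Listwise deletion: keep only rows in which no field is None."""
    return [row for row in rows if all(v is not None for v in row.values())]

rows = [
    {"age": 25, "income": 30_000},
    {"age": None, "income": 42_000},
    {"age": 41, "income": None},
    {"age": 37, "income": 55_000},
]

complete = drop_incomplete(rows)
# Check how much data the deletion costs before committing to it.
dropped_fraction = 1 - len(complete) / len(rows)
print(len(complete), dropped_fraction)
```

Checking the dropped fraction first is a cheap way to see whether "enough observations" really remain.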

Imputation: 1. Replace missing values with the mean, median, or mode, or with a default value; 2. Predict the missing values with a model fitted on the observed data (e.g. regression or KNN).
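Both kinds of imputation can be sketched in a few lines of plain Python. The function names `impute_simple` and `impute_knn` and the sample data are hypothetical, and the KNN version assumes the feature columns used for the distance are fully observed. Libraries such as scikit-learn offer more complete imputers.

```python
import statistics

def impute_simple(values, strategy="mean"):
    """Fill None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }[strategy](observed)
    return [fill if v is None else v for v in values]

def impute_knn(rows, target, features, k=2):
    """Fill a missing `target` with the mean target of the k nearest complete rows.

    Distance is squared Euclidean over `features`, assumed fully observed.
    """
    donors = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        if row[target] is None:
            nearest = sorted(
                donors,
                key=lambda d: sum((d[f] - row[f]) ** 2 for f in features),
            )[:k]
            row = {**row, target: statistics.mean(d[target] for d in nearest)}
        filled.append(row)
    return filled

ages = [20, None, 30, 40]
rows = [
    {"age": 25, "income": 30_000},
    {"age": 27, "income": 34_000},
    {"age": 50, "income": 80_000},
    {"age": 26, "income": None},
]

print(impute_simple(ages, "mean"))
print(impute_knn(rows, target="income", features=["age"], k=2)[-1])
```

Mean imputation ignores the other columns, while the KNN version borrows from records that look similar, which is why the 26-year-old is filled from the two youngest donors rather than from the overall average.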

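Others: More complex methods such as Multiple Imputation (MI), which creates several imputed datasets and pools the results, or Hot Deck imputation, which fills a missing value with an observed value from a similar record.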
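The sketch below combines the two ideas in a deliberately simplified way: a random hot deck draws each missing value from the observed values, and the imputation is repeated `m` times so that the estimates can be pooled with Rubin's rules. This is only an illustration under strong assumptions (MCAR data, a single variable, donors drawn at random); a proper multiple imputation would draw from a predictive model that also reflects parameter uncertainty. The function names and data are hypothetical.

```python
import random
import statistics

def hot_deck(values, rng):
    """Random hot deck: replace each None with a randomly chosen observed value."""
    donors = [v for v in values if v is not None]
    return [rng.choice(donors) if v is None else v for v in values]

def pooled_mean(values, m=20, seed=0):
    """Impute m times, estimate the mean each time, and pool with Rubin's rules."""
    rng = random.Random(seed)
    n = len(values)
    estimates, within = [], []
    for _ in range(m):
        completed = hot_deck(values, rng)
        estimates.append(statistics.mean(completed))
        within.append(statistics.variance(completed) / n)  # squared standard error
    q_bar = statistics.mean(estimates)   # pooled point estimate
    w = statistics.mean(within)          # average within-imputation variance
    b = statistics.variance(estimates)   # between-imputation variance
    total_variance = w + (1 + 1 / m) * b
    return q_bar, total_variance

values = [10, 12, None, 11, 13, None, 12, 14]
estimate, variance = pooled_mean(values)
print(estimate, variance)
```

The between-imputation term is what a single imputation leaves out: it carries the extra uncertainty that comes from not knowing the missing values.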

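Ignoring: Some models can handle missing values by themselves, for example certain implementations of tree-based methods such as random forests, so no separate treatment is needed. Whether this works depends on the library being used.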

Interview questions are from DataAppLab (WeChat: Datalaus)

Jun.27th, 2017  Seattle
