House Price Regression Process:S


Author: DT数据说 | Published: 2018-09-04 09:30

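    Many people have published Kernels for the house sale price regression project. Among them, Serigne's "Stacked Regressions to predict House Prices" is the most widely read, and readers can browse it directly on the Kaggle website. This article summarizes it by listing the main steps of the process below, so that readers can follow the overall workflow.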

    Stacked Regressions to predict House Prices

    Data Processing

    Outliers

    Note

    Target Variable

    Log-transformation of the target variable

    Features engineering

    Missing Data

    Data Correlation

    Imputing missing values

    More features engineering

    Transforming some numerical variables that are really categorical

    Label Encoding some categorical variables that may contain information in their ordering set

    Adding one more important feature

    Skewed features

    Getting dummy categorical features

    Getting the new train and test sets

    Modelling

    Import libraries

    Define a cross validation strategy

    Base models

    LASSO Regression

    Elastic Net Regression

    Kernel Ridge Regression

    Gradient Boosting Regression

    XGBoost

    LightGBM

    Base models scores

    Stacking models

    Simplest Stacking approach: Averaging base models

    Averaged base models class

    Averaged base models score

    Less simple Stacking: Adding a Meta-model

    Stacking averaged Models Class

    Stacking Averaged models Score

    Ensembling StackedRegressor, XGBoost and LightGBM

    Final Training and Prediction

    Stacked Regressor

    XGBoost

    Ensemble prediction

    Submission

    Comments

    Leader Board Ranking

    RMSLE score on train data: 0.07658856703780222
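
    Two steps named in the outline, scoring by RMSLE and blending several models into an ensemble prediction, can be illustrated with a minimal standard-library sketch. It assumes RMSLE is the root mean squared error between log1p-transformed prices and that the ensemble is a weighted average of per-model predictions. The function names, the weights, and all prices are hypothetical and are not taken from the Kernel.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error of two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Compare prices on the log1p scale, as in a log-transformed target.
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


def weighted_ensemble(predictions, weights):
    """Blend per-model prediction lists row by row with the given weights."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all models must predict the same number of rows")
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions))
        for i in range(n_rows)
    ]


# Made-up sale prices and predictions from three hypothetical models.
actual = [200000.0, 150000.0, 320000.0]
stacked_pred = [205000.0, 148000.0, 310000.0]
xgb_pred = [195000.0, 155000.0, 330000.0]
lgb_pred = [210000.0, 149000.0, 315000.0]

# Illustrative weights only; any set that sums to 1 works here.
blend = weighted_ensemble([stacked_pred, xgb_pred, lgb_pred], [0.6, 0.2, 0.2])
print("RMSLE of the blend:", rmsle(actual, blend))
```

    Scoring on the log scale penalizes relative rather than absolute errors, so a miss of a given size counts more on a cheap house than on an expensive one.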



