Quicken Loans has two systems:
LOLA: the system that records everything the lender does before underwriting.
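LOLA is very powerful: everyone who has applied for a home loan is on record. It tracks, for example, how many times the banker called, how many times the customer responded and how often, how many times the customer looked up the preapproval document, whether any complaint was filed along the way, the amount and product of the customer's initial loan application, and the status of the leads. It also shows how many customers dropped out before underwriting, and with what probability.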
Once a loan reaches underwriting status (23), it moves into the AMP system, which mainly records underwriting and title source (check) through to status clear (closing).
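After mapping out the business process, our work was essentially feature selection. We did not use a single model for prediction. Instead, we picked out the most critical stages of the loan application process (for example, mortgage banker to preapproval letter, preapproval letter to criteria checking, underwriting to title source, and title source to closing), did variable selection for each of them, and then built a separate decision tree model for each stage.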
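A minimal sketch of how the per-stage training sets could be laid out, assuming each loan record carries the furthest stage it reached. The stage identifiers, the `furthest_stage` field, and both function names are hypothetical and not taken from LOLA or AMP. For each transition, the rows are the loans that reached the starting stage and the label says whether the loan also reached the next one.

```python
# Hypothetical stage identifiers, ordered as the process is described above.
STAGE_ORDER = [
    "mortgage_banker",
    "preapproval_letter",
    "criteria_checking",
    "underwriting",
    "title_source",
    "closing",
]

# The four transitions named in the text.
TRANSITIONS = [
    ("mortgage_banker", "preapproval_letter"),
    ("preapproval_letter", "criteria_checking"),
    ("underwriting", "title_source"),
    ("title_source", "closing"),
]


def stage_datasets(loans):
    """Build one labelled dataset per transition.

    `loans` is a list of dicts with a "furthest_stage" key (assumed field)
    plus whatever feature columns are available.
    """
    rank = {name: i for i, name in enumerate(STAGE_ORDER)}
    datasets = {}
    for start, nxt in TRANSITIONS:
        # Only loans that reached the starting stage are at risk of dropping here.
        rows = [loan for loan in loans if rank[loan["furthest_stage"]] >= rank[start]]
        # Label 1 if the loan went on to reach the next stage, else 0.
        labels = [int(rank[loan["furthest_stage"]] >= rank[nxt]) for loan in rows]
        datasets[(start, nxt)] = (rows, labels)
    return datasets


def transition_rates(datasets):
    """Share of loans that advance at each transition (None if no loan got there)."""
    return {
        key: (sum(labels) / len(labels) if labels else None)
        for key, (rows, labels) in datasets.items()
    }
```

Each `(rows, labels)` pair could then be passed to a tree learner of choice, one model per transition; one minus the transition rate is the empirical drop probability at that stage.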
Conversion rate prediction:
Regression with time series errors.
Variables:
page views,
unique visitors,
ten-year Treasury yield
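For reference, a regression with AR(2) time series errors can be written as follows. Apart from Nt and et, which the modeling steps use, the symbols are my own notation: y_t is the conversion rate, x_{1,t} through x_{k,t} are the predictors, and the phi coefficients are the autoregressive parameters of the error series.

```latex
\begin{aligned}
y_t &= \beta_0 + \beta_1 x_{1,t} + \dots + \beta_k x_{k,t} + N_t, \\
N_t &= \phi_1 N_{t-1} + \phi_2 N_{t-2} + e_t, \qquad e_t \sim \text{white noise}.
\end{aligned}
```

The regression errors N_t are allowed to be autocorrelated; only the innovations e_t are expected to look like white noise.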
- Variable Selection: From the business side, the candidate predictors are the channel mix and the interest rate. The channel mix originally had 15 channels, including leadby, social media, etc. Data are missing for some channels, such as leadby, so I narrowed the predictors down to 7: 10-year Treasury yield, direct search, paid search, online ads, email, affiliate networks, and relationship marketing. This part uses multiple regression for the selection: pick the model with the lowest AIC that also passes the F-test and the t-tests for the independent variables.
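The AIC part of that selection can be sketched as below. This is an illustration under assumptions, not the procedure actually used: the notes do not say how the subsets were searched, so the sketch simply tries every subset (127 of them for 7 candidates), fits ordinary least squares with an intercept, and keeps the lowest Gaussian AIC. It uses only the standard library and assumes the predictors are not perfectly collinear and the fit is not exact. The F-test and t-tests are not covered. All function names are hypothetical.

```python
import math
from itertools import combinations


def ols_rss(X, y):
    """Residual sum of squares of an OLS fit of y on the rows of X, with intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Normal equations (A'A) b = A'y stored as an augmented k x (k+1) matrix.
    M = [
        [sum(a[i] * a[j] for a in rows) for j in range(k)]
        + [sum(a[i] * yi for a, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    return sum(
        (yi - sum(b * v for b, v in zip(beta, a))) ** 2 for a, yi in zip(rows, y)
    )


def aic(rss, n, n_params):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2 * n_params."""
    return n * math.log(rss / n) + 2 * n_params


def best_subset_by_aic(columns, y):
    """columns maps predictor name -> list of values. Returns (names, aic)."""
    n = len(y)
    names = list(columns)
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            X = [[columns[name][t] for name in subset] for t in range(n)]
            # Parameters counted: slopes, intercept, and error variance.
            score = aic(ols_rss(X, y), n, len(subset) + 2)
            if best is None or score < best[1]:
                best = (subset, score)
    return best
```

Calling `best_subset_by_aic` with a dict of the seven candidate series and the conversion rate series returns the winning subset and its AIC, which can then be checked with the F-test and t-tests.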
- Modeling Procedure:
• Check that the forecast variable and all predictors are stationary.
• Fit the regression model with AR(2) errors for non-seasonal data or ARIMA(2,0,0)(1,0,0)m errors for seasonal data. Since this data has no seasonality, I fit the model with AR(2) errors first.
• Calculate the ARIMA errors (Nt) from the fitted regression model and identify an appropriate ARMA model for them. Also check the AIC value.
• Re-fit the entire model using the new ARMA model for the errors.
• Check that the et series looks like white noise.
• Run the Ljung-Box test to confirm that the errors are uncorrelated.
• Evaluate the model.
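The white-noise check in the last steps can be sketched with the Ljung-Box statistic, Q = n(n+2) * sum over k of rho_k^2 / (n-k), computed on the residuals of the fitted model. This is a standard-library illustration with hypothetical function names, not the code used for the project. It takes the chi-square critical value as an argument rather than computing a p-value. For residuals from an ARMA(p, q) error model the usual reference distribution is chi-square with max_lag - p - q degrees of freedom.

```python
def autocorrelations(resid, max_lag):
    """Sample autocorrelations rho_1 .. rho_max_lag of a residual series."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(resid, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_k^2 / (n - k)."""
    n = len(resid)
    rho = autocorrelations(resid, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


def passes_ljung_box(resid, max_lag, critical_value):
    """True if Q is below the critical value, i.e. no evidence of autocorrelation.

    Example critical value: 15.507 is the 5% point of chi-square with 8 degrees
    of freedom (10 lags minus the 2 AR parameters of an AR(2) error model).
    """
    return ljung_box_q(resid, max_lag) < critical_value
```

If the check fails, the error model from the earlier step is revisited and the whole regression is re-fitted before moving on to evaluation.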