multiple_linear Regression with

Author: NextStepPeng | Published 2017-10-28 17:39

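Watch out for the Python version; I was still stuck on this when I first wrote this post.

Update, October 29: solved. After switching to Spyder 3.2.3 with Python 3.62, the problem went away.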

# -*- coding: utf-8 -*-

"""

Spyder Editor

This is a temporary script file.

"""

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('50_Startups.csv')

X = dataset.iloc[:, :-1].values

Y = dataset.iloc[:,4].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder

labelencoder_x = LabelEncoder()

X[:, 3] = labelencoder_x.fit_transform(X[:,3])

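# The categorical_features argument was removed in scikit-learn 0.22;

# ColumnTransformer one-hot encodes column 3 and passes the other columns through

from sklearn.compose import ColumnTransformer

onehotencoder = ColumnTransformer([('state', OneHotEncoder(), [3])], remainder = 'passthrough', sparse_threshold = 0)

X = onehotencoder.fit_transform(X).astype(float)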

X = X[:, 1:]
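The `X = X[:, 1:]` line drops one of the dummy columns, which is the usual way to avoid the dummy variable trap. The sketch below illustrates why, under stated assumptions: the state labels are made up for illustration (they are not read from 50_Startups.csv), and the encoding is done by hand with NumPy rather than with scikit-learn.

```python
import numpy as np

# Hypothetical state labels, for illustration only (not the 50_Startups data)
states = np.array(["New York", "California", "Florida",
                   "New York", "Florida", "California"])

# One-hot encode by hand: one 0/1 column per distinct state
categories = np.unique(states)
dummies = (states[:, None] == categories[None, :]).astype(float)

intercept = np.ones((len(states), 1))

# Intercept plus all three dummies: the dummies add up to the intercept column,
# so the design matrix is rank-deficient (perfect multicollinearity)
full = np.hstack([intercept, dummies])
rank_full = np.linalg.matrix_rank(full)

# Dropping the first dummy column restores full column rank
reduced = np.hstack([intercept, dummies[:, 1:]])
rank_reduced = np.linalg.matrix_rank(reduced)

print("columns:", full.shape[1], "rank:", rank_full)
print("columns:", reduced.shape[1], "rank:", rank_reduced)
```

With every dummy kept, the rank is lower than the number of columns, so the regression coefficients are not uniquely determined; after dropping one dummy the two numbers match.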

from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()

regressor.fit(X_train,y_train)

# Check the results: predict on the test set

y_pre = regressor.predict(X_test)
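The script stops at the predictions without scoring them. As a minimal sketch of one way to compare `y_pre` against `y_test`, the helper below computes R² and RMSE with NumPy. The function name `r2_and_rmse` and the demo numbers are assumptions made for illustration; they do not come from the original post or from the dataset.

```python
import numpy as np

def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for a set of predictions. Hypothetical helper."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                  # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    return r2, rmse

# Made-up values, for illustration only
y_true_demo = [100.0, 150.0, 200.0, 250.0]
y_pred_demo = [110.0, 140.0, 210.0, 240.0]
r2_demo, rmse_demo = r2_and_rmse(y_true_demo, y_pred_demo)
print(r2_demo, rmse_demo)
```

In the script itself the call would be `r2_and_rmse(y_test, y_pre)`.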

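# Backward elimination

import statsmodels.api as sm  # OLS lives in statsmodels.api; statsmodels.formula.api provides the formula interface (ols)

# axis = 1 appends column-wise; because the column of ones is passed as arr, it becomes the first (leftmost) column and acts as the intercept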

X = np.append(arr = np.ones((50,1)).astype(int),values = X ,axis = 1)

X_opt = X[:,[0,1,2,3,4,5]]

# All in: start with every predictor

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

# Result above: the all-in model

X_opt = X[:,[0,1,3,4,5]]

# Refit OLS without the removed column (backward-elimination step)

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

# Result above: state2 removed

X_opt = X[:,[0,3,4,5]]

# Refit OLS without the removed column (backward-elimination step)

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

# Result above: State3 removed

X_opt = X[:,[0,3,5]]

# Refit OLS without the removed column (backward-elimination step)

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

# Keep removing the predictor with the largest deviation, i.e. the least influence on the result

X_opt = X[:,[0,3]]

# Refit OLS without the removed column (backward-elimination step)

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()
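The manual steps above (fit, read the summary, drop the weakest column, refit) can be wrapped in a loop. The following is only a sketch under stated assumptions, not the method used in this post: it relies on NumPy alone and uses the absolute t-statistic in place of the p-value, which picks the same weakest predictor within a single fit because the p-value falls as |t| grows. The cut-off `t_threshold = 2.0` is an assumed value that roughly corresponds to a 5% significance level at this sample size, and the function name and demo data are hypothetical.

```python
import numpy as np

def backward_elimination(X, y, t_threshold=2.0):
    """Drop the weakest predictor until every remaining |t| >= t_threshold.

    X must already contain the intercept column at index 0; it is never dropped.
    Returns the indices of the columns that were kept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    kept = list(range(X.shape[1]))
    while len(kept) > 1:
        X_opt = X[:, kept]
        n, k = X_opt.shape
        # Ordinary least squares fit
        beta, *_ = np.linalg.lstsq(X_opt, y, rcond=None)
        residuals = y - X_opt @ beta
        sigma2 = residuals @ residuals / (n - k)        # residual variance
        cov = sigma2 * np.linalg.inv(X_opt.T @ X_opt)   # coefficient covariance
        t_stats = beta / np.sqrt(np.diag(cov))
        # Look for the weakest predictor, skipping the intercept at position 0
        abs_t = np.abs(t_stats[1:])
        weakest = int(np.argmin(abs_t)) + 1
        if abs_t[weakest - 1] >= t_threshold:
            break                                       # everything left is significant
        kept.pop(weakest)
    return kept

# Synthetic demo data (not 50_Startups.csv): only column 1 drives y
rng = np.random.default_rng(0)
n_rows = 50
predictors = rng.normal(size=(n_rows, 3))
X_demo = np.hstack([np.ones((n_rows, 1)), predictors])
y_demo = 5.0 + 3.0 * predictors[:, 0] + rng.normal(size=n_rows)

kept_demo = backward_elimination(X_demo, y_demo)
print("kept columns:", kept_demo)
```

Applied to the script, the call would be `backward_elimination(X, Y)` after the column of ones has been added to `X`. Checking each intermediate `summary()` by hand, as the post does, remains useful for seeing why a column was dropped.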
