Simple Linear Regression
The task for Day 2 is simple linear regression. Let's get started.
Step 1: Data Preprocessing
First we import numpy, pandas, and matplotlib. We use pandas to read the dataset and sklearn to split it into a training set and a test set, with test_size set to one quarter.
The code is as follows:
#Step 1: Data Preprocessing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('../datasets/studentscores.csv')
X = dataset.iloc[:, : 1].values
Y = dataset.iloc[:, 1 ].values
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 1/4, random_state = 0)
print('X_train')
print(X_train)
print('X_test')
print(X_test)
print('Y_train')
print(Y_train)
print('Y_test')
print(Y_test)
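To make the effect of test_size concrete, here is a minimal sketch of the same split on synthetic data. The 20-row array is an assumption standing in for studentscores.csv, which is not reproduced here; only the train_test_split call mirrors the code above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset (assumption): 20 rows, one feature column.
X = np.arange(20, dtype=float).reshape(-1, 1)
Y = 2.0 * X.ravel() + 1.0

# Same call as in Step 1: a quarter of the rows go to the test set,
# and random_state fixes the shuffle so the split is reproducible.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=1/4, random_state=0
)

print(X_train.shape, X_test.shape)
print(Y_train.shape, Y_test.shape)
```

With 20 rows, this leaves 15 rows for training and 5 for testing. Note that X stays two-dimensional (rows by one column), which is the shape sklearn estimators expect for features.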
Step 2: Train the Model with Linear Regression
We fit a linear regression model to the training set.
The code is as follows:
#Step 2: LinearRegression
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor = regressor.fit(X_train, Y_train)
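For a single feature, ordinary least squares has a simple closed form, which may help show what fit is computing. The sketch below is an illustration in plain numpy, not sklearn's internal implementation; fit_simple_linear and the demo arrays are hypothetical names and data introduced here.

```python
import numpy as np

def fit_simple_linear(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = x.mean(), y.mean()
    # slope = covariance(x, y) / variance(x)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical data lying exactly on the line y = 2x + 1.
x_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = 2.0 * x_demo + 1.0
slope, intercept = fit_simple_linear(x_demo, y_demo)
print(slope, intercept)
```

On the fitted regressor from Step 2, the corresponding values can be read from regressor.coef_ and regressor.intercept_.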
Step 3: Predict the Outcome
We can use predict to compute the model's output for the test set, save the result in Y_pred, and print it.
The code is as follows:
#Step 3: Prediction Outcome
Y_pred = regressor.predict(X_test)
print('Y_pred')
print(Y_pred)
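Printing Y_pred shows the raw predictions, but a single number is often easier to compare. As a sketch, the mean squared error and the R² score can be computed directly with numpy. The helper names mse and r_squared and the demo values are assumptions for illustration; in the tutorial, Y_test and Y_pred would be passed instead.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted scores.
y_true_demo = np.array([20.0, 30.0, 40.0, 50.0])
y_pred_demo = np.array([22.0, 28.0, 41.0, 49.0])
print(mse(y_true_demo, y_pred_demo))        # 2.5
print(r_squared(y_true_demo, y_pred_demo))  # approximately 0.98
```

An R² close to 1 means the line explains most of the variation in the test targets; a value near 0 means it does little better than predicting the mean.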
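Step 4: Visualization
As the last step, we use matplotlib to visualize the results. We draw a scatter plot of the training set and another of the test set, each with the fitted regression line, so the model's predictions can be compared against the actual values.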
The code is as follows:
#Step 4: Visualization
plt.scatter(X_train, Y_train, color = 'red')
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.show()
plt.scatter(X_test, Y_test, color = 'red')
plt.plot(X_test, regressor.predict(X_test), color = 'blue')
plt.show()
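The two plots above have no titles or axis labels, so they can be hard to tell apart. One possible variant wraps the plotting in a small helper that adds them. The function name plot_fit, the label text, and the demo data are assumptions, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

def plot_fit(x, y, y_line, title):
    """Scatter the observed points and draw the fitted line on one labeled figure."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    y_line = np.asarray(y_line, dtype=float).ravel()
    order = np.argsort(x)  # draw the line from left to right
    fig, ax = plt.subplots()
    ax.scatter(x, y, color="red", label="observed")
    ax.plot(x[order], y_line[order], color="blue", label="fitted line")
    ax.set_title(title)
    ax.set_xlabel("X (feature)")
    ax.set_ylabel("Y (target)")
    ax.legend()
    return fig

# Hypothetical data scattered around the line y = 2x + 1.
x_demo = np.array([3.0, 1.0, 2.0, 5.0, 4.0])
y_demo = np.array([7.2, 2.9, 5.1, 10.8, 9.0])
fig = plot_fit(x_demo, y_demo, 2.0 * x_demo + 1.0, "Training set")
```

In the tutorial's setting, the call would look like plot_fit(X_train, Y_train, regressor.predict(X_train), "Training set"), followed by plt.show() or fig.savefig(...) to display or save the figure.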