Python Machine Learning Algorithms, Part 1: Derivation and Practice of Linear Regression

Author: 烟雨丿丶蓝 | Published: 2019-06-15 13:36


Linear regression is the most basic algorithm in machine learning, and most introductions to the field start with it. This section explains it in detail.

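An introductory example

Before explaining linear regression, let us start with an example. Zhang San, Li Si, Wang Wu, and Zhao Liu each apply for a loan. The bank looks at factors such as their monthly salary and the floor area of their home, on the assumption that a higher salary and a larger home justify a larger loan. The table below lists each person's salary, floor area, and approved loan amount:

Name	Salary (yuan)	Floor area (m²)	Loan amount (yuan)
Zhang San	6000	58	30000
Li Si	9000	77	55010
Wang Wu	11000	89	73542
Zhao Liu	15000	54	63201

Given this data, suppose a fifth applicant, Sun Qi, arrives with a salary of 12000 yuan and a floor area of 60 m². Roughly how much can he borrow?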


Exploring the approach

How should we think about this? If we assume that the loan amount depends linearly on both salary and floor area, the first tool that comes to mind is the linear function from school algebra. Its general form is $y = wx + b$, where $x$ is the independent variable, $y$ is the dependent variable, $w$ is the coefficient of the independent variable, and $b$ is the offset. The expression says that $y$ is linear in $x$: $y$ changes linearly as $x$ changes.

Our problem is slightly different, because the loan amount varies linearly with both salary and floor area. If we call the salary $x_1$, the floor area $x_2$, and the loan amount $y$, the relationship can be written as $y = w_1 x_1 + w_2 x_2 + b$. There are now two independent variables, $x_1$ and $x_2$, with coefficients $w_1$ and $w_2$. Writing this as a function and renaming the parameters gives:

$$h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$

This merely rewrites the original expression as a function with different names: the parameters $w$ become $\theta$, and $b$ becomes $\theta_0$. Why rename them? Because $\theta$ is the more widely used convention in machine learning.

How do we solve the problem? We only need to find a set of approximate $\theta$ parameters for which the function fits the existing data. Once the function is fully specified, we substitute Sun Qi's salary and floor area to obtain the amount he can borrow.

Extending the idea

Now suppose the situation is more complex: the loan amount depends not only on salary and floor area but also on current savings, age, and other factors. How should the expression be written? We simply define one variable per factor. For example, savings can be $x_3$ and age $x_4$, and further factors continue the sequence. With $n$ factors in total we define variables up to $x_n$, and the function becomes:

$$h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$$

This is long to write out, so we use summation notation:

$$h_{\theta}(x) = \sum_{i=0}^{n}\theta_i x_i = \theta^T x$$

For this formula to hold we need $x_0 = 1$. No real factor corresponds to $x_0$, because the first factor is $x_1$, the second is $x_2$, and so on, so we simply fix $x_0 = 1$.

The last step rewrites the sum in the vector notation of linear algebra. Here $\theta^T$ is the transpose of the vector $\theta = (\theta_0, \theta_1, \dots, \theta_n)$, and likewise $x = (x_0, x_1, \dots, x_n)$. The vector form is more compact.

That is the basic form of the linear regression hypothesis function. Simple, isn't it?
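The sum $\sum_{i=0}^{n}\theta_i x_i$ can be sketched in a few lines of plain Python. The function name `hypothesis` and the parameter values below are hypothetical, chosen only to make the arithmetic easy to follow; they are not fitted values.

```python
def hypothesis(theta, features):
    """Return h_theta(x) = theta^T x, prepending the constant x_0 = 1."""
    x = [1.0] + list(features)  # x_0 = 1 pairs with the bias theta_0
    if len(theta) != len(x):
        raise ValueError("theta needs exactly one more entry than features")
    return sum(t * xi for t, xi in zip(theta, x))


# Hypothetical parameters (theta_0, theta_1, theta_2), for illustration only.
theta = [1000.0, 4.0, 500.0]
zhang_san = [6000, 58]  # salary and floor area from the table
print(hypothesis(theta, zhang_san))  # 1000 + 4*6000 + 500*58 = 54000.0
```

Prepending the 1 inside the function keeps the caller's feature list free of the artificial $x_0$ entry.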

Setting up the solution

How do we actually solve the problem? Take Zhang San's data and substitute it into the function, still assuming two factors. His data can be written as $x_1^{(1)} = 6000, x_2^{(1)} = 58, y^{(1)} = 30000$. Note the parenthesized number in the superscript: if we treat Zhang San's data as one record, the number is the index of that record, so 1 denotes the first record and 2 the second. This notation is also a convention and will appear often, so it is worth remembering.

We want our function to fit this record, so ideally we would have:

$$y^{(1)} = \sum_{i=0}^{n}\theta_i x_i^{(1)} = \theta^T x^{(1)}$$

Here $y^{(1)}$ is the true value. A perfect fit is rarely achievable, so in general the equation above does not hold: there is some error between the computed value and the true value. To see what the function actually produces, we substitute $x$ into it and obtain the fitted value:

$$h_{\theta}(x^{(1)}) = \sum_{i=0}^{n}\theta_i x_i^{(1)} = \theta^T x^{(1)}$$

Here $h_{\theta}(x^{(1)})$ is the value our function produces. It generally differs from the true value, and the error is usually written as the absolute value or the square of the difference, which removes the influence of the sign. In squared form:

$$(h_{\theta}(x^{(1)}) - y^{(1)})^2$$

This expression measures the gap between the fitted value and the true value.

Likewise, for Li Si's data the error is:

$$(h_{\theta}(x^{(2)}) - y^{(2)})^2$$

In general, for the $i$-th record the error is:

$$(h_{\theta}(x^{(i)}) - y^{(i)})^2$$

For the function to fit all records as well as possible, we add up these errors and average them. With $m$ records in total, the overall error is:

$$J(\theta) = \dfrac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^2$$

Here $i$ indexes the records, running from 1 to $m$. The errors of all records are summed and the sum is divided by $2m$. Our goal is to find the $\theta$ that minimizes $J(\theta)$, that is, the error. In machine learning $J(\theta)$ is called the loss function, and we want the loss to be as small as possible.

You may wonder why the loss function is divided by $2m$ rather than $m$. We will need the derivative of this expression later, and the extra factor of 2 cancels the 2 produced by differentiating the square. Multiplying an expression by a positive constant does not change where its minimum occurs.
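As a rough illustration, $J(\theta)$ can be evaluated on the four records from the table. The helper names and the two candidate parameter vectors below are hypothetical; they only show that a better guess yields a smaller loss.

```python
def hypothesis(theta, features):
    """h_theta(x) with the constant x_0 = 1 prepended."""
    x = [1.0] + list(features)
    return sum(t * xi for t, xi in zip(theta, x))


def loss(theta, samples, targets):
    """J(theta) = 1/(2m) * sum_i (h_theta(x^(i)) - y^(i))^2."""
    m = len(samples)
    total = sum((hypothesis(theta, x) - y) ** 2
                for x, y in zip(samples, targets))
    return total / (2 * m)


# Salary and floor area per applicant, and the matching loan amounts.
samples = [[6000, 58], [9000, 77], [11000, 89], [15000, 54]]
targets = [30000, 55010, 73542, 63201]

theta_zero = [0.0, 0.0, 0.0]   # predicts 0 for everyone
theta_guess = [0.0, 5.0, 0.0]  # hypothetical guess: loan = 5 * salary

print(loss(theta_zero, samples, targets))
print(loss(theta_guess, samples, targets))
```

The second value is much smaller than the first, which is all the loss function is meant to express: it ranks candidate parameter vectors by how well they fit the data.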

Solving

Because this is a linear regression problem, the loss function has a simple, clean shape. With only two parameters, $\theta_1$ and $\theta_2$, we can plot it directly. The loss varies with $\theta_1$ and $\theta_2$ roughly like this:

[Figure: surface plot of the loss function over $\theta_1$ and $\theta_2$]

The plot shows a convex function. The vertical axis is the value of the loss, and the two horizontal axes are $\theta_1$ and $\theta_2$. The loss reaches its minimum at the bottom of the valley, where its partial derivatives with respect to $\theta_1$ and $\theta_2$ are both 0. We can therefore solve for the minimizing $\theta$ in one step by setting the partial derivatives to zero.

So we first compute the partial derivative with respect to each parameter, written $\theta_j$ for the $j$-th component of $\theta$:

$$\begin{aligned}
\dfrac{\partial J(\theta)}{\partial \theta_j} &= \dfrac{1}{2m}\dfrac{\partial\left(\sum_{i=1}^{m}(y^{(i)} - h_{\theta}(x^{(i)}))^2\right)}{\partial \theta_j} \\
&= \dfrac{1}{m}\sum_{i=1}^{m}\left((h_{\theta}(x^{(i)}) - y^{(i)})x_j^{(i)}\right)
\end{aligned}$$
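One way to convince yourself that the partial derivative has this form is to compare it with a finite-difference estimate of the loss. The sketch below does that on a tiny made-up data set; the data, the function names, and the step size `eps` are all assumptions for illustration.

```python
def hypothesis(theta, features):
    x = [1.0] + list(features)
    return sum(t * xi for t, xi in zip(theta, x))


def loss(theta, samples, targets):
    m = len(samples)
    return sum((hypothesis(theta, x) - y) ** 2
               for x, y in zip(samples, targets)) / (2 * m)


def gradient(theta, samples, targets):
    """dJ/dtheta_j = 1/m * sum_i (h_theta(x^(i)) - y^(i)) * x_j^(i)."""
    m = len(samples)
    grad = [0.0] * len(theta)
    for features, y in zip(samples, targets):
        x = [1.0] + list(features)
        error = hypothesis(theta, features) - y
        for j, xj in enumerate(x):
            grad[j] += error * xj / m
    return grad


def numerical_gradient(theta, samples, targets, eps=1e-4):
    """Central finite differences, used only as a sanity check."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += eps
        down[j] -= eps
        diff = loss(up, samples, targets) - loss(down, samples, targets)
        grad.append(diff / (2 * eps))
    return grad


# Made-up toy data with two features per record.
samples = [[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]]
targets = [3.0, 2.0, 5.0]
theta = [0.5, -1.0, 2.0]

print(gradient(theta, samples, targets))
print(numerical_gradient(theta, samples, targets))
```

The two printed lists should agree to several decimal places, since the loss is quadratic in $\theta$ and central differences are very accurate for quadratics.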

Setting the partial derivative to zero lets us solve for $\theta_j$ directly:

$$\sum_{i=1}^{m} h_{\theta}(x^{(i)})x_j^{(i)} - \sum_{i=1}^{m} y^{(i)}x_j^{(i)} = 0$$

Substituting all the data turns this into a system of equations in the $\theta_j$. The equations are coupled, since each one involves every $\theta_j$ through $h_{\theta}$, so they must be solved simultaneously.
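Stacking the records into a matrix $X$ (one row per record, with a leading 1), the coupled equations take the form $X^T X\,\theta = X^T y$, usually called the normal equations. This matrix form is a standard result rather than something stated in the article. The sketch below builds that system for the loan data and solves it with a small hand-written Gaussian elimination; the function names are hypothetical, and in practice a linear algebra library would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; copying keeps the inputs untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_normal_equations(samples, targets):
    """Build X^T X and X^T y (with x_0 = 1) and solve for theta."""
    rows = [[1.0] + list(f) for f in samples]
    n = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(n)]
           for j in range(n)]
    xty = [sum(r[j] * y for r, y in zip(rows, targets)) for j in range(n)]
    return solve_linear_system(xtx, xty)


samples = [[6000, 58], [9000, 77], [11000, 89], [15000, 54]]
targets = [30000, 55010, 73542, 63201]

theta = fit_normal_equations(samples, targets)
sun_qi = [1.0, 12000, 60]  # x_0 = 1, salary, floor area
prediction = sum(t * x for t, x in zip(theta, sun_qi))
print(theta)
print(prediction)
```

The prediction should land close to the value the article later reports from scikit-learn (about 55484.34), since both minimize the same squared error.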

Alternatively, we can use gradient descent, in which each $\theta_j$ is updated step by step:

$$\begin{aligned}
\theta_j &= \theta_j - \alpha\dfrac{\partial J(\theta)}{\partial \theta_j} \\
&= \theta_j - \dfrac{\alpha}{m}\sum_{i=1}^{m}\left((h_{\theta}(x^{(i)}) - y^{(i)})x_j^{(i)}\right)
\end{aligned}$$

Here $\alpha$ is the learning rate. Each step updates every $\theta_j$ to a new value. As gradient descent proceeds, the $\theta_j$ move toward the values that minimize the loss, where the gradient approaches zero, and this yields the solution for $\theta$.
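A minimal sketch of this update rule on the loan data follows. One detail is an addition not covered in the article: the features are standardized first (zero mean, unit standard deviation), because with raw salaries in the thousands a learning rate such as 0.1 would make the updates diverge. The learning rate, the number of steps, and the function names are assumptions.

```python
import math


def standardize(samples):
    """Scale each feature column to zero mean and unit standard deviation."""
    m = len(samples)
    cols = list(zip(*samples))
    means = [sum(c) / m for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / m)
            for c, mu in zip(cols, means)]
    scaled = [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)]
              for row in samples]
    return scaled, means, stds


def gradient_descent(samples, targets, alpha=0.1, steps=2000):
    """Repeat theta_j = theta_j - alpha/m * sum_i (h - y) * x_j for all j."""
    m = len(samples)
    rows = [[1.0] + list(f) for f in samples]  # x_0 = 1
    theta = [0.0] * len(rows[0])
    for _ in range(steps):
        errors = [sum(t * x for t, x in zip(theta, r)) - y
                  for r, y in zip(rows, targets)]
        # Every theta_j is updated from the same error vector.
        theta = [t - alpha / m * sum(e * r[j] for e, r in zip(errors, rows))
                 for j, t in enumerate(theta)]
    return theta


samples = [[6000, 58], [9000, 77], [11000, 89], [15000, 54]]
targets = [30000, 55010, 73542, 63201]

scaled, means, stds = standardize(samples)
theta = gradient_descent(scaled, targets)

# Sun Qi's features must go through the same scaling before prediction.
sun_qi = [(v - mu) / sd for v, mu, sd in zip([12000, 60], means, stds)]
prediction = theta[0] + sum(t * x for t, x in zip(theta[1:], sun_qi))
print(prediction)
```

Because the scaling is undone implicitly at prediction time, the result should agree closely with the closed-form solution; only the values of $\theta$ themselves are expressed in the scaled coordinates.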

That completes the derivation and solution of linear regression.

Hands-on practice

Now let us solve the actual problem with the data above, using Python's Sklearn library.

Sklearn already wraps linear regression, so we can use LinearRegression directly.

Its API is as follows:

<pre class="prettyprint hljs haskell" style="padding: 0.5em; font-family: Menlo, Monaco, Consolas, "Courier New", monospace; color: rgb(68, 68, 68); border-radius: 4px; display: block; margin: 0px 0px 1.5em; font-size: 14px; line-height: 1.5em; word-break: break-all; overflow-wrap: break-word; white-space: pre; background-color: rgb(246, 246, 246); border: none; overflow-x: auto; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">class sklearn.linear_model.LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=None) </pre>

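The parameters are:

fit_intercept: boolean, whether to fit a bias (intercept) term. Defaults to True.
normalize: boolean, whether to normalize the data. Defaults to False. It is ignored when fit_intercept is False; when True, the input data is normalized before fitting. This parameter was deprecated in scikit-learn 1.0 and removed in 1.2, so the signature above applies to the older releases that were current when this article was written.
copy_X: boolean, defaults to True. If True, X is copied and the original stays unchanged; otherwise it may be overwritten.
n_jobs: integer or None (the default). If set, the computation may be spread over multiple cores.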

The attributes are:

coef_: the weight coefficients of x
intercept_: the bias term

The implementation is as follows:

<pre class="prettyprint hljs vim" style="padding: 0.5em; font-family: Menlo, Monaco, Consolas, "Courier New", monospace; color: rgb(68, 68, 68); border-radius: 4px; display: block; margin: 0px 0px 1.5em; font-size: 14px; line-height: 1.5em; word-break: break-all; overflow-wrap: break-word; white-space: pre; background-color: rgb(246, 246, 246); border: none; overflow-x: auto; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">from sklearn.linear_model import LinearRegression

# Each row: [monthly salary (yuan), floor area (m^2)]
x_data = [
    [6000, 58],
    [9000, 77],
    [11000, 89],
    [15000, 54]
]
# Loan amounts (yuan), in the same order as the rows above
y_data = [
    30000, 55010, 73542, 63201
]

lr = LinearRegression()
lr.fit(x_data, y_data)
print('Equation: y={w1}x1+{w2}x2+{b}'.format(w1=round(lr.coef_[0], 2),
                                              w2=round(lr.coef_[1], 2),
                                              b=lr.intercept_))
# Sun Qi: salary 12000 yuan, floor area 60 m^2
x_test = [[12000, 60]]
print('Predicted loan amount:', lr.predict(x_test)[0])
</pre>

Output:

<pre class="prettyprint hljs" style="padding: 0.5em; font-family: Menlo, Monaco, Consolas, "Courier New", monospace; color: rgb(68, 68, 68); border-radius: 4px; display: block; margin: 0px 0px 1.5em; font-size: 14px; line-height: 1.5em; word-break: break-all; overflow-wrap: break-word; white-space: pre; background-color: rgb(246, 246, 246); border: none; overflow-x: auto; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">Equation: y=4.06x1+743.15x2+-37831.862532707615
Predicted loan amount: 55484.33779181102
</pre>

Here we first arrange the data into x_data and y_data, then create a LinearRegression object and call its fit() method to fit the data.

After fitting, the coef_ attribute of LinearRegression holds the weights of the x variables, corresponding to $\theta_1, \theta_2$, and intercept_ is the offset, corresponding to $\theta_0$. This gives us the linear regression expression.

We then call predict() with the new test data to obtain the prediction. The result is 55484.34, so Sun Qi's estimated loan amount is 55484.34 yuan.

That covers the derivation of the linear regression algorithm and how to call an existing implementation.
