
Chapter 2: Regression Problems (Notes)

Author: 晨光523152 | Published 2019-12-02 21:45

2.1 The Neuron Model

Each neuron receives input signals through its dendrites and passes output signals along its axon. Vast numbers of interconnected neurons form enormous neural networks, which give rise to perception and consciousness in the human brain.

Modeling the biological neuron, the psychologist Warren McCulloch and the mathematical logician Walter Pitts proposed, in 1943, a mathematical model of an artificial neural network that mimics the mechanism of biological neurons. This work was later developed by the American psychologist Frank Rosenblatt into the perceptron model, a foundation of modern deep learning.

Suppose the neuron's input is a vector $\mathbf{x}=[x_{1},x_{2},x_{3},\dots,x_{n}]^{T}$ and we want to obtain the output $y$ through a linear transformation, i.e. $f(\mathbf{x})=\mathbf{w}^{T}\mathbf{x} + b$.
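As a minimal sketch (the input and weight values below are made up for illustration), this linear transformation can be computed directly with NumPy:

import numpy as np

# Single neuron's linear output f(x) = w^T x + b for illustrative values.
x = np.array([1.0, 2.0, 3.0])    # input vector
w = np.array([0.5, -0.2, 0.1])   # weights
b = 0.3                          # bias
y = np.dot(w, x) + b             # w^T x + b
print(y)                         # 0.7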

However, the observed data, i.e. the vectors $\mathbf{x}$, always contain errors, so no single fixed line (a fixed $\mathbf{w}$ and $b$) can pass through all the observed points.

We therefore look for good parameters whose line comes as close as possible to all the observations, making the difference between $f(\mathbf{x})$ and $y$ as small as possible (i.e. making the error as small as possible).

The mean squared error is used to measure the error of the regression problem, as in the following formula:
L = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_{i} - (w x_{i} + b)\bigr)^{2}\;\;\;\;\;(1)
where $n$ is the number of observations.

We search for parameters $w^{*}, b^{*}$ that minimize the mean squared error, which amounts to solving the minimization problem:
w^{*}, b^{*} = \arg\min_{w,b} \frac{1}{n}\sum_{i=1}^{n}\bigl(y_{i} - (w x_{i} + b)\bigr)^{2} \;\;\;\;\;(2)
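Problem (2) is ordinary least squares, so for one-dimensional data it also has a closed-form solution. As an optional cross-check (not part of the original notes; the x and y arrays below are illustrative), NumPy's polyfit fits the same line directly:

import numpy as np

# Least-squares fit of y ≈ w*x + b as a degree-1 polynomial.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.6, 3.1, 4.6])    # exactly y = 1.5x + 0.1
w_star, b_star = np.polyfit(x, y, 1)  # returns [slope, intercept]
print(w_star, b_star)                 # ≈ 1.5, 0.1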

2.2 Optimization Method

We optimize with gradient descent.
The negative gradient is the locally steepest descent direction: to first order, a small step $-\eta\nabla L$ changes the loss by about $-\eta\|\nabla L\|^{2}\le 0$, so the loss does not increase for a sufficiently small learning rate $\eta$.
Differentiating Equation (1) gives the iterative update:
\begin{split} w^{'} &= w- \eta\frac{\partial L}{\partial w}\\ b^{'} &= b - \eta \frac{\partial L}{\partial b} \;\;\;\;\;(3) \end{split}
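A minimal sketch of the update rule (3), using a toy loss $L(w) = (w - 3)^{2}$ and an arbitrary learning rate (both chosen only for illustration), shows the iterates moving toward the minimizer $w = 3$:

# Gradient descent on L(w) = (w - 3)^2, whose gradient is dL/dw = 2*(w - 3).
w = 0.0      # initial guess
eta = 0.1    # learning rate
for _ in range(50):
    grad = 2 * (w - 3)
    w = w - eta * grad
print(w)     # close to 3.0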

2.3 Linear Model in Practice

Assume the true model is known to be:
y = 1.477x + 0.089 \;\;\;\;\;(4)

2.3.1 Sampling Data

To simulate the observation noise of real samples, we add a noise term $\epsilon$ to the model, drawn from a Gaussian distribution with mean 0 and standard deviation 0.01 (the scale argument passed to np.random.normal below). Sampling 100 times gives the training set:

import numpy as np

data = []
for i in range(100):
    x = np.random.uniform(-10, 10)    # sample x uniformly from [-10, 10]
    eps = np.random.normal(0, 0.01)   # Gaussian noise, mean 0, std 0.01
    y = 1.477 * x + 0.089 + eps       # true model plus noise
    data.append([x, y])
data = np.array(data)                 # shape (100, 2): columns are x and y
print(data)

Let's see what the data looks like:

[[ -9.60021602 -14.0942134 ]
 [ -0.37232041  -0.46292594]
 [  2.63889539   3.98211472]
 [ -2.20995294  -3.1684925 ]
 [ -0.04546845   0.01545368]
 [  0.12634349   0.2684818 ]
 [ -7.60430995 -11.14302551]
 [ -8.49308202 -12.46678967]
 [ -0.66275991  -0.88510424]
 [  3.81511044   5.71470414]
 [ -3.90845005  -5.68882557]
 [  3.89991873   5.85766212]
 [ -3.04443718  -4.41259812]
 [ -9.45421482 -13.87017124]
 [ -7.64294932 -11.20542922]
 [  6.87074876  10.23594637]
 [  7.70975544  11.49358153]
 [  8.77654607  13.04775353]
 [  2.33436715   3.53505862]
 [  9.8611767   14.65773777]
 [  9.15552886  13.60432713]
 [  6.9884293   10.4193793 ]
 [  5.48104386   8.17575369]
 [  5.25800232   7.85917402]
 [ -5.66851096  -8.28304106]
 [ -6.02664144  -8.80209278]
 [  7.47792335  11.13963929]
 [ -2.16694491  -3.12553714]
 [  0.94759025   1.49706415]
 [  3.61814506   5.44155154]
 [ -4.19255218  -6.10946919]
 [  8.40233352  12.49780604]
 [  2.3394856    3.52594169]
 [ -7.95670992 -11.65781222]
 [ -4.43651236  -6.46428767]
 [ -2.40778135  -3.47043792]
 [  1.61352247   2.47239909]
 [ -3.2139131   -4.67600746]
 [  0.24342091   0.44741657]
 [  7.65956519  11.3957219 ]
 [ -5.38876103  -7.86917344]
 [ -0.11754912  -0.075521  ]
 [ -6.62637309  -9.69615214]
 [  3.81278393   5.70705208]
 [ -2.43880832  -3.51337828]
 [  2.76936795   4.18182553]
 [  8.30619718  12.35679355]
 [  4.4054764    6.58460358]
 [ -0.02704708   0.0570599 ]
 [ -7.67451939 -11.23781726]
 [ -6.58862003  -9.63824781]
 [  5.17661677   7.73576713]
 [  9.45833855  14.06221877]
 [  9.27977255  13.8017235 ]
 [  0.14489541   0.30111149]
 [ -4.47763542  -6.51811142]
 [ -5.22072928  -7.63148031]
 [ -4.13325398  -6.00456797]
 [ -1.43202328  -2.01357256]
 [  0.67086219   1.10214829]
 [ -9.95138599 -14.58170367]
 [ -3.76888222  -5.49762671]
 [  3.02976856   4.56590306]
 [  9.02150487  13.41138176]
 [ -0.97422585  -1.36951624]
 [ -7.38654031 -10.80180177]
 [  0.59919152   0.97292905]
 [  2.42907512   3.68452982]
 [ -4.34011924  -6.29745307]
 [ -4.12638217  -6.00109115]
 [ -6.059121    -8.86941249]
 [  6.86453278  10.23190486]
 [  5.58938332   8.3482495 ]
 [ -5.03995397  -7.33498458]
 [  7.63521956  11.36450436]
 [ -7.06719594 -10.33646325]
 [  0.87190085   1.37831486]
 [ -0.79539418  -1.07751784]
 [ -1.68791378  -2.39635162]
 [ -6.89121246 -10.09747847]
 [ -4.96998712  -7.27386395]
 [  9.91577257  14.7167431 ]
 [  3.76105909   5.64608384]
 [  5.95351563   8.88105529]
 [  8.78018603  13.05369905]
 [ -9.63152758 -14.14743805]
 [  8.55143821  12.72273024]
 [  9.9249124   14.74658578]
 [ -1.47758889  -2.08876147]
 [ -4.76012657  -6.94655042]
 [  5.47545608   8.17698337]
 [ -1.11637075  -1.56822168]
 [  9.70595979  14.41497375]
 [  3.41749449   5.14023928]
 [ -0.1841253   -0.17868807]
 [  7.55375949  11.23450473]
 [ -0.95492939  -1.31760662]
 [ -7.5403792  -11.04225878]
 [ -3.35376736  -4.87748193]
 [ -0.59707783  -0.78675649]]
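To visualize the sampled points, a quick scatter plot can be drawn (a sketch assuming matplotlib is installed):

import matplotlib.pyplot as plt

# The 100 sampled (x, y) pairs lie almost exactly on y = 1.477x + 0.089
# because the added noise is tiny.
plt.scatter(data[:, 0], data[:, 1], s=10)
plt.xlabel('x')
plt.ylabel('y')
plt.show()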

2.3.2 Computing the Error

Computing the mean squared error:

def mse(b, w, points):
    # Average squared difference between y and the prediction w*x + b.
    totalerror = 0
    for i in range(len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalerror += (y - (w * x + b)) ** 2
    return totalerror / len(points)
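The same error can also be computed without the Python loop. The vectorized version below is an alternative sketch (the function name is not from the original notes) that relies on points being a NumPy array:

def mse_vectorized(b, w, points):
    # points[:, 0] are the inputs x, points[:, 1] are the targets y.
    x, y = points[:, 0], points[:, 1]
    return np.mean((y - (w * x + b)) ** 2)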

2.3.3 Computing the Gradient and Updating the Parameters

The gradients are computed as follows:
\begin{split} \frac{\partial L}{\partial w} &= \frac{2}{n}\sum_{i=1}^{n}\bigl((wx_{i}+b)x_{i}-x_{i}y_{i}\bigr)=\frac{2}{n}\sum_{i=1}^{n}x_{i}(wx_{i}+b - y_{i})\\ \frac{\partial L}{\partial b} &= \frac{2}{n}\sum_{i=1}^{n}(wx_{i} + b - y_{i})\;\;\;\;\;(5) \end{split}
Based on Equation (5), the code is:

def step_gradient(b_current, w_current, points, lr):
    # One gradient-descent step: accumulate the gradients from Equation (5),
    # then move w and b against them by the learning rate lr.
    b_gradient = 0
    w_gradient = 0
    for i in range(len(points)):
        w_gradient += 2 * points[i, 0] * ((w_current * points[i, 0] + b_current)
                          - points[i, 1]) / len(points)
        b_gradient += 2 * (w_current * points[i, 0] + b_current - points[i, 1]) / len(points)
    new_w = w_current - lr * w_gradient
    new_b = b_current - lr * b_gradient
    return [new_w, new_b]

def gradient_descent(points, starting_b, starting_w, lr, epochs):
    # Run epochs steps of gradient descent, logging the loss every 50 steps.
    b = starting_b
    w = starting_w
    for i in range(epochs):
        w, b = step_gradient(b_current=b, w_current=w, lr=lr, points=np.array(points))
        loss_mse = mse(b=b, w=w, points=points)
        if i % 50 == 0:
            print(f"iteration:{i}, loss:{loss_mse}, w:{w}, b:{b}")
    return [w, b]
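To check that Equation (5) was implemented correctly, the analytic gradients can be compared against a finite-difference estimate. The helper below is a sketch; its name and step size are not from the original notes:

def numeric_gradient(b, w, points, h=1e-6):
    # Central-difference estimates of dL/dw and dL/db around (w, b).
    dw = (mse(b, w + h, points) - mse(b, w - h, points)) / (2 * h)
    db = (mse(b + h, w, points) - mse(b - h, w, points)) / (2 * h)
    return dw, db

# Example check at w = 0, b = 0 (arbitrary values):
# print(numeric_gradient(0.0, 0.0, data))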

The main function:

def main():
    lr = 0.01
    initial_w = 0
    initial_b = 0
    w, b = gradient_descent(data, initial_b, initial_w, lr, 1000)
    loss = mse(b, w, data)
    print(f'Final loss:{loss}, w:{w}, b:{b}')

main()

Run it and look at the results:

iteration:0, loss:8.506605903378881, w:0.9682144436357752, b:0.01135859536407897
iteration:50, loss:0.0007912403463238857, w:1.4770993778407528, b:0.06266569076842714
iteration:100, loss:0.00018243254806051259, w:1.4769274409497026, b:0.07951401064065969
iteration:150, loss:0.00010114583470035741, w:1.4768646150454148, b:0.08567040282174254
iteration:200, loss:9.029260682163581e-05, w:1.4768416583999533, b:0.08791995440758126
iteration:250, loss:8.884350707927863e-05, w:1.4768332700197653, b:0.08874194270665337
iteration:300, loss:8.865002638516676e-05, w:1.4768302048976287, b:0.08904229801660268
iteration:350, loss:8.862419325725976e-05, w:1.476829084899071, b:0.08915204813386905
iteration:400, loss:8.862074407334759e-05, w:1.4768286756505278, b:0.08919215093159227
iteration:450, loss:8.862028354569901e-05, w:1.476828526110719, b:0.08920680453220137
iteration:500, loss:8.862022205703457e-05, w:1.47682847146873, b:0.08921215897186645
iteration:550, loss:8.86202138471997e-05, w:1.4768284515024952, b:0.08921411548923665
iteration:600, loss:8.862021275103868e-05, w:1.4768284442068138, b:0.08921483040255332
iteration:650, loss:8.862021260468194e-05, w:1.4768284415409647, b:0.08921509163256236
iteration:700, loss:8.862021258514166e-05, w:1.476828440566861, b:0.08921518708625732
iteration:750, loss:8.862021258253238e-05, w:1.4768284402109226, b:0.0892152219651287
iteration:800, loss:8.862021258218244e-05, w:1.4768284400808622, b:0.08921523470990159
iteration:850, loss:8.862021258213605e-05, w:1.4768284400333382, b:0.08921523936685373
iteration:900, loss:8.862021258213079e-05, w:1.4768284400159728, b:0.08921524106850857
iteration:950, loss:8.862021258212945e-05, w:1.4768284400096274, b:0.08921524169029488
Final loss:8.862021258213061e-05, w:1.4768284400073362, b:0.08921524191483535
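The fitted parameters are very close to the true ones ($w \approx 1.4768$ vs. 1.477, $b \approx 0.0892$ vs. 0.089). As an illustrative check (the test input below is made up), they can be used to predict a new point:

w, b = 1.4768284400073362, 0.08921524191483535
x_test = 5.0
print(w * x_test + b)            # fitted model: ≈ 7.4734
print(1.477 * x_test + 0.089)    # true model:   7.474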

Reference: https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book
