
Chapter 2: Regression Problems (Notes)

Author: 晨光523152 | Published 2019-12-02 21:45

    2.1 The Neuron Model

    Each neuron receives input signals through its dendrites and transmits output signals along its axon. Huge numbers of interconnected neurons form a vast neural network, which is the basis of human perception and consciousness.

    Modeling this biological mechanism, psychologist Warren McCulloch and mathematical logician Walter Pitts proposed a mathematical model of the artificial neuron in 1943. The American psychologist Frank Rosenblatt later developed this work into the perceptron model, a foundation of modern deep learning.

    Suppose a neuron takes an input vector \mathbf{x}=[x_{1},x_{2},x_{3},\ldots,x_{n}]^{T} and produces its output y through a linear transformation: f(\mathbf{x})=\mathbf{w}^{T}\mathbf{x}+b
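
    For instance, a made-up numeric check: with \mathbf{w}=[1,2]^{T}, \mathbf{x}=[3,4]^{T} and b=0.5, the neuron outputs f(\mathbf{x})=1\cdot 3+2\cdot 4+0.5=11.5.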

    However, the observed data, i.e., the vectors \mathbf{x}, always contain measurement error, so no single fixed line (fixed \mathbf{w}, b) can pass through all the observation points.

    We therefore look for a good set of parameters that comes as close as possible to all the observations, making the difference between f(\mathbf{x}) and y (the error) as small as possible.

    We measure the error of a regression problem with the mean squared error:
    L = \frac{1}{n}\sum_{i=1}^{n}(y_{i} - (w x_{i} + b))^{2}\;\;\;\;\;(1)
    where n is the number of observations.

    Searching for parameters w^{*}, b^{*} that minimize the mean squared error amounts to the minimization problem:
    w^{*}, b^{*} = \arg\min_{w,b} \frac{1}{n}\sum_{i=1}^{n}(y_{i} - (w x_{i} + b))^{2} \;\;\;\;\;(2)

    2.2 Optimization Method

    We optimize with gradient descent.
    The negative gradient is the direction of steepest descent (I'd forgotten the exact reason from my optimization course; a quick derivation is below).
    Differentiating Equation (1) and stepping against the gradient gives the update rule:
    \begin{split} w^{'} &= w - \eta\frac{\partial L}{\partial w}\\ b^{'} &= b - \eta \frac{\partial L}{\partial b} \;\;\;\;\;(3) \end{split}
    where \eta is the learning rate.
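
    The refresher (my own recap of a standard result, not from the book): for a unit direction \mathbf{u}, the rate of change of L is the directional derivative \nabla L \cdot \mathbf{u}. By the Cauchy-Schwarz inequality,
    \nabla L \cdot \mathbf{u} \geq -\|\nabla L\|,
    with equality exactly when \mathbf{u} = -\nabla L / \|\nabla L\|, so moving along the negative gradient decreases L fastest.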

    2.3 Linear Model in Practice

    Assume the true model is known to be:
    y = 1.477x + 0.089 \;\;\;\;\;(4)

    2.3.1 Sampling the Data

    To simulate observation noise in real samples, we add an error term \epsilon to the model, drawn from a Gaussian distribution with mean 0 and standard deviation 0.01 (the second argument of np.random.normal below is the standard deviation). Randomly sampling 100 points gives the training set:

    import numpy as np

    data = []  # collect [x, y] pairs
    for i in range(100):
        x = np.random.uniform(-10, 10)    # sample x uniformly from [-10, 10)
        eps = np.random.normal(0, 0.01)   # Gaussian noise, mean 0, std 0.01
        y = 1.477 * x + 0.089 + eps       # true model plus noise
        data.append([x, y])
    data = np.array(data)                 # shape (100, 2)
    print(data)
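
    The same sampling can also be done without the Python loop. A minimal vectorized sketch (my own variant, not from the book), relying on NumPy's vectorized arithmetic:

    x = np.random.uniform(-10, 10, size=100)    # 100 x values at once
    eps = np.random.normal(0, 0.01, size=100)   # matching noise vector
    y = 1.477 * x + 0.089 + eps                 # broadcasted true model plus noise
    data = np.stack([x, y], axis=1)             # shape (100, 2), same layout as above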
    

    Let's see what the data looks like:

    [[ -9.60021602 -14.0942134 ]
     [ -0.37232041  -0.46292594]
     [  2.63889539   3.98211472]
     [ -2.20995294  -3.1684925 ]
     [ -0.04546845   0.01545368]
     [  0.12634349   0.2684818 ]
     [ -7.60430995 -11.14302551]
     [ -8.49308202 -12.46678967]
     [ -0.66275991  -0.88510424]
     [  3.81511044   5.71470414]
     [ -3.90845005  -5.68882557]
     [  3.89991873   5.85766212]
     [ -3.04443718  -4.41259812]
     [ -9.45421482 -13.87017124]
     [ -7.64294932 -11.20542922]
     [  6.87074876  10.23594637]
     [  7.70975544  11.49358153]
     [  8.77654607  13.04775353]
     [  2.33436715   3.53505862]
     [  9.8611767   14.65773777]
     [  9.15552886  13.60432713]
     [  6.9884293   10.4193793 ]
     [  5.48104386   8.17575369]
     [  5.25800232   7.85917402]
     [ -5.66851096  -8.28304106]
     [ -6.02664144  -8.80209278]
     [  7.47792335  11.13963929]
     [ -2.16694491  -3.12553714]
     [  0.94759025   1.49706415]
     [  3.61814506   5.44155154]
     [ -4.19255218  -6.10946919]
     [  8.40233352  12.49780604]
     [  2.3394856    3.52594169]
     [ -7.95670992 -11.65781222]
     [ -4.43651236  -6.46428767]
     [ -2.40778135  -3.47043792]
     [  1.61352247   2.47239909]
     [ -3.2139131   -4.67600746]
     [  0.24342091   0.44741657]
     [  7.65956519  11.3957219 ]
     [ -5.38876103  -7.86917344]
     [ -0.11754912  -0.075521  ]
     [ -6.62637309  -9.69615214]
     [  3.81278393   5.70705208]
     [ -2.43880832  -3.51337828]
     [  2.76936795   4.18182553]
     [  8.30619718  12.35679355]
     [  4.4054764    6.58460358]
     [ -0.02704708   0.0570599 ]
     [ -7.67451939 -11.23781726]
     [ -6.58862003  -9.63824781]
     [  5.17661677   7.73576713]
     [  9.45833855  14.06221877]
     [  9.27977255  13.8017235 ]
     [  0.14489541   0.30111149]
     [ -4.47763542  -6.51811142]
     [ -5.22072928  -7.63148031]
     [ -4.13325398  -6.00456797]
     [ -1.43202328  -2.01357256]
     [  0.67086219   1.10214829]
     [ -9.95138599 -14.58170367]
     [ -3.76888222  -5.49762671]
     [  3.02976856   4.56590306]
     [  9.02150487  13.41138176]
     [ -0.97422585  -1.36951624]
     [ -7.38654031 -10.80180177]
     [  0.59919152   0.97292905]
     [  2.42907512   3.68452982]
     [ -4.34011924  -6.29745307]
     [ -4.12638217  -6.00109115]
     [ -6.059121    -8.86941249]
     [  6.86453278  10.23190486]
     [  5.58938332   8.3482495 ]
     [ -5.03995397  -7.33498458]
     [  7.63521956  11.36450436]
     [ -7.06719594 -10.33646325]
     [  0.87190085   1.37831486]
     [ -0.79539418  -1.07751784]
     [ -1.68791378  -2.39635162]
     [ -6.89121246 -10.09747847]
     [ -4.96998712  -7.27386395]
     [  9.91577257  14.7167431 ]
     [  3.76105909   5.64608384]
     [  5.95351563   8.88105529]
     [  8.78018603  13.05369905]
     [ -9.63152758 -14.14743805]
     [  8.55143821  12.72273024]
     [  9.9249124   14.74658578]
     [ -1.47758889  -2.08876147]
     [ -4.76012657  -6.94655042]
     [  5.47545608   8.17698337]
     [ -1.11637075  -1.56822168]
     [  9.70595979  14.41497375]
     [  3.41749449   5.14023928]
     [ -0.1841253   -0.17868807]
     [  7.55375949  11.23450473]
     [ -0.95492939  -1.31760662]
     [ -7.5403792  -11.04225878]
     [ -3.35376736  -4.87748193]
     [ -0.59707783  -0.78675649]]
    

    2.3.2 Computing the Error

    Computing the mean squared error:

    def mse(b, w, points):
        # mean squared error of the line y = w*x + b over all sample points
        totalerror = 0
        for i in range(len(points)):
            x = points[i, 0]
            y = points[i, 1]
            totalerror += (y - (w * x + b)) ** 2
        return totalerror / len(points)
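
    As a sanity check (my own addition, not from the book), the same quantity can be computed in one vectorized line, assuming points is the (100, 2) array from above:

    def mse_vec(b, w, points):
        # vectorized equivalent of mse(): mean of the squared residuals
        return np.mean((points[:, 1] - (w * points[:, 0] + b)) ** 2)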
    

    2.3.3 Computing the Gradient and Updating

    The gradients of Equation (1) are:
    \begin{split} \frac{\partial L}{\partial w} &= \frac{2}{n}\sum_{i=1}^{n}\left((wx_{i}+b)x_{i}-x_{i}y_{i}\right)=\frac{2}{n}\sum_{i=1}^{n}x_{i}(wx_{i}+b-y_{i})\\ \frac{\partial L}{\partial b} &= \frac{2}{n}\sum_{i=1}^{n}(wx_{i}+b-y_{i})\;\;\;\;\;(5) \end{split}
    Translating Equation (5) into code:

    def step_gradient(b_current, w_current, points, lr):
        # one gradient-descent step: accumulate dL/dw and dL/db per Equation (5)
        b_gradient = 0
        w_gradient = 0
        for i in range(len(points)):
            w_gradient += 2 * points[i, 0] * ((w_current * points[i, 0] + b_current)
                              - points[i, 1]) / len(points)
            b_gradient += 2 * (w_current * points[i, 0] + b_current - points[i, 1]) / len(points)
        new_w = w_current - lr * w_gradient    # step against the gradient, Equation (3)
        new_b = b_current - lr * b_gradient
        return [new_w, new_b]
    
    def gradient_descent(points, starting_b, starting_w, lr, epochs):
        # run `epochs` update steps, logging the loss every 50 iterations
        b = starting_b
        w = starting_w
        for i in range(epochs):
            w, b = step_gradient(b_current=b, w_current=w, lr=lr, points=np.array(points))
            loss_mse = mse(b=b, w=w, points=points)
            if i % 50 == 0:
                print(f"iteration:{i}, loss:{loss_mse}, w:{w}, b:{b}")
        return [w, b]
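
    Since this is one-dimensional least squares, the optimum also has a closed form, which makes a handy cross-check of the gradient-descent result (my own addition, using NumPy's np.polyfit):

    # fit a degree-1 polynomial y = w*x + b by least squares
    w_ls, b_ls = np.polyfit(data[:, 0], data[:, 1], 1)
    print(f"closed-form least squares: w:{w_ls}, b:{b_ls}")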
    

    The main function:

    def main():
        lr = 0.01        # learning rate
        initial_w = 0
        initial_b = 0
        w, b = gradient_descent(data, initial_b, initial_w, lr, 1000)
        loss = mse(b, w, data)
        print(f'Final loss:{loss}, w:{w}, b:{b}')
    
    if __name__ == '__main__':
        main()
    

    Run it and look at the results:

    iteration:0, loss:8.506605903378881, w:0.9682144436357752, b:0.01135859536407897
    iteration:50, loss:0.0007912403463238857, w:1.4770993778407528, b:0.06266569076842714
    iteration:100, loss:0.00018243254806051259, w:1.4769274409497026, b:0.07951401064065969
    iteration:150, loss:0.00010114583470035741, w:1.4768646150454148, b:0.08567040282174254
    iteration:200, loss:9.029260682163581e-05, w:1.4768416583999533, b:0.08791995440758126
    iteration:250, loss:8.884350707927863e-05, w:1.4768332700197653, b:0.08874194270665337
    iteration:300, loss:8.865002638516676e-05, w:1.4768302048976287, b:0.08904229801660268
    iteration:350, loss:8.862419325725976e-05, w:1.476829084899071, b:0.08915204813386905
    iteration:400, loss:8.862074407334759e-05, w:1.4768286756505278, b:0.08919215093159227
    iteration:450, loss:8.862028354569901e-05, w:1.476828526110719, b:0.08920680453220137
    iteration:500, loss:8.862022205703457e-05, w:1.47682847146873, b:0.08921215897186645
    iteration:550, loss:8.86202138471997e-05, w:1.4768284515024952, b:0.08921411548923665
    iteration:600, loss:8.862021275103868e-05, w:1.4768284442068138, b:0.08921483040255332
    iteration:650, loss:8.862021260468194e-05, w:1.4768284415409647, b:0.08921509163256236
    iteration:700, loss:8.862021258514166e-05, w:1.476828440566861, b:0.08921518708625732
    iteration:750, loss:8.862021258253238e-05, w:1.4768284402109226, b:0.0892152219651287
    iteration:800, loss:8.862021258218244e-05, w:1.4768284400808622, b:0.08921523470990159
    iteration:850, loss:8.862021258213605e-05, w:1.4768284400333382, b:0.08921523936685373
    iteration:900, loss:8.862021258213079e-05, w:1.4768284400159728, b:0.08921524106850857
    iteration:950, loss:8.862021258212945e-05, w:1.4768284400096274, b:0.08921524169029488
    Final loss:8.862021258213061e-05, w:1.4768284400073362, b:0.08921524191483535
    

    The recovered parameters, w ≈ 1.4768 and b ≈ 0.0892, are close to the true values 1.477 and 0.089.

    Reference: https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book
