2.1 The Neuron Model
Each neuron receives input signals through its dendrites and sends output signals along its axon. Vast numbers of interconnected neurons form an enormous neural network, which is the basis of perception and consciousness in the human brain.
Modeling the biological neuron, the psychologist Warren McCulloch and the mathematical logician Walter Pitts proposed a mathematical model of the artificial neuron in 1943. The American psychologist Frank Rosenblatt later developed this work into the perceptron model, a foundation of modern deep learning.
Suppose a neuron receives an input vector $\boldsymbol{x}$ and we want to obtain its output through a linear transformation, $y = \boldsymbol{w}^\top\boldsymbol{x} + b$. For a single scalar input this reduces to

$$y = wx + b \tag{1}$$

Because observed data always contain noise, no single fixed line can pass exactly through all of the observed points $(x^{(i)}, y^{(i)})$.
Instead, we look for good parameters $w, b$ so that the line passes as close as possible to all of the observations, i.e. so that the difference between the prediction $wx^{(i)} + b$ and the observed value $y^{(i)}$ is as small as possible.
For this regression problem the error is measured with the mean squared error (MSE):

$$\mathcal{L} = \frac{1}{n}\sum_{i=1}^{n}\left(wx^{(i)} + b - y^{(i)}\right)^2 \tag{2}$$

where $n$ is the number of observations.
Searching for the parameters $w^*, b^*$ that make the mean squared error smallest is a minimization problem:

$$w^*, b^* = \arg\min_{w,b}\ \frac{1}{n}\sum_{i=1}^{n}\left(wx^{(i)} + b - y^{(i)}\right)^2 \tag{3}$$
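As a concrete (made-up) two-point illustration of formula (2): with observations $(x^{(1)}, y^{(1)}) = (1, 3)$ and $(x^{(2)}, y^{(2)}) = (2, 5)$ and candidate parameters $w = 2$, $b = 0$,

$$\mathcal{L} = \frac{1}{2}\left[(2\cdot 1 + 0 - 3)^2 + (2\cdot 2 + 0 - 5)^2\right] = \frac{1}{2}(1 + 1) = 1.$$

A better choice of $w, b$ would drive this value down.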
2.2 Optimization Method
We optimize the parameters with gradient descent.
The negative gradient is the direction of steepest local descent: by the first-order Taylor expansion, $\mathcal{L}(\theta + \Delta\theta) \approx \mathcal{L}(\theta) + \nabla\mathcal{L}^\top \Delta\theta$, and for a fixed step length the right-hand side is smallest when $\Delta\theta$ points opposite to $\nabla\mathcal{L}$.
Differentiating the loss in formula (2) and stepping along the negative gradient with learning rate $\eta$ gives the iterative update rule:

$$w' = w - \eta\,\frac{\partial \mathcal{L}}{\partial w}, \qquad b' = b - \eta\,\frac{\partial \mathcal{L}}{\partial b} \tag{4}$$
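As a quick sanity check, here is a minimal sketch of one update step of formula (4) on hypothetical toy data (the toy points and all names here are illustrative, not part of the original listing); the gradient expressions are the ones derived in formula (5) below.

import numpy as np

# Hypothetical toy data: three points on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

w, b, lr = 0.0, 0.0, 0.1                   # initial parameters and learning rate

err = w * x + b - y                        # residuals (w*x + b) - y
grad_w = 2 * np.mean(err * x)              # dL/dw, see formula (5)
grad_b = 2 * np.mean(err)                  # dL/db, see formula (5)

w, b = w - lr * grad_w, b - lr * grad_b    # update rule, formula (4)
print(w, b)                                # the MSE is lower after this step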
2.3 Linear Model in Practice
Assume the true model is known to be: $y = 1.477x + 0.089$.
2.3.1 Sampling the Data
To simulate the observation noise in real samples, we add a noise term $\epsilon$ to the model, drawn from a Gaussian distribution with mean 0 and standard deviation 0.01. Drawing 100 random samples gives the training set:
import numpy as np

data = []
for i in range(100):
    x = np.random.uniform(-10., 10.)   # sample x uniformly from [-10, 10)
    eps = np.random.normal(0., 0.01)   # Gaussian noise, mean 0, std 0.01
    y = 1.477 * x + 0.089 + eps        # true model plus noise
    data.append([x, y])
data = np.array(data)                  # shape (100, 2): columns are x and y
print(data)
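A quick scatter plot makes the linear trend obvious (a sketch assuming matplotlib is available; plotting is not part of the original listing):

import matplotlib.pyplot as plt

plt.scatter(data[:, 0], data[:, 1], s=10)   # observed (x, y) pairs
plt.xlabel('x')
plt.ylabel('y')
plt.title('100 noisy samples of y = 1.477x + 0.089')
plt.show()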
Let's see what the data looks like:
[[ -9.60021602 -14.0942134 ]
[ -0.37232041 -0.46292594]
[ 2.63889539 3.98211472]
[ -2.20995294 -3.1684925 ]
[ -0.04546845 0.01545368]
[ 0.12634349 0.2684818 ]
[ -7.60430995 -11.14302551]
[ -8.49308202 -12.46678967]
[ -0.66275991 -0.88510424]
[ 3.81511044 5.71470414]
[ -3.90845005 -5.68882557]
[ 3.89991873 5.85766212]
[ -3.04443718 -4.41259812]
[ -9.45421482 -13.87017124]
[ -7.64294932 -11.20542922]
[ 6.87074876 10.23594637]
[ 7.70975544 11.49358153]
[ 8.77654607 13.04775353]
[ 2.33436715 3.53505862]
[ 9.8611767 14.65773777]
[ 9.15552886 13.60432713]
[ 6.9884293 10.4193793 ]
[ 5.48104386 8.17575369]
[ 5.25800232 7.85917402]
[ -5.66851096 -8.28304106]
[ -6.02664144 -8.80209278]
[ 7.47792335 11.13963929]
[ -2.16694491 -3.12553714]
[ 0.94759025 1.49706415]
[ 3.61814506 5.44155154]
[ -4.19255218 -6.10946919]
[ 8.40233352 12.49780604]
[ 2.3394856 3.52594169]
[ -7.95670992 -11.65781222]
[ -4.43651236 -6.46428767]
[ -2.40778135 -3.47043792]
[ 1.61352247 2.47239909]
[ -3.2139131 -4.67600746]
[ 0.24342091 0.44741657]
[ 7.65956519 11.3957219 ]
[ -5.38876103 -7.86917344]
[ -0.11754912 -0.075521 ]
[ -6.62637309 -9.69615214]
[ 3.81278393 5.70705208]
[ -2.43880832 -3.51337828]
[ 2.76936795 4.18182553]
[ 8.30619718 12.35679355]
[ 4.4054764 6.58460358]
[ -0.02704708 0.0570599 ]
[ -7.67451939 -11.23781726]
[ -6.58862003 -9.63824781]
[ 5.17661677 7.73576713]
[ 9.45833855 14.06221877]
[ 9.27977255 13.8017235 ]
[ 0.14489541 0.30111149]
[ -4.47763542 -6.51811142]
[ -5.22072928 -7.63148031]
[ -4.13325398 -6.00456797]
[ -1.43202328 -2.01357256]
[ 0.67086219 1.10214829]
[ -9.95138599 -14.58170367]
[ -3.76888222 -5.49762671]
[ 3.02976856 4.56590306]
[ 9.02150487 13.41138176]
[ -0.97422585 -1.36951624]
[ -7.38654031 -10.80180177]
[ 0.59919152 0.97292905]
[ 2.42907512 3.68452982]
[ -4.34011924 -6.29745307]
[ -4.12638217 -6.00109115]
[ -6.059121 -8.86941249]
[ 6.86453278 10.23190486]
[ 5.58938332 8.3482495 ]
[ -5.03995397 -7.33498458]
[ 7.63521956 11.36450436]
[ -7.06719594 -10.33646325]
[ 0.87190085 1.37831486]
[ -0.79539418 -1.07751784]
[ -1.68791378 -2.39635162]
[ -6.89121246 -10.09747847]
[ -4.96998712 -7.27386395]
[ 9.91577257 14.7167431 ]
[ 3.76105909 5.64608384]
[ 5.95351563 8.88105529]
[ 8.78018603 13.05369905]
[ -9.63152758 -14.14743805]
[ 8.55143821 12.72273024]
[ 9.9249124 14.74658578]
[ -1.47758889 -2.08876147]
[ -4.76012657 -6.94655042]
[ 5.47545608 8.17698337]
[ -1.11637075 -1.56822168]
[ 9.70595979 14.41497375]
[ 3.41749449 5.14023928]
[ -0.1841253 -0.17868807]
[ 7.55375949 11.23450473]
[ -0.95492939 -1.31760662]
[ -7.5403792 -11.04225878]
[ -3.35376736 -4.87748193]
[ -0.59707783 -0.78675649]]
2.3.2 Computing the Error
Computing the mean squared error of formula (2):
def mse(b, w, points):
    # Mean squared error of formula (2) over all (x, y) points.
    total_error = 0
    for i in range(len(points)):
        x = points[i, 0]
        y = points[i, 1]
        total_error += (y - (w * x + b)) ** 2   # squared residual
    return total_error / len(points)            # average over n points
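Equivalently, the loop can be vectorized with NumPy (a sketch, not part of the original listing; the name mse_vectorized is just for illustration):

def mse_vectorized(b, w, points):
    x, y = points[:, 0], points[:, 1]
    return np.mean((y - (w * x + b)) ** 2)   # same result as the loop above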
2.3.3 Computing the Gradient and Updating the Parameters
Differentiating formula (2) with respect to $w$ and $b$ gives the gradients:

$$\frac{\partial \mathcal{L}}{\partial w} = \frac{2}{n}\sum_{i=1}^{n}\left(wx^{(i)} + b - y^{(i)}\right)x^{(i)}, \qquad \frac{\partial \mathcal{L}}{\partial b} = \frac{2}{n}\sum_{i=1}^{n}\left(wx^{(i)} + b - y^{(i)}\right) \tag{5}$$

Translating formula (5) into code:
def step_gradient(b_current, w_current, points, lr):
    # One gradient-descent step: accumulate the gradients of formula (5),
    # then move w and b along the negative gradient (formula (4)).
    b_gradient = 0
    w_gradient = 0
    n = len(points)
    for i in range(n):
        x = points[i, 0]
        y = points[i, 1]
        err = w_current * x + b_current - y   # residual for this point
        w_gradient += 2 * err * x / n         # contribution to dL/dw
        b_gradient += 2 * err / n             # contribution to dL/db
    new_w = w_current - lr * w_gradient
    new_b = b_current - lr * b_gradient
    return [new_w, new_b]
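The same step can be written without the inner loop (a vectorized sketch equivalent to step_gradient above; the name is hypothetical):

def step_gradient_vectorized(b_current, w_current, points, lr):
    x, y = points[:, 0], points[:, 1]
    err = w_current * x + b_current - y   # residuals for all points at once
    w_gradient = 2 * np.mean(err * x)     # dL/dw, formula (5)
    b_gradient = 2 * np.mean(err)         # dL/db, formula (5)
    return [w_current - lr * w_gradient, b_current - lr * b_gradient]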
def gradient_descent(points, starting_b, starting_w, lr, epochs):
    # Run gradient descent for `epochs` steps from the given starting point.
    b = starting_b
    w = starting_w
    for i in range(epochs):
        w, b = step_gradient(b_current=b, w_current=w, lr=lr, points=np.array(points))
        loss_mse = mse(b=b, w=w, points=points)
        if i % 50 == 0:   # log progress every 50 iterations
            print(f"iteration:{i}, loss:{loss_mse}, w:{w}, b:{b}")
    return [w, b]
The main function:
def main():
    lr = 0.01        # learning rate
    initial_w = 0
    initial_b = 0
    w, b = gradient_descent(data, initial_b, initial_w, lr, 1000)
    loss = mse(b, w, data)
    print(f'Final loss:{loss}, w:{w}, b:{b}')

main()
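Note that the exact numbers below depend on the random samples drawn earlier; seeding NumPy first (e.g. np.random.seed(42), a choice not in the original) would make runs reproducible.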
Run it and look at the results:
iteration:0, loss:8.506605903378881, w:0.9682144436357752, b:0.01135859536407897
iteration:50, loss:0.0007912403463238857, w:1.4770993778407528, b:0.06266569076842714
iteration:100, loss:0.00018243254806051259, w:1.4769274409497026, b:0.07951401064065969
iteration:150, loss:0.00010114583470035741, w:1.4768646150454148, b:0.08567040282174254
iteration:200, loss:9.029260682163581e-05, w:1.4768416583999533, b:0.08791995440758126
iteration:250, loss:8.884350707927863e-05, w:1.4768332700197653, b:0.08874194270665337
iteration:300, loss:8.865002638516676e-05, w:1.4768302048976287, b:0.08904229801660268
iteration:350, loss:8.862419325725976e-05, w:1.476829084899071, b:0.08915204813386905
iteration:400, loss:8.862074407334759e-05, w:1.4768286756505278, b:0.08919215093159227
iteration:450, loss:8.862028354569901e-05, w:1.476828526110719, b:0.08920680453220137
iteration:500, loss:8.862022205703457e-05, w:1.47682847146873, b:0.08921215897186645
iteration:550, loss:8.86202138471997e-05, w:1.4768284515024952, b:0.08921411548923665
iteration:600, loss:8.862021275103868e-05, w:1.4768284442068138, b:0.08921483040255332
iteration:650, loss:8.862021260468194e-05, w:1.4768284415409647, b:0.08921509163256236
iteration:700, loss:8.862021258514166e-05, w:1.476828440566861, b:0.08921518708625732
iteration:750, loss:8.862021258253238e-05, w:1.4768284402109226, b:0.0892152219651287
iteration:800, loss:8.862021258218244e-05, w:1.4768284400808622, b:0.08921523470990159
iteration:850, loss:8.862021258213605e-05, w:1.4768284400333382, b:0.08921523936685373
iteration:900, loss:8.862021258213079e-05, w:1.4768284400159728, b:0.08921524106850857
iteration:950, loss:8.862021258212945e-05, w:1.4768284400096274, b:0.08921524169029488
Final loss:8.862021258213061e-05, w:1.4768284400073362, b:0.08921524191483535
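After 1000 iterations the learned parameters $w \approx 1.4768$ and $b \approx 0.0892$ are very close to the true values $w = 1.477$ and $b = 0.089$, and the final loss ($\approx 8.9\times 10^{-5}$) is on the order of the noise variance $0.01^2 = 10^{-4}$, which is the best an exact linear fit can achieve on noisy data.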
Reference: https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book