L1W2 Introductory Neural Network Programming Exercise -- Deep Learning Notes

Author: Sunooo | Published 2020-06-23 21:59


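This is the first neural network programming exercise I have done, using Python 3.8 and Jupyter Notebook.
The goal of the exercise is to get a first feel for thinking in terms of neural networks.
Before starting, you need to prepare the test data files and the utility module.
Link: https://pan.baidu.com/s/1uX9_MTnaHB8HHdHneOMz3Q Password: 8p3h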

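Test files
datasets contains the test data, and lr_utils is the helper for parsing the image data.
Then install the h5py and matplotlib libraries, plus numpy if you don't already have it.
The training data contains 209 images (72 cats and 137 non-cats) and is stored with the .h5 extension. I looked this format up: it is HDF5, an efficient data storage format. https://www.hdfgroup.org/solutions/hdf5/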
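
To get a feel for the .h5 format, here is a minimal sketch of writing and reading an HDF5 file with h5py. The file name and the dataset names (demo_x, demo_y) are made up for this illustration and are not the names used inside the exercise's data files; the tiny array stands in for real images.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo_images.h5")

# Fake data: 4 "images" of 2x2 pixels with 3 colour channels, plus 4 labels.
images = np.arange(4 * 2 * 2 * 3, dtype=np.uint8).reshape(4, 2, 2, 3)
labels = np.array([1, 0, 0, 1])

# Write both arrays into one HDF5 file as named datasets.
with h5py.File(path, "w") as f:
    f.create_dataset("demo_x", data=images)
    f.create_dataset("demo_y", data=labels)

# Read them back: an open HDF5 file behaves much like a dictionary of arrays.
with h5py.File(path, "r") as f:
    names = sorted(f.keys())
    loaded_x = f["demo_x"][:]  # [:] copies the dataset into a NumPy array
    loaded_y = f["demo_y"][:]

print(names)
print(loaded_x.shape, loaded_y.shape)
```

The same pattern (open the file, list its keys, slice a dataset into an array) can be used to explore the exercise's own .h5 files before calling load_dataset.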

Here is the code.

import numpy as np
import matplotlib.pyplot as plt
import h5py
from lr_utils import load_dataset


# Load the training and test data from the dataset
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

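# Flatten each image: reshape(m, -1) turns the array into m rows, one per image (m = 209 for the training set), and .T then transposes it to 12288 rows and m columns, so each column is one sample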
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T

# Normalize the pixel data: no pixel value exceeds 255, so after dividing the data lies between 0 and 1
train_set_x = train_set_x_flatten / 255 
test_set_x = test_set_x_flatten / 255

# Sigmoid function
def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

# Initialize the parameters w and b to zeros
def initialize_with_zeros(dim):
    w = np.zeros((dim,1))
    b = 0
    return (w, b)

# Propagation function: computes the cost and its gradients
def propagate(w, b, X, Y):
    m = X.shape[1]
    z = np.dot(w.T, X) + b
    A = sigmoid(z)
    
    # Compute the cost
    cost = (-1 / m) * np.sum(Y * np.log(A) + (1 - Y) * (np.log(1 - A)))
    
    dw = (1 / m) * np.dot(X, (A - Y).T)
    db = (1 / m) * np.sum(A - Y)
    
    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    
    grads = {
        "dw" : dw,
        "db" : db
    }
    return (grads, cost)

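# Optimization function: learns w and b by minimizing the cost function J. For a parameter θ the update rule is θ = θ - α dθ, where α is the learning rate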
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    
    costs = []
   
    for i in range(num_iterations):
        
        grads, cost = propagate(w, b, X, Y)
        dw = grads['dw']
        db = grads['db']
        
        # Gradient descent update: move w and b one step against their gradients
        w = w - learning_rate * dw
        b = b - learning_rate * db
    
        if i % 100 == 0:
            costs.append(cost)
            
        if print_cost and (i % 100 == 0):
            print("Iteration: %i, cost: %f" % (i, cost))
            
    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    
    return params, grads, costs
    

# Predict a label (0 or 1) for every sample in X using the learned w and b
def predict(w, b, X):
    
    
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)
    
    A = sigmoid(np.dot(w.T, X) + b)
    
    
    for i in range(A.shape[1]):
        Y_prediction[0, i] = 0 if A[0, i] <= 0.5 else 1
        
        
    assert(Y_prediction.shape == (1, m))
    
    return Y_prediction

# Combine all the functions into a single model

def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    
    w, b = initialize_with_zeros(X_train.shape[0])
    
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    
    w = parameters["w"]
    b = parameters["b"]
    
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
                                       
    d = {"costs": costs,
        "Y_prediction_test": Y_prediction_test,
        "Y_prediction_train": Y_prediction_train,
        "w": w,
        "b": b,
        "learning_rate": learning_rate,
        "num_iterations": num_iterations }
                                       
    return d

# Run this cell to train the model
print("====================Testing model====================")
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)  


# Plot the cost curve
costs = np.squeeze(d["costs"])
plt.plot(costs)
plt.ylabel("cost")
plt.xlabel("iterations (per hundreds)")
plt.title("Learning rate = " + str(d["learning_rate"]))
plt.show()
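
The dw and db formulas in propagate can be sanity-checked against numerical derivatives of the cost. The sketch below is not part of the original exercise: it re-declares the same sigmoid and cost formulas in a helper named cost_and_grads (a name chosen here for illustration) and compares the analytic gradients with central differences on small made-up data.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    """Same formulas as propagate(): cross-entropy cost and its gradients."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = (-1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dw = (1 / m) * np.dot(X, (A - Y).T)
    db = (1 / m) * np.sum(A - Y)
    return cost, dw, db

# Small made-up problem: 3 features, 5 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))
Y = np.array([[1, 0, 1, 0, 1]])
w = rng.normal(size=(3, 1)) * 0.1
b = 0.2

_, dw, db = cost_and_grads(w, b, X, Y)

# Central differences: (J(theta + eps) - J(theta - eps)) / (2 * eps)
eps = 1e-6
num_dw = np.zeros_like(w)
for i in range(w.shape[0]):
    w_plus = w.copy()
    w_minus = w.copy()
    w_plus[i, 0] += eps
    w_minus[i, 0] -= eps
    cost_plus = cost_and_grads(w_plus, b, X, Y)[0]
    cost_minus = cost_and_grads(w_minus, b, X, Y)[0]
    num_dw[i, 0] = (cost_plus - cost_minus) / (2 * eps)

num_db = (cost_and_grads(w, b + eps, X, Y)[0]
          - cost_and_grads(w, b - eps, X, Y)[0]) / (2 * eps)

max_diff = max(np.max(np.abs(dw - num_dw)), abs(db - num_db))
print("largest difference between analytic and numerical gradient:", max_diff)
```

If the largest difference is tiny compared with the gradient values themselves, the analytic formulas agree with the cost function they are supposed to differentiate.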

Output

====================Testing model====================
Iteration: 0, cost: 0.693147
Iteration: 100, cost: 0.584508
Iteration: 200, cost: 0.466949
Iteration: 300, cost: 0.376007
Iteration: 400, cost: 0.331463
Iteration: 500, cost: 0.303273
Iteration: 600, cost: 0.279880
Iteration: 700, cost: 0.260042
Iteration: 800, cost: 0.242941
Iteration: 900, cost: 0.228004
Iteration: 1000, cost: 0.214820
Iteration: 1100, cost: 0.203078
Iteration: 1200, cost: 0.192544
Iteration: 1300, cost: 0.183033
Iteration: 1400, cost: 0.174399
Iteration: 1500, cost: 0.166521
Iteration: 1600, cost: 0.159305
Iteration: 1700, cost: 0.152667
Iteration: 1800, cost: 0.146542
Iteration: 1900, cost: 0.140872
train accuracy: 99.04306220095694 %
test accuracy: 70.0 %
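
Two of the numbers in this output can be checked by hand. At iteration 0, w and b are all zeros, so every sample gets sigmoid(0) = 0.5 and the cost is -log(0.5) = ln 2 whatever the labels are. The training accuracy is consistent with 2 misclassified images out of 209. A small sketch of both checks (the variable names are chosen here for illustration):

```python
import math

# With w = 0 and b = 0, z = 0 for every sample, so the activation is 0.5.
a = 1 / (1 + math.exp(-0))

# Per-sample cost is -[y*log(a) + (1-y)*log(1-a)]; at a = 0.5 it is the
# same for a cat (y = 1) and a non-cat (y = 0), so the average is too.
cost_if_cat = -math.log(a)
cost_if_not_cat = -math.log(1 - a)
initial_cost = "%f" % cost_if_cat
print(initial_cost)

# Accuracy as computed in model(): 100 - mean(|prediction - label|) * 100.
m_train = 209
wrong = 2  # assumed number of misclassified training images
train_accuracy = 100 - wrong / m_train * 100
print(train_accuracy)
```

The first value printed is 0.693147, matching the cost reported at iteration 0.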

Loss curve

The code follows the articles of the two bloggers below. Many thanks to them for sharing so generously.
https://blog.csdn.net/u013733326/article/details/79639509
https://www.kesci.com/home/project/5dd23dbf00b0b900365ecef1

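I ran into many bugs and things I didn't understand while writing the code, so I am recording them here.
1. Why is the sigmoid output classified using the ranges 0-0.5 and 0.5-1, rather than 0-0.8 and 0.8-1?
Because this is an introductory exercise, 0.5 is used as the threshold for now. In real applications the threshold should be chosen according to the actual data, and beginners need not dig into this yet.
2. Why are outputs in the range 0.5-1 treated as cat?
The loss is smallest when the predicted y is close to the actual y. In this dataset a cat image is labeled 1 and a non-cat image is labeled 0, so as the loss decreases, the predicted y for cat images moves closer to 1 and the predicted y for non-cat images moves closer to 0. The class near 1 is therefore taken to be cat.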
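
Both answers can be illustrated with a few lines of plain Python. This is only a sketch on made-up numbers: the probabilities and labels below are invented for the example, and precision and recall are used here as one possible way to compare thresholds, not something the exercise itself computes.

```python
import math

def cross_entropy(y, a):
    """Per-sample loss from the cost function: -[y*log(a) + (1-y)*log(1-a)]."""
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

# Question 2: for a cat (y = 1) the loss shrinks as the prediction a nears 1;
# for a non-cat (y = 0) it shrinks as a nears 0.
for a in (0.1, 0.5, 0.9):
    print("a=%.1f  loss if cat: %.3f  loss if not cat: %.3f"
          % (a, cross_entropy(1, a), cross_entropy(0, a)))

# Question 1: the threshold decides how predicted probabilities become labels.
probs = [0.95, 0.85, 0.70, 0.55, 0.40, 0.10]  # made-up sigmoid outputs
labels = [1, 1, 1, 0, 0, 0]                    # made-up true labels

def classify(probs, threshold):
    # Same rule as predict(): 0 if the output is <= threshold, else 1.
    return [0 if p <= threshold else 1 for p in probs]

def precision_recall(pred, labels):
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

results = {}
for t in (0.5, 0.8):
    pred = classify(probs, t)
    results[t] = (pred, precision_recall(pred, labels))
    print("threshold", t, "predictions", pred, "precision/recall", results[t][1])
```

On this toy data a threshold of 0.8 removes the one false positive but misses one real cat, which is the kind of trade-off that has to be weighed against the actual data when choosing a threshold.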