Deep Learning Lecture Notes (14)

By 山岳之心 | Published 2021-02-08 02:53

    3.8 Multiple Inputs: Using NumPy

    Python has a package that makes vector manipulation very convenient: NumPy. We now rewrite our code with NumPy so that many of the vector operations can be carried out by calling NumPy's vector routines directly.

    import numpy as np

    # one weight per input indicator
    weights = np.array([0.05, 0.2, 0.15, 0.3])

    def neural_network(input, weights):
        # prediction = weighted sum (dot product) of the inputs
        return input.dot(weights)

    # dataset: four indicators, five records each
    coin = np.array([4, 6, 7, 3, 9])
    stock = np.array([2.1, -1.3, 3.05, -0.5, 0.88])
    industry = np.array([1.0, 0.2, 2.4, -1.3, -0.22])
    shape = np.array([2, -0.5, 3, -0.9, -0.2])

    # feed the first record into the network
    input = np.array([coin[0], stock[0], industry[0], shape[0]])
    pred = neural_network(input, weights)
    print(pred)
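
    For the first record the prediction is the dot product 0.05*4 + 0.2*2.1 + 0.15*1.0 + 0.3*2 = 1.37, so the script prints 1.37.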
    

    3.9 Predicting Multiple Outputs

    A neural network can also make multiple predictions from a single input. For example, Xiao Ming sees that a certain stock hit its daily limit-up today, and he has a neural-network predictor that estimates how much weight each contributing factor deserves for that limit-up. The diagram and code for this network are shown below.


    [Figure: 多指标预测股价.jpg, predicting a stock move from multiple factors]
    import numpy as np

    # weights connecting the single input to three outputs
    weights = np.array([0.1, 0.6, 0.3])

    def neural_network(input, weights):
        # elementwise product: the one input is scaled by each weight
        return list(input * weights)

    input = 1
    pred = neural_network(input, weights)
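
    With input = 1 the elementwise product simply reproduces the weight vector, so pred is [0.1, 0.6, 0.3]; scaling the input up or down scales all three predictions by the same factor.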
    

    3.10 Predicting with Multiple Inputs and Outputs

    The two kinds of networks discussed above can be nested together to form a network with multiple inputs and multiple outputs.

    Consider an example. Xiao Ming has devised four indicators to predict whether a stock will rise or fall the next day; by the size of the change, the outcome is classified as a large rise, a small rise, a small fall, or a large fall. Help Xiao Ming design a neural network that makes this prediction from the four indicators.

    The topology of this network is four input nodes, each fully connected to all four output nodes.
    [figure missing: the topology image in the original post failed to upload]
    The code for the bare (untrained) network can be written as follows (we will rewrite it with NumPy later in this section):

    # one row of weights per output class
    weights = [[0.1, 0.1, 0.6, 0.2],
               [0.3, 0.4, 0.2, 0.1],
               [0.35, 0.45, 0.1, 0.05],
               [0.05, 0.4, 0.45, 0.1]]

    def neural_network(input, weights):
        # vect_mat_mul is filled in at step 3 below
        pred = vect_mat_mul(input, weights)
        return pred

    Step 2: the input data

    # four indicators, five records each
    index1 = [0.1, 0.4, 0.3, 0.6, 0.2]
    index2 = [0.4, 0.2, 0.3, 0.1, 0.2]
    index3 = [0.3, 0.3, 0.2, 0.2, 0.3]
    index4 = [0.2, 0.1, 0.2, 0.1, 0.3]

    # the first record of each indicator forms the input vector
    input = [index1[0], index2[0], index3[0], index4[0]]

    pred = neural_network(input, weights)
    

    Step 3: fill in the function vect_mat_mul

    def w_sum(a, b):
        # dot product of two equal-length vectors
        assert(len(a) == len(b))
        output = 0
        for i in range(len(a)):
            output += a[i] * b[i]
        return output

    def vect_mat_mul(vect, matrix):
        # one dot product per row of the weight matrix
        assert(len(vect) == len(matrix))
        output = [0] * len(vect)
        for i in range(len(vect)):
            output[i] = w_sum(vect, matrix[i])
        return output
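
    Each output element is the dot product of the input vector with one row of the weight matrix. For the first record the input is [0.1, 0.4, 0.3, 0.2], so the network returns the four predictions [0.27, 0.27, 0.255, 0.32].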
    

    Putting it all together, the code for the whole network is:

    # dot product of two equal-length vectors
    def w_sum(a, b):
        assert(len(a) == len(b))
        output = 0
        for i in range(len(a)):
            output += a[i] * b[i]
        return output

    # vector-matrix multiplication: one dot product per matrix row
    def vect_mat_mul(vect, matrix):
        assert(len(vect) == len(matrix))
        output = [0] * len(vect)
        for i in range(len(vect)):
            output[i] = w_sum(vect, matrix[i])
        return output

    # the bare network: a single weight matrix applied to the input
    def neural_network(input, weights):
        pred = vect_mat_mul(input, weights)
        return pred

    #---------------- weights ---------------#
    weights = [[0.1, 0.1, 0.6, 0.2],
               [0.3, 0.4, 0.2, 0.1],
               [0.35, 0.45, 0.1, 0.05],
               [0.05, 0.4, 0.45, 0.1]]

    #------------- load the dataset ----------#
    index1 = [0.1, 0.4, 0.3, 0.6, 0.2]
    index2 = [0.4, 0.2, 0.3, 0.1, 0.2]
    index3 = [0.3, 0.3, 0.2, 0.2, 0.3]
    index4 = [0.2, 0.1, 0.2, 0.1, 0.3]

    #----------------- input -----------------#
    input = [index1[0], index2[0], index3[0], index4[0]]
    pred = neural_network(input, weights)
    print(pred)
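
    As promised above, the same network can be rewritten with NumPy so that the whole vector-matrix multiplication becomes a single call. A minimal sketch, using the same weights and the first record as input:

    import numpy as np

    # the same 4x4 weight matrix as above, one row per output
    weights = np.array([[0.1, 0.1, 0.6, 0.2],
                        [0.3, 0.4, 0.2, 0.1],
                        [0.35, 0.45, 0.1, 0.05],
                        [0.05, 0.4, 0.45, 0.1]])

    def neural_network(input, weights):
        # matrix-vector product: each output is the dot of one weight row with the input
        return weights.dot(input)

    # first record of the four indicators
    input = np.array([0.1, 0.4, 0.3, 0.2])
    pred = neural_network(input, weights)
    print(pred)   # roughly [0.27, 0.27, 0.255, 0.32], matching vect_mat_mul above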
    
    

    If the double-slit interference experiment opened the door to quantum mechanics, the most important theory of the physical world, then the single-layer neural network we have just written is the "double-slit experiment" of deep learning. And the resemblance is not only in significance: the two are computed in remarkably similar ways.

    Just as we can insert more slits and more plates into the double-slit setup, neural networks can themselves be stacked. For example, consider the two-layer stacked network shown in the figure below (strictly speaking only hidden layers 1 and 2 count as layers of the network; the input and output layers do not count toward the depth). The Graphviz (Dot) code that draws it is given next, and the rendered figure follows:

    
    digraph G {
            rankdir=LR
            splines=line
            nodesep=.3;
            ranksep = 0.9;        
            node [fontname="Inconsolata, Consolas", fontsize=13, penwidth=0.5];
            
            subgraph cluster_0 {
            color=white;
                    node [style=solid,color=blue4, shape=circle];
            x2 x3 x1;
            label = "Input layer";
        }
    
        subgraph cluster_1 {
            color=white;
            node [style=solid,color=red2, shape=circle];
            a42 a52 a12 a22 a32;
            label = "Hidden layer 1";
        }
    
        subgraph cluster_2 {
            color=white;
            node [style=solid,color=red2, shape=circle];
            a43 a53 a13 a23 a33;
            label = "Hidden layer 2";
        }
    
        subgraph cluster_3 {
            color=white;
            node [style=solid,color=seagreen2, shape=circle];
            O1 O2 O3 O4;
            label="Output layer";
        }
    
            x1 -> a12 [arrowhead=vee]
            x1 -> a22 [arrowhead=vee]
            x1 -> a32 [arrowhead=vee]
            x1 -> a42 [arrowhead=vee]
            x1 -> a52 [arrowhead=vee]
    
            x2 -> a12 [arrowhead=vee]
            x2 -> a22 [arrowhead=vee]
            x2 -> a32 [arrowhead=vee]
            x2 -> a42 [arrowhead=vee]
            x2 -> a52 [arrowhead=vee]
     
            x3 -> a12 [arrowhead=vee]
            x3 -> a22 [arrowhead=vee]
            x3 -> a32 [arrowhead=vee]
            x3 -> a42 [arrowhead=vee]
            x3 -> a52 [arrowhead=vee]
    
            a12 -> a13 [arrowhead=vee]
            a22 -> a13 [arrowhead=vee]
            a32 -> a13 [arrowhead=vee]
            a42 -> a13 [arrowhead=vee]
            a52 -> a13 [arrowhead=vee]
    
            a12 -> a23 [arrowhead=vee]
            a22 -> a23 [arrowhead=vee]
            a32 -> a23 [arrowhead=vee]
            a42 -> a23 [arrowhead=vee]
            a52 -> a23 [arrowhead=vee]
    
            a12 -> a33 [arrowhead=vee]
            a22 -> a33 [arrowhead=vee]
            a32 -> a33 [arrowhead=vee]
            a42 -> a33 [arrowhead=vee]
            a52 -> a33 [arrowhead=vee]
    
            a12 -> a43 [arrowhead=vee]
            a22 -> a43 [arrowhead=vee]
            a32 -> a43 [arrowhead=vee]
            a42 -> a43 [arrowhead=vee]
            a52 -> a43 [arrowhead=vee]
    
            a12 -> a53 [arrowhead=vee]
            a22 -> a53 [arrowhead=vee]
            a32 -> a53 [arrowhead=vee]
            a42 -> a53 [arrowhead=vee]
            a52 -> a53 [arrowhead=vee]
    
            a13 -> O1 [arrowhead=vee]
            a23 -> O1 [arrowhead=vee]
            a33 -> O1 [arrowhead=vee]
            a43 -> O1 [arrowhead=vee]
            a53 -> O1 [arrowhead=vee]
    
            a13 -> O2 [arrowhead=vee]
            a23 -> O2 [arrowhead=vee]
            a33 -> O2 [arrowhead=vee]
            a43 -> O2 [arrowhead=vee]
            a53 -> O2 [arrowhead=vee]
    
            a13 -> O3 [arrowhead=vee]
            a23 -> O3 [arrowhead=vee]
            a33 -> O3 [arrowhead=vee]
            a43 -> O3 [arrowhead=vee]
            a53 -> O3 [arrowhead=vee]
    
            a13 -> O4 [arrowhead=vee]
            a23 -> O4 [arrowhead=vee]
            a33 -> O4 [arrowhead=vee]
            a43 -> O4 [arrowhead=vee]
            a53 -> O4 [arrowhead=vee]
    }
    
    [Figure: 双层神经网络.jpg, the two-layer stacked neural network]
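
    To make the stacking concrete, here is a minimal NumPy sketch of a forward pass through the 3-5-5-4 network drawn above. The weight matrices and bias vectors are random placeholders chosen only to match the layer sizes in the figure; they are not trained values.

    import numpy as np

    rng = np.random.default_rng(0)

    # placeholder weights and biases shaped to match the figure: 3 -> 5 -> 5 -> 4
    W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)   # input layer -> hidden layer 1
    W2, b2 = rng.normal(size=(5, 5)), rng.normal(size=5)   # hidden layer 1 -> hidden layer 2
    W3, b3 = rng.normal(size=(4, 5)), rng.normal(size=4)   # hidden layer 2 -> output layer

    def neural_network(x):
        # each layer is a vector-matrix multiplication plus a bias (intercept) vector
        h1 = W1.dot(x) + b1
        h2 = W2.dot(h1) + b2
        return W3.dot(h2) + b3

    x = np.array([0.1, 0.4, 0.3])   # three inputs, as in the figure
    print(neural_network(x))        # four outputs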

    In the network above there are many mappings, and every mapping contributes at least one parameter. In deep learning such a mapping is usually called an activation function. When all of the mappings are linear, mappings and parameters correspond one to one, and the number of linear parameters is 4*5 + 6*5 + 6*4 = 74 (think about why it is not 60; hint: the intercept, i.e. the constant bias term, also contributes). In more complicated settings the mappings can be polynomial, and the number of parameters grows further.
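
    As a quick check of that count, each node has one weight per incoming connection plus one bias, so the per-layer totals are (3+1)*5, (5+1)*5 and (5+1)*4. A two-line script confirms the sum:

    # parameters per layer = (inputs + 1 bias) * nodes in that layer
    layers = [3, 5, 5, 4]
    total = sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))
    print(total)   # 74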
