
Some Notes on the Matlab Programming Mindset

Author: DingHaoqing | Published 2016-06-21 00:03


    Vectorized programming

    Basic approach:

    Combine forward and backward reasoning, vectorize the computation, and break the problem into blocks.

    Programming steps

    1. Work out how the program works, and rewrite the mathematical formulas as matrix operations.
    2. Lay out the program framework as a sequence of steps and handle them one at a time.
    3. When writing the code for each step, reason backwards: start from the goal and work back through each block, solving block by block.
    4. Throughout, keep the code vectorized: use matrix operations, watch the matrix dimensions, and prefer whole-matrix operations over per-vector (per-column) computation; see the sketch after this list.
    5. Optionally, write out pseudocode for the algorithm.
    6. Translate the pseudocode into Matlab, using as little code as possible while keeping an eye on time complexity and memory usage.
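
    To illustrate point 4, here is a minimal sketch (not from the original post; the variable names are made up) comparing a loop over examples with a single vectorized matrix operation. Both compute a sigmoid of an affine map for every column of a data matrix, which is exactly the pattern used in the forward propagation further below.

    % Minimal vectorization sketch; sizes chosen to match the example below.
    W = randn(25, 64);           % weights: 25 hidden units x 64 visible units
    b = randn(25, 1);            % bias vector
    X = rand(64, 10000);         % each column is one training example

    % Loop version: one example (column) at a time.
    A_loop = zeros(25, 10000);
    for i = 1:10000
        A_loop(:, i) = 1 ./ (1 + exp(-(W * X(:, i) + b)));
    end

    % Vectorized version: one matrix product handles all examples at once.
    A_vec = 1 ./ (1 + exp(-(W * X + repmat(b, 1, 10000))));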

    Example: computing the cost function and gradients for the sparse autoencoder

    This example is based on the sparse autoencoder exercise (Exercise 1) of the UFLDL tutorial.

    1. Program principle, i.e. the formulas for the cost function and the gradients (written out after this list for reference).
    2. Program steps:
      The cost function consists of three parts: the mean squared error term, the weight decay term, and the sparsity penalty term.
      The gradient computation covers the gradients with respect to W1, W2, b1 and b2.
    3. Solve block by block (following the vectorized formulas):
    • Mean squared error term:
      1. Compute $h_{W,b}$ (with f the activation function)
      2. Compute the layer-3 activations (from W2, W1, b1, b2)
      3. Compute the layer-2 activations
    • Weight decay term:
      computed from W2 and W1; this part is straightforward
    • Sparsity penalty term:
      1. Compute rho, the average activation of each hidden unit
      2. Layer-2 activations
      3. Compute the penalty term
    • Gradient computation:
      1. Take the gradient with respect to W1 as an example: it consists of two terms, one involving the error (delta) terms and one involving lambda
      2. Compute the error terms and the sparsity penalty gradient
    4. Write pseudocode for the algorithm. The point of this step is to clarify the structure of the Matlab program: for a fairly complex algorithm, pseudocode makes the implementation easier, while for a simple one this step can be skipped.
    5. Translate the pseudocode into a Matlab program.
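
    For reference, the formulas from step 1 can be written out as follows, in the standard UFLDL notation (this block is an addition to the original post): $m$ is the number of training examples, $\hat\rho_j$ the average activation of hidden unit $j$, and $\rho$ is sparsityParam.

    $$ J(W,b) = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\bigl\|h_{W,b}(x^{(i)})-x^{(i)}\bigr\|^2 + \frac{\lambda}{2}\sum_{l=1}^{2}\sum_{i,j}\bigl(W^{(l)}_{ij}\bigr)^2 + \beta\sum_{j}\Bigl[\rho\log\tfrac{\rho}{\hat\rho_j}+(1-\rho)\log\tfrac{1-\rho}{1-\hat\rho_j}\Bigr] $$

    $$ \delta^{(3)} = (a^{(3)}-x)\odot a^{(3)}\odot(1-a^{(3)}), \qquad \delta^{(2)} = \Bigl((W^{(2)})^{T}\delta^{(3)} + \beta\bigl(-\tfrac{\rho}{\hat\rho}+\tfrac{1-\rho}{1-\hat\rho}\bigr)\Bigr)\odot a^{(2)}\odot(1-a^{(2)}) $$

    $$ \nabla_{W^{(1)}}J = \tfrac{1}{m}\,\delta^{(2)}x^{T} + \lambda W^{(1)}, \qquad \nabla_{b^{(1)}}J = \tfrac{1}{m}\textstyle\sum_i \delta^{(2)}, \qquad \text{and similarly for } W^{(2)}, b^{(2)} \text{ using } \delta^{(3)} \text{ and } a^{(2)}. $$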

    Matlab code

    function [cost,grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                                 lambda, sparsityParam, beta, data)
    % visibleSize: the number of input units (probably 64) 
    % hiddenSize: the number of hidden units (probably 25) 
    % lambda: weight decay parameter
    % sparsityParam: The desired average activation for the hidden units (denoted in the lecture
    %                           notes by the greek alphabet rho, which looks like a lower-case "p").
    % beta: weight of sparsity penalty term
    % data: Our 64x10000 matrix containing the training data.  So, data(:,i) is the i-th training example. 
      
    % The input theta is a vector (because minFunc expects the parameters to be a vector). 
    % We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this 
    % follows the notation convention of the lecture notes. 
    
    W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize); % hiddenSize rows, visibleSize columns
    W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
    b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
    b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);
    
    % Cost and gradient variables (your code needs to compute these values). 
    % Here, we initialize them to zeros. 
    cost = 0;
    W1grad = zeros(size(W1));    % gradients are computed layer by layer
    W2grad = zeros(size(W2));
    b1grad = zeros(size(b1)); 
    b2grad = zeros(size(b2));
    
    %% ---------- YOUR CODE HERE --------------------------------------
    %  Instructions: Compute the cost/optimization objective J_sparse(W,b) for the Sparse Autoencoder,
    %                and the corresponding gradients W1grad, W2grad, b1grad, b2grad.
    %
    % W1grad, W2grad, b1grad and b2grad should be computed using backpropagation.
    % Note that W1grad has the same dimensions as W1, b1grad has the same dimensions
    % as b1, etc.  Your code should set W1grad to be the partial derivative of J_sparse(W,b) with
    % respect to W1.  I.e., W1grad(i,j) should be the partial derivative of J_sparse(W,b) 
    % with respect to the input parameter W1(i,j).  Thus, W1grad should be equal to the term 
    % [(1/m) \Delta W^{(1)} + \lambda W^{(1)}] in the last block of pseudo-code in Section 2.2 
    % of the lecture notes (and similarly for W2grad, b1grad, b2grad).
    % 
    % Stated differently, if we were using batch gradient descent to optimize the parameters,
    % the gradient descent update to W1 would be W1 := W1 - alpha * W1grad, and similarly for W2, b1, b2. 
    % 
    %1. forward propagation
    data_size=size(data);
    active_value2=repmat(b1,1,data_size(2));                 % replicate b1 across all examples
    active_value3=repmat(b2,1,data_size(2));                 % replicate b2 across all examples
    active_value2=sigmoid(W1*data+active_value2);            % hidden-layer activations a2
    active_value3=sigmoid(W2*active_value2+active_value3);   % output-layer activations a3
    %2. cost: mean squared error + weight decay + sparsity penalty
    ave_square=sum(sum((active_value3-data).^2)./2)/data_size(2);
    weight_decay=lambda/2*(sum(sum(W1.^2))+sum(sum(W2.^2)));
    
    p_real=sum(active_value2,2)./data_size(2);               % average activation rho-hat of each hidden unit
    p_para=repmat(sparsityParam,hiddenSize,1);               % target activation rho
    sparsity=beta.*sum(p_para.*log(p_para./p_real)+(1-p_para).*log((1-p_para)./(1-p_real)));  % KL divergence
    cost=ave_square+weight_decay+sparsity;
    
    delta3=(active_value3-data).*(active_value3).*(1-active_value3);   % output-layer error term
    average_sparsity=repmat(sum(active_value2,2)./data_size(2),1,data_size(2));
    default_sparsity=repmat(sparsityParam,hiddenSize,data_size(2));
    sparsity_penalty=beta.*(-(default_sparsity./average_sparsity)+((1-default_sparsity)./(1-average_sparsity)));
    delta2=(W2'*delta3+sparsity_penalty).*((active_value2).*(1-active_value2));  % hidden-layer error term
    %3. backward propagation
    W2grad=delta3*active_value2'./data_size(2)+lambda.*W2;
    W1grad=delta2*data'./data_size(2)+lambda.*W1;
    b2grad=sum(delta3,2)./data_size(2);
    b1grad=sum(delta2,2)./data_size(2);
    
    %-------------------------------------------------------------------
    % After computing the cost and gradient, we will convert the gradients back
    % to a vector format (suitable for minFunc).  Specifically, we will unroll
    % your gradient matrices into a vector.
    
    grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];
    
    end
    
    %-------------------------------------------------------------------
    % Here's an implementation of the sigmoid function, which you may find useful
    % in your computation of the costs and the gradients.  This inputs a (row or
    % column) vector (say (z1, z2, z3)) and returns (f(z1), f(z2), f(z3)). 
    
    function sigm = sigmoid(x)
      
        sigm = 1 ./ (1 + exp(-x));
    end
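
    A quick way to sanity-check this function is to call it on random parameters and compare one component of the returned gradient against a numerical estimate. The sketch below is an addition for illustration only: the sizes follow the comments above, and the random theta is just a stand-in for a proper parameter initializer.

    % Illustrative usage sketch (not part of the original post).
    visibleSize   = 64;
    hiddenSize    = 25;
    lambda        = 0.0001;
    sparsityParam = 0.01;
    beta          = 3;
    data  = rand(visibleSize, 10000);   % stand-in for real image patches
    theta = 0.01 * randn(2*hiddenSize*visibleSize + hiddenSize + visibleSize, 1);

    [cost, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                         lambda, sparsityParam, beta, data);

    % Numerical gradient check on the first component (central difference).
    eps0 = 1e-4;
    e1 = zeros(size(theta)); e1(1) = eps0;
    numgrad1 = (sparseAutoencoderCost(theta+e1, visibleSize, hiddenSize, lambda, sparsityParam, beta, data) ...
              - sparseAutoencoderCost(theta-e1, visibleSize, hiddenSize, lambda, sparsityParam, beta, data)) / (2*eps0);
    disp(abs(numgrad1 - grad(1)));      % should be very small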
    
    
