Machine Learning Week 5 - Neural Networks

Author: MWhite | Published 2017-11-09 23:27
    17/11/09 MWhite's learning notes

    1. Neural Networks-Theta

    1.1 Cost Function

    • L = total number of layers in the network
    • s_l = number of units (not counting the bias unit) in layer l
    • K = number of output units/classes
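
    With the symbols above, the regularized cost function for the network (in its standard form) is:

    $$ J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[ y_k^{(i)} \log\big((h_\Theta(x^{(i)}))_k\big) + \big(1 - y_k^{(i)}\big)\log\big(1 - (h_\Theta(x^{(i)}))_k\big) \Big] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}} \big(\Theta_{j,i}^{(l)}\big)^2 $$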


    The number of columns in our current theta matrix is equal to the number of nodes in our current layer (including the bias unit). The number of rows in our current theta matrix is equal to the number of nodes in the next layer (excluding the bias unit).
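
    As a concrete, purely illustrative example (the 3-5-4 architecture below is an assumption, not from the note), the weight matrices of a network with 3 inputs, 5 hidden units and 4 output classes have these dimensions:

    % Hypothetical 3-5-4 architecture; variable names are illustrative
    Theta1 = zeros(5, 3 + 1);   % layer 1 -> layer 2: rows = 5 units in layer 2, cols = 3 inputs + bias
    Theta2 = zeros(4, 5 + 1);   % layer 2 -> layer 3: rows = 4 output units, cols = 5 hidden units + bias
    size(Theta1)                % ans = 5 4
    size(Theta2)                % ans = 4 6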

    1.2 Backpropagation Algorithm


    Compute the error terms δ:
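
    In the standard notation, the δ terms are computed backwards from the output layer, where ∗ denotes element-wise multiplication:

    $$ \delta^{(L)} = a^{(L)} - y $$
    $$ \delta^{(l)} = \big(\Theta^{(l)}\big)^{T} \delta^{(l+1)} \ast g'\big(z^{(l)}\big) = \big(\Theta^{(l)}\big)^{T} \delta^{(l+1)} \ast a^{(l)} \ast \big(1 - a^{(l)}\big), \quad l = L-1, \dots, 2 $$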



    We want to minimize the cost function, so we need its partial derivatives with respect to every weight. They are:
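
    In the standard notation, the gradients are first accumulated over all m training examples and then averaged (with regularization applied to all weights except those multiplying the bias unit):

    $$ \Delta^{(l)}_{ij} := \Delta^{(l)}_{ij} + a^{(l)}_{j}\,\delta^{(l+1)}_{i} $$
    $$ \frac{\partial J(\Theta)}{\partial \Theta^{(l)}_{ij}} = D^{(l)}_{ij} = \frac{1}{m}\Delta^{(l)}_{ij} + \frac{\lambda}{m}\Theta^{(l)}_{ij} \;\; (j \neq 0), \qquad D^{(l)}_{ij} = \frac{1}{m}\Delta^{(l)}_{ij} \;\; (j = 0) $$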


    2. Skills

    2.1 Unrolling Parameters

    % Unroll all weight matrices (and gradient matrices) into single column vectors
    thetaVector = [ Theta1(:); Theta2(:); Theta3(:) ];
    deltaVector = [ D1(:); D2(:); D3(:) ];

    % Recover the matrices; here Theta1 and Theta2 are 10x11 and Theta3 is 1x11
    Theta1 = reshape(thetaVector(1:110), 10, 11);
    Theta2 = reshape(thetaVector(111:220), 10, 11);
    Theta3 = reshape(thetaVector(221:231), 1, 11);
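
    Unrolling matters because Octave's advanced optimizers work on a single parameter vector. A minimal sketch of how the pieces fit together, assuming a cost function file costFunction.m that reshapes internally (the name and the placeholder body are illustrative assumptions, not code from the original note):

    % costFunction.m (illustrative skeleton)
    function [jVal, gradVec] = costFunction(thetaVector)
      Theta1 = reshape(thetaVector(1:110), 10, 11);
      Theta2 = reshape(thetaVector(111:220), 10, 11);
      Theta3 = reshape(thetaVector(221:231), 1, 11);
      % ... forward propagation, cost J, and backpropagation would go here ...
      jVal = 0;                                  % placeholder cost
      D1 = zeros(size(Theta1));
      D2 = zeros(size(Theta2));
      D3 = zeros(size(Theta3));
      gradVec = [ D1(:); D2(:); D3(:) ];         % gradient unrolled the same way
    end

    % In the main script: pass the unrolled parameters to the optimizer
    options = optimset('GradObj', 'on', 'MaxIter', 100);
    initialTheta = [ Theta1(:); Theta2(:); Theta3(:) ];
    [optTheta, cost] = fminunc(@costFunction, initialTheta, options);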
    

    2.2 Gradient Checking


    % Numerically approximate each partial derivative with a two-sided difference
    epsilon = 1e-4;
    for i = 1:n,
      thetaPlus = theta;
      thetaPlus(i) += epsilon;
      thetaMinus = theta;
      thetaMinus(i) -= epsilon;
      gradApprox(i) = (J(thetaPlus) - J(thetaMinus)) / (2 * epsilon);
    end;
    

    Check that gradApprox is numerically close to deltaVector. Once you have verified that your backpropagation is correct, you don't need to compute gradApprox again: the gradient-checking code is very slow.
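
    One common way to do the comparison (a sketch; the relative-difference formula and the threshold are a convention, not from the original note):

    % Relative difference between the numerical and the backprop gradient
    diff = norm(gradApprox(:) - deltaVector(:)) / norm(gradApprox(:) + deltaVector(:));
    disp(diff)   % values around 1e-9 or smaller suggest backprop is implemented correctly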

    2.3 Random Initialization

    Initializing all weights to zero would make every hidden unit compute the same function, so each Θ is initialized to a random value in [-INIT_EPSILON, INIT_EPSILON]. If the dimensions of Theta1, Theta2 and Theta3 are 10x11, 10x11 and 1x11 respectively:
    
    % rand(m,n) gives uniform values in (0,1); scale and shift to (-INIT_EPSILON, INIT_EPSILON)
    Theta1 = rand(10,11) * (2 * INIT_EPSILON) - INIT_EPSILON;
    Theta2 = rand(10,11) * (2 * INIT_EPSILON) - INIT_EPSILON;
    Theta3 = rand(1,11) * (2 * INIT_EPSILON) - INIT_EPSILON;
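
    One common heuristic for choosing INIT_EPSILON ties it to the layer sizes; it comes from the course programming exercise rather than this note, so treat it as an assumption:

    % epsilon_init = sqrt(6) / sqrt(L_in + L_out),
    % where L_in and L_out are the unit counts of the two layers the matrix connects.
    % For a 10x11 Theta (10 inputs plus bias coming in, 10 units going out):
    INIT_EPSILON = sqrt(6) / sqrt(10 + 10);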
    

    Summary

    • Number of input units = dimension of features x(i)
    • Number of output units = number of classes
    • Number of hidden units per layer = usually, the more the better (balance against the computational cost, which grows with the number of hidden units)
    • Defaults: 1 hidden layer. If you have more than 1 hidden layer, then it is recommended that you have the same number of units in every hidden layer.

    Training a Neural Network

    1. Randomly initialize the weights
    2. Implement forward propagation to get hΘ(x(i)) for any x(i)
    3. Implement the cost function
    4. Implement backpropagation to compute the partial derivatives
    5. Use gradient checking to confirm that your backpropagation works, then disable gradient checking.
    6. Use gradient descent or a built-in optimization function to minimize the cost function with the weights in theta.

    When we perform forward and back propagation, we loop over every training example:

    for i = 1:m,
       Perform forward propagation and backpropagation using example (x(i), y(i))
       (Get activations a(l) and delta terms d(l) for l = 2,...,L)
    end;
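
    A hedged Octave sketch of what this loop does for a 3-layer network; the variable names (X, y, sigmoid, Delta1, Delta2) and the one-hot labels are illustrative assumptions, not code from the original note:

    % Accumulators for the gradients
    Delta1 = zeros(size(Theta1));
    Delta2 = zeros(size(Theta2));
    for i = 1:m,
      % Forward propagation for example i
      a1 = [1; X(i, :)'];                  % add the bias unit
      z2 = Theta1 * a1;
      a2 = [1; sigmoid(z2)];               % sigmoid() is an assumed helper: 1 ./ (1 + exp(-z))
      z3 = Theta2 * a2;
      a3 = sigmoid(z3);                    % h_Theta(x^(i))
      % Backpropagation: delta terms for layers 3 and 2
      d3 = a3 - y(i, :)';                  % y(i,:) assumed to be a one-hot label row
      d2 = (Theta2' * d3) .* a2 .* (1 - a2);
      d2 = d2(2:end);                      % drop the bias-unit entry
      % Accumulate the gradients
      Delta2 = Delta2 + d3 * a2';
      Delta1 = Delta1 + d2 * a1';
    end;
    D1 = Delta1 / m;                       % unregularized partial derivatives
    D2 = Delta2 / m;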
    
