Weight Initialization

Author: HELLOTREE1 | Published: 2019-04-02 16:16

    Original post

    1. Gaussian
      Weights are drawn at random from a Gaussian distribution with a fixed mean (e.g., 0) and a fixed standard deviation (e.g., 0.01).

    This is the most common initialization method in deep learning.
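
    As a minimal sketch of the Gaussian scheme described above, assuming NumPy and a dense layer stored as a 2-D array: the function name `gaussian_init`, the layer shape, and the seed are illustrative choices, not part of the original post.

```python
import numpy as np

def gaussian_init(shape, mean=0.0, std=0.01, seed=0):
    """Draw every weight independently from N(mean, std^2).

    The defaults mirror the example values in the text (mean 0, std 0.01);
    the seed is only there to make the sketch reproducible.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=mean, scale=std, size=shape)

# Hypothetical layer with 256 inputs and 512 outputs.
W = gaussian_init((256, 512))
print(W.shape, float(W.mean()), float(W.std()))
```

    Note that the standard deviation here does not depend on the layer size, which is the main difference from the scaled schemes that follow.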

    2. Xavier
      This method proposes to adopt a properly scaled uniform or Gaussian distribution for initialization.

    In Caffe (an open-source deep learning framework) [2], the weights are initialized by drawing them from a distribution with zero mean and a specific variance,

    $$\mathrm{Var}(W) = \frac{1}{n_{in}}$$

    where W is the initialization distribution for the neuron in question and n_in is the number of neurons feeding into it. The distribution used is typically Gaussian or uniform.

    Glorot & Bengio's paper [1] originally recommended using

    $$\mathrm{Var}(W) = \frac{2}{n_{in} + n_{out}}$$

    where n_out is the number of neurons the result is fed to.
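
    The two variance rules can be sketched as follows, assuming NumPy. The names `xavier_variance` and `xavier_init`, the `fan_in_only` flag, and the layer sizes are hypothetical and are not taken from Caffe or from the paper. For the uniform case the sketch relies on the fact that U(-a, a) has variance a^2 / 3, so a = sqrt(3 * Var(W)).

```python
import numpy as np

def xavier_variance(n_in, n_out=None):
    """Target variance: 1 / n_in (fan-in only) or 2 / (n_in + n_out)."""
    if n_out is None:
        return 1.0 / n_in
    return 2.0 / (n_in + n_out)

def xavier_init(n_in, n_out, distribution="uniform", fan_in_only=False, seed=0):
    """Sample an (n_in, n_out) weight matrix with the Xavier variance."""
    rng = np.random.default_rng(seed)
    var = xavier_variance(n_in, None if fan_in_only else n_out)
    if distribution == "uniform":
        # U(-a, a) has variance a^2 / 3, so pick a to match the target variance.
        a = np.sqrt(3.0 * var)
        return rng.uniform(-a, a, size=(n_in, n_out))
    if distribution == "gaussian":
        return rng.normal(0.0, np.sqrt(var), size=(n_in, n_out))
    raise ValueError("distribution must be 'uniform' or 'gaussian'")

# Hypothetical layer sizes, chosen only for illustration.
W_avg = xavier_init(300, 500)                                   # 2 / (n_in + n_out)
W_fan_in = xavier_init(300, 500, "gaussian", fan_in_only=True)  # 1 / n_in
print(float(W_avg.var()), float(W_fan_in.var()))
```

    Both distributions give the same variance for a given rule; only the shape of the distribution differs.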

    References:

    [1] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.

    [2] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093, 2014.

    3. MSRA

    This method was proposed to make it possible to train extremely deep rectified (ReLU-based) models directly from scratch [1].

    In this method, weights are initialized with a zero-mean Gaussian distribution whose standard deviation is

    $$\sqrt{\frac{2}{k_l^2 \, d_{l-1}}}$$

    where k_l is the spatial filter size in layer l and d_{l-1} is the number of filters in layer l−1.
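
    A minimal sketch of this rule for a single convolutional layer, assuming NumPy and a weight layout of (output filters, input filters, height, width). The function name `msra_init`, the layout, and the example sizes are assumptions made for illustration only.

```python
import numpy as np

def msra_init(k, d_prev, d_out, seed=0):
    """Zero-mean Gaussian weights with std = sqrt(2 / (k^2 * d_prev)).

    k      : spatial filter size in this layer (k x k filters)
    d_prev : number of filters in the previous layer (input channels)
    d_out  : number of filters in this layer
    """
    rng = np.random.default_rng(seed)
    n = k * k * d_prev            # connections feeding each output unit
    std = np.sqrt(2.0 / n)
    return rng.normal(0.0, std, size=(d_out, d_prev, k, k))

# Hypothetical 3x3 convolution with 64 input and 128 output filters.
W_conv = msra_init(k=3, d_prev=64, d_out=128)
print(W_conv.shape, float(W_conv.std()))
```

    Compared with the fan-in Xavier rule, the variance is doubled (2 / n instead of 1 / n).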

    Reference:
    [1] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. Technical report, arXiv, Feb. 2015.
