- Gaussian
Weights are randomly drawn from a Gaussian distribution with a fixed mean (e.g., 0) and a fixed standard deviation (e.g., 0.01).
This is the most common initialization method in deep learning.
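A minimal NumPy sketch of this scheme (the 784-by-256 layer size is a made-up example, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_init(shape, mean=0.0, std=0.01):
    """Draw weights from a Gaussian with fixed mean and fixed std, e.g. N(0, 0.01^2)."""
    return rng.normal(loc=mean, scale=std, size=shape)

# Hypothetical fully connected layer: 784 inputs -> 256 outputs.
W = gaussian_init((784, 256))
print(W.mean(), W.std())  # roughly 0 and 0.01
```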
- Xavier
This method adopts a properly scaled uniform or Gaussian distribution for initialization.
In Caffe (an open framework for deep learning) [2], it initializes the weights in the network by drawing them from a zero-mean distribution with variance

Var(W) = 1/n_in,

where W is the initialization distribution for the neuron in question and n_in is the number of neurons feeding into it. The distribution used is typically Gaussian or uniform.
Glorot & Bengio's paper [1] originally recommended using

Var(W) = 2/(n_in + n_out),

where n_out is the number of neurons the result is fed to.
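Below is a minimal NumPy sketch of both variants; the function names and the 512-to-256 layer size are illustrative assumptions, not Caffe's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_caffe(n_in, n_out):
    """Caffe-style Xavier: zero mean, Var(W) = 1/n_in (Gaussian form)."""
    std = np.sqrt(1.0 / n_in)
    return rng.normal(0.0, std, size=(n_in, n_out))

def xavier_glorot(n_in, n_out):
    """Glorot & Bengio's recommendation: Var(W) = 2/(n_in + n_out).
    Drawn uniformly on [-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))],
    which has exactly that variance."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

# Hypothetical layer: 512 inputs feeding 256 outputs.
W1 = xavier_caffe(512, 256)   # Var(W1) ~= 1/512
W2 = xavier_glorot(512, 256)  # Var(W2) ~= 2/(512 + 256)
```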
References:
[1] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
[2] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093, 2014.
- MSRA
This method was proposed to enable training extremely deep rectified (ReLU) models directly from scratch [1].
In this method, weights are initialized from a zero-mean Gaussian distribution whose standard deviation is

std = sqrt(2/n_l), with n_l = k_l^2 * d_(l-1),

where k_l is the spatial filter size in layer l and d_(l-1) is the number of filters in layer l-1.
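A minimal NumPy sketch for a convolutional layer; the function name, the (d_out, d_prev, k, k) weight layout, and the 3x3 conv with 64 input / 128 output channels are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def msra_init(k, d_prev, d_out):
    """MSRA/He init: zero-mean Gaussian with std = sqrt(2 / n_l),
    where n_l = k^2 * d_prev (k: spatial filter size in layer l,
    d_prev: number of filters in layer l-1)."""
    n_l = k * k * d_prev
    std = np.sqrt(2.0 / n_l)
    # Weight tensor laid out as (d_out, d_prev, k, k); the layout is a convention choice.
    return rng.normal(0.0, std, size=(d_out, d_prev, k, k))

# Hypothetical 3x3 conv: 64 input channels -> 128 output channels.
W = msra_init(3, 64, 128)
print(W.std())  # roughly sqrt(2 / (9 * 64))
```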
Reference:
[1] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. arXiv:1502.01852, 2015.