CS231n

Author: ericsunn | Published 2018-02-06 19:50

http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture6.pdf

sigmoid

  • Saturated neurons “kill” the gradients (see the sketch below)
  • Sigmoid outputs are not zero-centered
  • exp() is somewhat computationally expensive
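
A minimal NumPy sketch (mine, not from the slides) of the first point: the local gradient σ(x)(1 − σ(x)) peaks at 0.25 and collapses toward zero once the neuron saturates, so almost no signal flows back through it.

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x)): squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # d(sigma)/dx = sigma(x) * (1 - sigma(x)); its maximum is 0.25 at x = 0
    s = sigmoid(x)
    return s * (1.0 - s)

# At large |x| the neuron saturates and the local gradient vanishes:
for x in (0.0, 5.0, 10.0):
    print(x, sigmoid_grad(x))  # 0.25, ~6.6e-3, ~4.5e-5
```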

tanh

  • Squashes numbers to the range [-1, 1]
  • Zero-centered (nice)
  • Still “kills” the gradients when saturated (see the sketch below)
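
The same caveat in a short sketch (again mine, not from the slides): tanh outputs are zero-centered, but the local gradient 1 − tanh(x)² still vanishes at large |x|.

```python
import numpy as np

def tanh_grad(x):
    # d(tanh)/dx = 1 - tanh(x)^2; peaks at 1.0 when x = 0
    return 1.0 - np.tanh(x) ** 2

# Zero-centered outputs, but saturation still kills the gradient:
for x in (0.0, 2.5, 5.0):
    print(np.tanh(x), tanh_grad(x))  # grad: 1.0, ~2.7e-2, ~1.8e-4
```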

ReLU

  • Does not saturate (in the positive region)
  • Very computationally efficient
  • Converges much faster than sigmoid/tanh in practice (e.g. 6x)
  • Actually more biologically plausible than sigmoid
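
A quick sketch (mine, not from the slides) of the first two points: the ReLU forward pass is a single threshold with no exp() to evaluate, and the local gradient is exactly 1 wherever x > 0, so there is no saturation in the positive region.

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x): a single threshold, no exp() needed
    return np.maximum(0.0, x)

def relu_grad(x):
    # Local gradient is 1 for x > 0 (no saturation in the positive
    # region) and 0 for x < 0.
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 1. 1.]
```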

http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture7.pdf

  • Adam is a good default choice in most cases (one update step is sketched below)
  • If you can afford to do full-batch updates, try out L-BFGS (and don’t forget to disable all sources of noise)
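
For reference, the Adam update (Kingma & Ba, 2015) that the lecture recommends can be sketched in NumPy as below; the function name and the toy objective are mine, and the hyperparameters shown are the commonly used defaults.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update: a momentum-style first moment plus an
    # RMSProp-style second moment, each with bias correction.
    m = beta1 * m + (1 - beta1) * grad       # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy usage: minimize f(w) = w^2 starting from w = 1.
w, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 1001):
    grad = 2 * w                             # gradient of f(w) = w^2
    w, m, v = adam_step(w, grad, m, v, t)
print(w)  # close to 0
```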
