cs231n

Author: ericsunn | Published 2018-02-06 19:50

    http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture6.pdf

    sigmoid

    • Squashes numbers to range [0,1]
    • Saturated neurons “kill” the gradients (see the sketch after this list)
    • Sigmoid outputs are not zero-centered
    • exp() is somewhat computationally expensive
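    A minimal NumPy sketch of the first two problems, using a handful of scalar inputs for illustration: every output is positive (not zero-centered), and the local gradient σ(x)(1 − σ(x)) all but vanishes once the neuron saturates.

        import numpy as np

        def sigmoid(x):
            # sigma(x) = 1 / (1 + exp(-x)): squashes inputs into (0, 1)
            return 1.0 / (1.0 + np.exp(-x))

        def sigmoid_grad(x):
            # local gradient sigma(x) * (1 - sigma(x)); peaks at 0.25 near x = 0
            s = sigmoid(x)
            return s * (1.0 - s)

        x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
        print(sigmoid(x))       # all outputs > 0 -> not zero-centered
        print(sigmoid_grad(x))  # ~4.5e-05 at x = +/-10: saturation "kills" the gradient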

    tanh

    • Squashes numbers to range [-1,1]
    • Zero-centered (nice)
    • Saturated neurons still “kill” the gradients (checked below)
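    A quick check of both properties; this is just NumPy's built-in tanh, with the local gradient 1 − tanh²(x) computed by hand.

        import numpy as np

        x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
        t = np.tanh(x)
        print(t)           # symmetric around 0, squashed into [-1, 1]
        print(1.0 - t**2)  # d/dx tanh(x) = 1 - tanh(x)^2; ~0 at x = +/-10,
                           # so saturation still kills the gradient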

    ReLU

    • Does not saturate (in the + region; see the sketch after this list)
    • Very computationally efficient
    • Converges much faster than sigmoid/tanh in practice (e.g. 6x)
    • Actually more biologically plausible than sigmoid
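    A sketch of the forward pass and its local gradient: the gradient is exactly 1 anywhere in the positive region, which is why ReLU never saturates there, and the forward pass is a single elementwise comparison with no exp().

        import numpy as np

        def relu(x):
            # max(0, x): one comparison per element, no exp() -> very cheap
            return np.maximum(0.0, x)

        def relu_grad(x):
            # gradient is 1 for x > 0 (never saturates there) and 0 otherwise
            return (x > 0).astype(x.dtype)

        x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
        print(relu(x))       # [ 0.  0.  0.  1. 10.]
        print(relu_grad(x))  # [0. 0. 0. 1. 1.]: stays 1 however large x gets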

    http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture7.pdf

    • Adam is a good default choice in most cases (one update step is sketched below)
    • If you can afford full-batch updates, try L-BFGS (and don’t forget to disable all sources of noise)
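    A minimal sketch of a single Adam update step, following Kingma & Ba (2015). The hyperparameters shown (lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8) are the paper's defaults, and the function and variable names here are illustrative, not from the slides.

        import numpy as np

        def adam_step(w, dw, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
            # m, v: running estimates of the first and second gradient moments
            # t: 1-based step counter, needed for bias correction
            m = beta1 * m + (1 - beta1) * dw        # momentum-like first moment
            v = beta2 * v + (1 - beta2) * dw ** 2   # RMSProp-like second moment
            m_hat = m / (1 - beta1 ** t)            # correct the bias toward zero at early steps
            v_hat = v / (1 - beta2 ** t)
            w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
            return w, m, v

    The bias correction matters mainly in the first few steps, when m and v are still close to their zero initialization.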
