lecture 3

Author: 听闻不见 | Published: 2020-08-19 11:21
    • Regularization: add a penalty to the loss that prefers "simple" models, so the model generalizes to test data instead of just fitting the training data

    • Common regularizers: L1, L2, Elastic net (L1 + L2), max-norm regularization, dropout, batch normalization, stochastic depth

    • Use the numerical gradient (slow and approximate, but easy to write) to debug your analytic gradient (fast and exact, but error-prone)

    • SGD (Stochastic Gradient Descent): at each step, estimate the gradient from a minibatch instead of the entire dataset
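
The notes above can be tied together in one small sketch. It is not from the lecture: it assumes a toy linear model with a mean-squared-error loss and an L2 penalty, and the names `loss_and_grad`, `numerical_grad`, `sgd`, and `make_data`, along with every hyperparameter value, are hypothetical choices made for illustration. It shows an L2-regularized loss with its analytic gradient, a centered finite-difference gradient for checking it, and a minibatch SGD loop.

```python
import random


def loss_and_grad(w, batch, lam):
    """Mean squared error of a linear model plus an L2 penalty lam * ||w||^2.

    Returns the loss and its analytic gradient with respect to w.
    """
    n = len(batch)
    loss = 0.0
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        loss += err * err / n
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / n
    # L2 regularization: penalize large weights to prefer "simpler" models.
    loss += lam * sum(wi * wi for wi in w)
    grad = [g + 2.0 * lam * wi for g, wi in zip(grad, w)]
    return loss, grad


def numerical_grad(w, batch, lam, h=1e-5):
    """Centered finite differences, one coordinate at a time (for debugging only)."""
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_plus[j] += h
        w_minus = list(w)
        w_minus[j] -= h
        f_plus, _ = loss_and_grad(w_plus, batch, lam)
        f_minus, _ = loss_and_grad(w_minus, batch, lam)
        grad.append((f_plus - f_minus) / (2.0 * h))
    return grad


def sgd(data, w, lam=1e-3, lr=0.1, batch_size=8, steps=500, seed=0):
    """Minibatch SGD: each step uses a random minibatch, not the whole dataset."""
    rng = random.Random(seed)
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        _, grad = loss_and_grad(w, batch, lam)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def make_data(n=200, seed=1):
    """Hypothetical toy data: y = 2*x0 - 3*x1 plus a little Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 2.0 * x[0] - 3.0 * x[1] + rng.gauss(0, 0.01)
        data.append((x, y))
    return data


if __name__ == "__main__":
    data = make_data()
    w0 = [0.5, -0.5]
    _, analytic = loss_and_grad(w0, data[:16], 1e-3)
    numeric = numerical_grad(w0, data[:16], 1e-3)
    print("max gradient difference:", max(abs(a - b) for a, b in zip(analytic, numeric)))
    print("weights after SGD:", sgd(data, [0.0, 0.0]))
```

If the analytic and numerical gradients disagree by more than a tiny amount, the analytic gradient most likely has a bug. The numerical version is only used for this check, since it needs two loss evaluations per weight.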
