Lecture 10 | Recurrent Neural Networks

Author: Ysgc | Published 2019-11-04 03:07

    In 2014, before batch normalization was invented, training deep neural networks was hard.

    For example, VGG was first trained with 11 layers, and then additional layers were inserted so that the deeper model could still converge.

    Another example: GoogLeNet used early (auxiliary) classifier outputs to inject extra gradient into the lower layers.
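    A minimal sketch (in PyTorch, using a toy hypothetical network rather than GoogLeNet's actual architecture) of how an auxiliary classifier head adds an extra gradient path during training; the 0.3 loss weight is an assumption:

```python
import torch
import torch.nn as nn

class NetWithAuxHead(nn.Module):
    """Toy CNN with one auxiliary classifier attached partway up the stack."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.stage2 = nn.Sequential(
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, num_classes),
        )
        # Auxiliary head reads the intermediate features directly.
        self.aux_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        h = self.stage1(x)
        return self.stage2(h), self.aux_head(h)

model = NetWithAuxHead()
criterion = nn.CrossEntropyLoss()
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
main_logits, aux_logits = model(x)
# The auxiliary loss injects gradient directly into the lower layers;
# the 0.3 discount weight is an assumption for illustration.
loss = criterion(main_logits, y) + 0.3 * criterion(aux_logits, y)
loss.backward()
```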

    Backpropagating through an entire long sequence is impractical, so truncated backpropagation through time only backprops within a chunk of the sequence; the downside is that gradients do not flow across chunk boundaries, although hidden states are still carried forward.

    This is similar in spirit to mini-batch gradient descent: an approximate update computed from part of the data instead of all of it.
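    A minimal sketch of truncated backpropagation through time in PyTorch; the toy model, dimensions, and chunk length are assumptions, not the lecture's code:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
readout = nn.Linear(16, 8)
optimizer = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

seq = torch.randn(1, 1000, 8)       # one long sequence
target = torch.randn(1, 1000, 8)
chunk_len = 25                      # backprop only within 25-step chunks

h = None
for start in range(0, seq.size(1), chunk_len):
    x = seq[:, start:start + chunk_len]
    y = target[:, start:start + chunk_len]

    out, h = rnn(x, h)
    loss = nn.functional.mse_loss(readout(out), y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Keep the hidden state's value but cut the graph: no gradient
    # flows back past this chunk boundary.
    h = h.detach()
```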

    A character-level RNN trained on Linux kernel source code learned to recite the GNU license almost verbatim.

    The sampled license text even contains the FSF address, 675 Mass Ave, which is near Central Square in Cambridge.
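    For context, a minimal sketch of how a character-level RNN samples text one character at a time, feeding each sampled character back in as the next input; the vocabulary, model sizes, and (untrained) weights are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = list("abcdefghijklmnopqrstuvwxyz \n")
stoi = {c: i for i, c in enumerate(vocab)}

embed = nn.Embedding(len(vocab), 32)
rnn = nn.RNN(input_size=32, hidden_size=64, batch_first=True)
head = nn.Linear(64, len(vocab))

def sample(seed="g", length=50):
    """Autoregressively sample `length` characters starting from `seed`."""
    idx = torch.tensor([[stoi[seed]]])
    h, out_chars = None, [seed]
    for _ in range(length):
        out, h = rnn(embed(idx), h)
        probs = F.softmax(head(out[:, -1]), dim=-1)
        idx = torch.multinomial(probs, 1)       # sample the next character
        out_chars.append(vocab[idx.item()])
    return "".join(out_chars)

print(sample())   # gibberish here, since these weights are untrained
```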

    The results are not perfect; there are obvious failure cases.

    Soft attention -> a weighted combination of features from all image locations -> differentiable (see the sketch below)
    Hard attention -> force the model to select only one location to look at -> trickier, because it is not differentiable -> covered later in the RL lecture
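    A minimal sketch of soft attention over image locations in PyTorch; the shapes and the scoring function are assumptions, not the captioning model from the lecture:

```python
import torch
import torch.nn.functional as F

B, L, D, H = 2, 49, 512, 256        # batch, locations (7x7 grid), feature dim, hidden dim

features = torch.randn(B, L, D)     # CNN features, one D-vector per image location
h = torch.randn(B, H)               # current RNN hidden state

# Score each location against the hidden state, then normalize.
W_feat = torch.nn.Linear(D, H, bias=False)
scores = torch.bmm(W_feat(features), h.unsqueeze(2)).squeeze(2)   # (B, L)
alpha = F.softmax(scores, dim=1)                                  # weights sum to 1

# Soft attention: the context is a weighted combination of ALL locations,
# so the whole operation stays differentiable.
context = torch.bmm(alpha.unsqueeze(1), features).squeeze(1)      # (B, D)

# Hard attention would instead sample a single location from alpha,
# e.g. idx = torch.multinomial(alpha, 1); that sampling step is not
# differentiable, hence the need for RL-style training.
```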

    RNNs are typically not very deep -> usually 2, 3, or 4 layers
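    For example, a stacked LSTM with three layers in PyTorch (the sizes are assumptions):

```python
import torch
import torch.nn as nn

# Three layers is already on the deep end for RNNs, per the note above.
lstm = nn.LSTM(input_size=128, hidden_size=256, num_layers=3, batch_first=True)

x = torch.randn(4, 50, 128)          # (batch, time, features)
out, (h_n, c_n) = lstm(x)
print(out.shape)   # torch.Size([4, 50, 256])  -- top layer's hidden states
print(h_n.shape)   # torch.Size([3, 4, 256])   -- final hidden state per layer
```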

