Recurrent Neural Networks
- RNNs are networks with loops in them, which lets information persist from one step to the next.
data:image/s3,"s3://crabby-images/70c7e/70c7ef6fefed0eb6d39ce28a0906d6ddc4eb0a40" alt=""
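- Unrolled in time, an RNN is a chain of copies of the same network, each passing a message to its successor.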
data:image/s3,"s3://crabby-images/ad8ad/ad8ad0493b94fff32d3bb16969ad3f8691a8322b" alt=""
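The unrolled view can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the scalar hidden state, the function names `rnn_step` and `rnn_unrolled`, and the weight values are all assumptions chosen for readability.

```python
import math

def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    """One step of a scalar vanilla RNN: h_t = tanh(w_xh*x_t + w_hh*h_prev + b_h)."""
    return math.tanh(w_xh * x_t + w_hh * h_prev + b_h)

def rnn_unrolled(xs, h0=0.0, w_xh=0.5, w_hh=0.8, b_h=0.0):
    """Apply the same cell to every input in turn, passing h along the chain."""
    h = h0
    hs = []
    for x_t in xs:
        # Each iteration corresponds to one copy of the network in the unrolled picture.
        h = rnn_step(x_t, h, w_xh, w_hh, b_h)
        hs.append(h)
    return hs

# Only the first input is non-zero; later hidden states still carry a trace of it.
hidden_states = rnn_unrolled([1.0, 0.0, 0.0])
```

The same weights are reused at every step, which is what "copies of the same network" means here.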
The Problem of Long-Term Dependencies
- RNNs use past information to inform the present task, for example using earlier words to predict the next one.
data:image/s3,"s3://crabby-images/1b0e4/1b0e46089b374dce83388e0dbf537843886fada4" alt=""
- The problem was explored in depth by Hochreiter (1991) [German] and Bengio, et al. (1994), who found some pretty fundamental reasons why it might be difficult.
- This is the standard RNN structure:
Paste_Image.png
- This is the LSTM structure:
data:image/s3,"s3://crabby-images/c843d/c843d77ee6e86c7b42f2a5f1f0752d344f6ed253" alt=""
data:image/s3,"s3://crabby-images/ebf4e/ebf4e42ac3de270e4c7e23d1eb4d208e77a60d82" alt=""
The Core Idea Behind LSTMs
- The key point is the cell state (a conveyor belt): information can be added to it or removed from it.
data:image/s3,"s3://crabby-images/34f3d/34f3d0b641685d8676314c1824d8688823b3ae78" alt=""
- Gates are a way to let information through. A gate is made of a sigmoid layer and a pointwise multiplication.
data:image/s3,"s3://crabby-images/e4bf9/e4bf97bc8f69fc852b2de6eaf14f54e3e1b93e36" alt=""
The sigmoid output decides how much information gets through: 0 means let nothing through, 1 means let everything through.
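A gate in this sense can be sketched as a sigmoid followed by an elementwise product. The helper name `apply_gate` and the example numbers are illustrative assumptions, not part of the original notes.

```python
import math

def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def apply_gate(gate_logits, values):
    """Scale each value by sigmoid(logit): the pointwise multiplication of a gate."""
    return [sigmoid(g) * v for g, v in zip(gate_logits, values)]

cell_state = [2.0, -1.0, 0.5]
# Illustrative logits: nearly closed, half open, nearly fully open.
gate_logits = [-20.0, 0.0, 20.0]
gated = apply_gate(gate_logits, cell_state)
```

With these logits the first component is almost entirely blocked, the second is halved, and the third passes almost unchanged.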
Step-by-Step LSTM Walk Through
- First, the “forget gate layer” decides what information to throw away from the cell state.
data:image/s3,"s3://crabby-images/ef8cf/ef8cfb69e50434de0d6accea9a7630d5194f302a" alt=""
- Next, decide what new information we’re going to store in the cell state.
data:image/s3,"s3://crabby-images/87488/874882765785f6094533757669760a4d6b2a3fd3" alt=""
- Then we update the cell state: drop what we decided to forget and keep the important new information.
data:image/s3,"s3://crabby-images/368c3/368c393c1194c9455f8903a336b4ccb7bce4e90d" alt=""
- Finally, we decide what to output (for example, tense or part of speech).
data:image/s3,"s3://crabby-images/87f88/87f8859cccf8a87dfdb93574cc01de380df4938c" alt=""
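The four steps above can be put together in one function. What follows is a rough sketch under stated assumptions: vectors are plain Python lists, each gate sees the concatenation of the previous hidden state and the current input, and the parameter names (`W_f`, `b_f`, and so on) and the helper `toy_params` are hypothetical names introduced only for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; every gate sees the concatenation [h_prev, x_t]."""
    z = h_prev + x_t  # list concatenation, not addition
    # 1. Forget gate: how much of each old cell-state entry to keep.
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]
    # 2. Input gate and candidate values: what new information to store.
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]
    c_tilde = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]
    # 3. Cell-state update: forget old entries, add gated candidates.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_tilde)]
    # 4. Output gate: expose a filtered, squashed view of the cell state.
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

def toy_params(hidden, inputs, weight=0.1):
    """Hypothetical helper: every weight is the same constant, biases are zero."""
    params = {}
    for name in ("f", "i", "c", "o"):
        params["W_" + name] = [[weight] * (hidden + inputs) for _ in range(hidden)]
        params["b_" + name] = [0.0] * hidden
    return params

params = toy_params(hidden=2, inputs=1)
h, c = [0.0, 0.0], [0.0, 0.0]
for x_t in ([1.0], [0.5], [-1.0]):
    h, c = lstm_step(x_t, h, c, params)
```

Real implementations use trained weights and array libraries; the constant weights here only make the data flow easy to follow.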
Variants on LSTM
1. We want the gates to be able to look at the cell state before forgetting (peepholes).
data:image/s3,"s3://crabby-images/89d1f/89d1fe0487f9039d0074da1409cc671354b98d94" alt=""
2. We only add new values when we forget something old (coupled forget and input gates).
data:image/s3,"s3://crabby-images/28573/28573c2bb38d8e3186a138836cf8d171395f9611" alt=""
3. Combine the forget gate and input gate into a single update gate, and merge the cell state and hidden state (the GRU).
data:image/s3,"s3://crabby-images/141c2/141c29ef6422b8ea52a6d396d0b643207dc013a3" alt=""
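Variants 2 and 3 can be sketched as follows. This is an illustrative sketch only: the function names `coupled_cell_update` and `gru_step`, the parameter names (`W_z`, `W_r`, `W_h`), and the constant weights are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def coupled_cell_update(f, c_prev, c_tilde):
    """Variant 2: the input gate is tied to 1 - f, so new values
    enter only where old ones are forgotten."""
    return [f_k * c_k + (1.0 - f_k) * g_k for f_k, c_k, g_k in zip(f, c_prev, c_tilde)]

def gru_step(x_t, h_prev, params):
    """Variant 3: a single update gate z, and no separate cell state."""
    zx = h_prev + x_t  # list concatenation [h_prev, x_t]
    z = [sigmoid(a) for a in affine(params["W_z"], zx, params["b_z"])]  # update gate
    r = [sigmoid(a) for a in affine(params["W_r"], zx, params["b_r"])]  # reset gate
    gated = [r_k * h_k for r_k, h_k in zip(r, h_prev)] + x_t
    h_tilde = [math.tanh(a) for a in affine(params["W_h"], gated, params["b_h"])]
    # Interpolate between the old state and the candidate state.
    return [(1.0 - z_k) * h_k + z_k * g_k for z_k, h_k, g_k in zip(z, h_prev, h_tilde)]

hidden, inputs = 2, 1
params = {}
for name in ("z", "r", "h"):
    # Constant toy weights, chosen only so the example runs.
    params["W_" + name] = [[0.1] * (hidden + inputs) for _ in range(hidden)]
    params["b_" + name] = [0.0] * hidden

h = gru_step([1.0], [0.0, 0.0], params)
# f = 1 keeps the old value entirely; f = 0 replaces it with the candidate.
kept = coupled_cell_update([1.0, 0.0], [5.0, 5.0], [-1.0, -1.0])
```

The GRU's final interpolation has the same shape as the coupled update in variant 2, with the update gate deciding how much of the old state is replaced.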