Written on 2016/08/31
Application:
This simple approach builds on the recent success of **residual networks (ResNets)** to reduce training time and improve test error.
Challenge:
- Very deep models become worse at function approximation (the **degradation** problem); this is caused not by overfitting but by vanishing training signals (e.g., vanishing gradients).
- Effective and efficient training methods for very deep models need to be found.
Problem:
Motivated by **ResNets**, which simplify **Highway Networks**, the authors propose a new method called Stochastic Depth that goes a step further to reduce ResNets' test error and training time.
Solution:
- Shrink the depth of the network during training, while keeping it unchanged at test time.
- During training, randomly drop entire ResBlocks according to a per-block survival probability, bypassing their transformations through the identity skip connections.
- Survival probabilities can be uniform across all blocks or follow a linear decay from the first to the last block (linear decay works better); see the sketch after this list.
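The mechanism is simple enough to sketch. Below is a minimal PyTorch-style illustration (the names `StochasticDepthBlock`, `body`, and `linear_decay_survival` are mine, not from the paper's code): during training, block l is kept with survival probability p_l, and at test time its residual branch is scaled by p_l; the paper's linear decay rule is p_l = 1 - (l/L) * (1 - p_L) with p_L = 0.5.

```python
import torch
import torch.nn as nn

class StochasticDepthBlock(nn.Module):
    """Residual block that is skipped entirely with probability 1 - p_l
    during training; at test time its residual branch is scaled by p_l."""

    def __init__(self, body: nn.Module, survival_prob: float):
        super().__init__()
        self.body = body                    # the block's transformation f_l
        self.survival_prob = survival_prob  # p_l

    def forward(self, x):
        if self.training:
            # Bernoulli gate: with probability p_l run the block,
            # otherwise fall back to the identity skip connection alone.
            if torch.rand(1).item() < self.survival_prob:
                return x + self.body(x)
            return x
        # Test time: the full network is used, with each residual
        # branch weighted by its expected participation p_l.
        return x + self.survival_prob * self.body(x)

def linear_decay_survival(l: int, L: int, p_L: float = 0.5) -> float:
    """Linear decay rule from the paper: p_l = 1 - (l / L) * (1 - p_L)."""
    return 1.0 - (l / L) * (1.0 - p_L)

# Example: the middle block of a 54-block ResNet survives with p = 0.75.
p_l = linear_decay_survival(l=27, L=54)
```

For brevity the sketch omits the ReLU that the paper applies after the addition.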
Insights:
- This method (stochastic depth) is designed for ResNets; networks without ResBlocks are therefore not compatible with it.
- This method can be regarded as an implicit model ensemble.
- A newer, more competitive method has since been proposed (http://arxiv.org/pdf/1603.05027.pdf), which can be applied to even deeper models and achieves lower test error.
One sentence to summarize:
This paper proposes a deep network with stochastic depth, a procedure to train very deep neural networks effectively and efficiently.