-
Regularization: the model should be "simple" so that it generalizes to test data, not just the training data
-
L1 and L2 regularization, elastic net (L1 + L2), max norm regularization, dropout, batch normalization, stochastic depth
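The first three penalties in the list above can be sketched in a few lines of NumPy. This is a minimal sketch, not a definitive implementation: the function names, the `beta` weighting in the elastic net, the strength `lam`, and the `data_loss` value are all assumptions made for illustration.

```python
import numpy as np

def l1_penalty(W):
    # L1: sum of absolute weight values (tends to encourage sparse weights)
    return np.sum(np.abs(W))

def l2_penalty(W):
    # L2: sum of squared weight values (tends to encourage small, spread-out weights)
    return np.sum(W * W)

def elastic_net_penalty(W, beta=0.5):
    # Elastic net: a weighted mix of L2 and L1; beta is an assumed mixing weight
    return beta * l2_penalty(W) + l1_penalty(W)

W = np.array([[1.0, -2.0], [0.5, 0.0]])
data_loss = 1.25   # hypothetical data loss, for illustration only
lam = 0.1          # assumed regularization strength

# Full loss = data loss + lam * penalty
total_loss = data_loss + lam * l2_penalty(W)
```

A larger `lam` pushes the weights harder toward zero, trading training fit for simplicity.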
-
Use the numerical gradient to debug your analytic gradient
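A gradient check along these lines might look like the sketch below. It assumes a toy function f(x) = sum(x^2), whose analytic gradient is 2x, and uses a centered difference; the helper names and the step size `h` are illustrative choices, not taken from the notes.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    # Centered difference: (f(x + h) - f(x - h)) / (2h), one coordinate at a time
    grad = np.zeros_like(x)
    for i in np.ndindex(x.shape):
        old = x[i]
        x[i] = old + h
        f_plus = f(x)
        x[i] = old - h
        f_minus = f(x)
        x[i] = old  # restore the original value
        grad[i] = (f_plus - f_minus) / (2 * h)
    return grad

def f(x):
    # Toy function used only for this check
    return np.sum(x ** 2)

def analytic_gradient(x):
    # Hand-derived gradient of f
    return 2 * x

x = np.array([1.0, -2.0, 3.0])
num = numerical_gradient(f, x)
ana = analytic_gradient(x)

# Relative error; a small value suggests the analytic gradient is correct
rel_error = np.max(np.abs(num - ana) / np.maximum(1e-8, np.abs(num) + np.abs(ana)))
```

The numerical gradient is slow and approximate, so it serves as a check; the analytic gradient is what gets used in training.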
-
SGD (Stochastic Gradient Descent): estimate the gradient from a minibatch instead of the entire data set
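As a rough sketch of the minibatch idea, the loop below fits a linear model with a mean squared error loss. The synthetic data, `learning_rate`, `batch_size`, and step count are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, noiseless regression data (illustrative only)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w = np.zeros(3)
learning_rate = 0.1   # assumed step size
batch_size = 32       # assumed minibatch size

for step in range(500):
    # Sample a minibatch rather than using all 200 examples
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]

    # Gradient of the mean squared error on the minibatch only
    grad = 2.0 * Xb.T @ (Xb @ w - yb) / batch_size

    # Parameter update
    w -= learning_rate * grad
```

Each step touches only `batch_size` examples, so an update costs far less than a full pass over the data, at the price of a noisier gradient estimate.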