Sharp Minima Can Generalize For Deep Nets

Author: catHeart | Published 2017-10-27 23:10

    Sharp Minima Can Generalize For Deep Nets, https://arxiv.org/abs/1703.04933

    Conventionally, many researchers hold the view that the flatness of the minima found in deep neural networks contributes to their generalization ability. This paper argues that the common measures fail to describe the flatness of minima in DNNs, because they are not invariant to reparameterizations of the network.
    Three kinds of flatness measures are investigated: volume $\epsilon$-flatness, Hessian-based curvature, and $\epsilon$-sharpness.
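The paper's central observation can be illustrated with a toy example. For a two-layer ReLU network, scaling the first layer by $\alpha > 0$ and the second by $1/\alpha$ leaves the function unchanged while moving the parameters arbitrarily, so any purely parameter-space flatness measure can be manipulated. A minimal sketch (toy network, not from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU network: f(x) = W2 @ relu(W1 @ x).
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(3, 8))
x = rng.normal(size=(4,))

def forward(W1, W2, x):
    return W2 @ np.maximum(W1 @ x, 0.0)

# Since relu(alpha * z) = alpha * relu(z) for alpha > 0, scaling the
# first layer by alpha and the second by 1/alpha preserves the network
# function exactly, while the parameter vector moves far away.
alpha = 100.0
out_orig = forward(W1, W2, x)
out_scaled = forward(alpha * W1, W2 / alpha, x)

print(np.allclose(out_orig, out_scaled))  # the two networks agree
```

Because the loss depends on the parameters only through the function, the loss surface around the rescaled minimum can be made arbitrarily sharp or flat under these measures.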

    Given $\epsilon > 0$, a minimum $\theta$, and a loss $L$, we define $C(L, \theta, \epsilon)$ as the largest (using inclusion as the partial order over the subsets of $\Theta$) connected set containing $\theta$ such that $\forall \theta' \in C(L, \theta, \epsilon), L(\theta') < L(\theta) + \epsilon$. The $\epsilon$-flatness will be defined as the volume of $C(L, \theta, \epsilon)$. We will call this measure the volume $\epsilon$-flatness.
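In one dimension the definition reduces to the length of the connected interval around $\theta$ on which the loss stays below $L(\theta) + \epsilon$. A grid-based sketch (the function and step size are illustrative choices, not from the paper):

```python
import numpy as np

def volume_eps_flatness_1d(loss, theta, eps, step=1e-3, max_steps=100_000):
    """Length of the largest connected interval around `theta` on which
    loss stays below loss(theta) + eps (1-D grid sketch of volume
    eps-flatness)."""
    thresh = loss(theta) + eps
    left = theta
    for _ in range(max_steps):        # expand leftwards until we leave C
        if loss(left - step) >= thresh:
            break
        left -= step
    right = theta
    for _ in range(max_steps):        # expand rightwards until we leave C
        if loss(right + step) >= thresh:
            break
        right += step
    return right - left

# A flat quadratic bowl has a larger volume eps-flatness than a sharp one.
flat = lambda t: 0.1 * t**2
sharp = lambda t: 10.0 * t**2
v_flat = volume_eps_flatness_1d(flat, 0.0, eps=0.01)
v_sharp = volume_eps_flatness_1d(sharp, 0.0, eps=0.01)
print(v_flat > v_sharp)
```

In higher dimensions the set $C(L, \theta, \epsilon)$ is generally not an interval, and estimating its volume requires sampling rather than a line scan.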

    Let $B_2(\epsilon, \theta)$ be a Euclidean ball centered on a minimum $\theta$ with radius $\epsilon$. Then, for a non-negative valued loss function $L$, the $\epsilon$-sharpness will be defined as proportional to
    $\frac{\max_{\theta' \in B_2(\epsilon, \theta)}(L(\theta') - L(\theta))}{1+L(\theta)}$
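The maximum over the ball is itself an optimization problem; a simple (lower-bounding) approach is random search over $B_2(\epsilon, \theta)$. A minimal sketch, assuming quadratic toy losses:

```python
import numpy as np

def eps_sharpness(loss, theta, eps, n_samples=2000, seed=0):
    """Monte Carlo sketch of eps-sharpness: approximate
    max over B2(eps, theta) of (L(theta') - L(theta)) / (1 + L(theta)).
    Random search only lower-bounds the true maximum."""
    rng = np.random.default_rng(seed)
    base = loss(theta)
    best = 0.0
    for _ in range(n_samples):
        d = rng.normal(size=theta.shape)          # random direction
        r = eps * rng.uniform() ** (1.0 / theta.size)  # radius in the ball
        d *= r / np.linalg.norm(d)
        best = max(best, loss(theta + d) - base)
    return best / (1.0 + base)

# A sharper quadratic bowl yields a larger eps-sharpness at its minimum.
flat = lambda t: 0.1 * float(t @ t)
sharp = lambda t: 10.0 * float(t @ t)
theta = np.zeros(2)
print(eps_sharpness(sharp, theta, 0.1) > eps_sharpness(flat, theta, 0.1))
```

The $1 + L(\theta)$ denominator makes the measure scale-aware in the loss value; the paper's point is that it is still not invariant to reparameterizations of the network weights.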

    New terms

    • Lipschitz constant

    To be honest, I don't understand all the details in the paper, but it is worth re-reading.

Original post: https://www.haomeiwen.com/subject/rqpcpxtx.html