Gradient Descent

Author: 光华_5206 | Published 2018-08-16 06:14


    cost function: for example, MSE (Mean Squared Error) can be expressed as $J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)^2$. More generally, $J(\theta) = \frac{1}{m}\sum_{i=1}^{m} L\big(h_\theta(x^{(i)}),\, y^{(i)}\big)$ for a per-sample loss $L$. Its gradient can be formulated as $\nabla_\theta J(\theta) = \frac{1}{m}\sum_{i=1}^{m} \nabla_\theta L\big(h_\theta(x^{(i)}),\, y^{(i)}\big)$.

    Computing this gradient requires iterating over all samples and summing their contributions. When the number of samples is very large, each update becomes very time-consuming.
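As a concrete sketch of this full-sample computation, assuming a linear model $h_w(x) = Xw$ with the MSE cost above (the function and variable names here are illustrative, not from the article):

```python
import numpy as np

def mse_gradient(w, X, y):
    """Gradient of J(w) = (1/2m) * sum((X @ w - y)**2), summed over ALL m samples."""
    m = X.shape[0]
    errors = X @ w - y          # residual for every sample in the data set
    return X.T @ errors / m     # sum over all m samples, then average
```

Note that the matrix product `X.T @ errors` touches every row of `X`, which is exactly the cost the article is pointing at when `m` is large.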

    To overcome this problem, we divide the data into smaller chunks, feed them to the computer one at a time, and update the weights of the neural network at the end of every step so that it fits the data it has seen.

    • iterative: the parameters are updated repeatedly, one small step at a time

    • learning rate: a hyperparameter that scales the size of each update step
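Together, these two ideas give the basic update rule $\theta \leftarrow \theta - \eta \nabla_\theta J(\theta)$. A minimal sketch on a one-dimensional quadratic cost (the cost function and the values of `lr` and `steps` are illustrative assumptions):

```python
def gradient_descent(x0, lr=0.1, steps=100):
    """Minimize f(x) = (x - 3)**2 by iterative updates scaled by the learning rate."""
    x = x0
    for _ in range(steps):
        grad = 2 * (x - 3)   # analytic gradient of the cost f
        x = x - lr * grad    # one iterative step of size lr along -grad
    return x
```

With `lr=0.1` the iterates contract toward the minimizer `x = 3`; a learning rate that is too large would instead make the updates diverge.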

    [figure: sgd.gif — animation of the gradient descent updates]
    • Batch Gradient Descent

      uses the entire training set to compute each update

    • Stochastic Gradient Descent

      uses a single sample for each update

    • Mini-Batch Gradient Descent

      a compromise between batch gradient descent and stochastic gradient descent: the samples are divided into several batches, and one pass over all of the batches constitutes one epoch.
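The three variants above differ only in how many samples feed each update, so a single mini-batch loop covers all of them. A sketch for the linear/MSE setting (function names and hyperparameter values are illustrative assumptions):

```python
import numpy as np

def minibatch_sgd(X, y, w, lr=0.1, batch_size=2, epochs=50, seed=0):
    """Mini-batch gradient descent on the MSE cost for a linear model.

    batch_size = len(X) recovers batch gradient descent;
    batch_size = 1 recovers stochastic gradient descent.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    for _ in range(epochs):
        idx = rng.permutation(m)                 # shuffle once per epoch
        for start in range(0, m, batch_size):    # one epoch = one pass over all batches
            b = idx[start:start + batch_size]
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b)
            w = w - lr * grad
    return w
```

Each inner iteration computes the gradient on one batch only, so the per-update cost is independent of the full data set size.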
