2018-08-09 deep NN

Author: 镜中无我 | Published 2019-02-18 17:15

Preface:

Deep learning: a collection of highly complex data-modeling algorithms built from multiple layers of nonlinear transformations.
In a sense, deep learning is synonymous with deep neural networks (DNNs).

Linear models have serious limitations.

Stacking multiple linear layers is equivalent to a single linear layer.

Activation functions are what introduce the nonlinearity.

TensorFlow provides seven activation functions, such as relu, sigmoid, and tanh (see the sketch below).
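As a minimal sketch (assuming the TensorFlow 1.x API used elsewhere in these notes), the lines below show where the activation enters; without it, two stacked weight matrices collapse into a single linear map:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 2))
w1 = tf.Variable(tf.random_normal([2, 3], stddev=1.0))
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1.0))

linear = tf.matmul(tf.matmul(x, w1), w2)    # same as x @ (w1 @ w2): still one linear layer
hidden = tf.nn.relu(tf.matmul(x, w1))       # the activation breaks the collapse
y = tf.nn.sigmoid(tf.matmul(hidden, w2))    # tf.nn.sigmoid / tf.nn.tanh are other built-ins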

Multiple layers can solve the exclusive-OR (XOR) problem.

Key point: the hidden layer extracts compound features of the inputs (see the sketch below).
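A hedged sketch of the classic XOR demonstration (the network size and hyperparameters here are illustrative, not from the original notes): the hidden layer learns compound features of the two inputs, which no single linear layer can represent.

import numpy as np
import tensorflow as tf

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
Y = np.array([[0], [1], [1], [0]], dtype=np.float32)

x = tf.placeholder(tf.float32, [None, 2])
y_ = tf.placeholder(tf.float32, [None, 1])
w1 = tf.Variable(tf.random_normal([2, 4]))
b1 = tf.Variable(tf.zeros([4]))
w2 = tf.Variable(tf.random_normal([4, 1]))
b2 = tf.Variable(tf.zeros([1]))

hidden = tf.nn.tanh(tf.matmul(x, w1) + b1)      # compound features of x1 and x2
y = tf.nn.sigmoid(tf.matmul(hidden, w2) + b2)
loss = tf.reduce_mean(tf.square(y_ - y))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3000):
        sess.run(train_op, feed_dict={x: X, y_: Y})
    print(sess.run(y, feed_dict={x: X}))        # typically close to [0, 1, 1, 0]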

loss function

  • cross entropy
    cross_entropy = -tf.reduce_mean(y_ * tf.log(y))
    How do we turn the result of forward propagation into a probability distribution?
    Answer: softmax is applied first (see the loss-function sketch after this list).
    [figure: softmax.png, the softmax formula]
  • MSE (mean squared error)
    [figure: MSE.png, the mean-squared-error formula]
  • custom loss functions tailored to the problem at hand
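A minimal sketch (TensorFlow 1.x) of the two standard losses above; y stands for the raw forward-propagation output (logits) and y_ for the labels, as in the formula in the bullet list:

import tensorflow as tf

y = tf.placeholder(tf.float32, [None, 10])    # forward-propagation output (logits)
y_ = tf.placeholder(tf.float32, [None, 10])   # one-hot labels (or real-valued targets for MSE)

# softmax turns the logits into a probability distribution, then cross entropy compares it to y_
prob = tf.nn.softmax(y)
cross_entropy = -tf.reduce_mean(
    y_ * tf.log(tf.clip_by_value(prob, 1e-10, 1.0)))   # as in the bullet; the clip guards against log(0)

# the fused, numerically safer equivalent
cross_entropy_fused = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))

# mean squared error, the usual choice for regression
mse = tf.reduce_mean(tf.square(y_ - y))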

optimization algorithms

  • gradient descent
  • backpropagation
  • stochastic gradient descent
  • mini-batch gradient descent, a trade-off between the two (see the sketch below)
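A hedged sketch (TensorFlow 1.x, with made-up toy data) of the trade-off: full-batch descent feeds every sample each step, stochastic descent feeds one, and mini-batches sit in between; backpropagation itself happens inside the optimizer's minimize().

import numpy as np
import tensorflow as tf

# toy regression data, made up just to make the sketch runnable
X = np.random.rand(128, 2).astype(np.float32)
Y = (X[:, :1] + 2 * X[:, 1:]).astype(np.float32)

x = tf.placeholder(tf.float32, [None, 2])
y_ = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable(tf.random_normal([2, 1]))
y = tf.matmul(x, w)
loss = tf.reduce_mean(tf.square(y_ - y))

# full-batch gradient descent would feed all 128 samples each step,
# purely stochastic gradient descent would feed one; mini-batches are the trade-off
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(loss)

batch_size = 8
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        start = (i * batch_size) % 128
        sess.run(train_step,
                 feed_dict={x: X[start:start + batch_size],
                            y_: Y[start:start + batch_size]})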

about the learning rate

  • setup: exponential decay
# tf.train.exponential_decay
# what it computes:
# decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
# usage:
learning_rate = tf.train.exponential_decay(0.1, global_step, 100, 0.96, staircase=True)
# with staircase=True, global_step / decay_steps is an integer division, so the rate
# is multiplied by 0.96 once every 100 steps and the schedule is stair-shaped
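A minimal usage sketch (the toy loss below is made up purely for illustration): passing global_step to minimize() makes the optimizer increment it, and that counter is what drives the decay schedule.

import tensorflow as tf

w = tf.Variable([3.0])
loss = tf.reduce_mean(tf.square(w))    # toy loss, made up just to drive the example

global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(0.1, global_step, 100, 0.96, staircase=True)

# passing global_step to minimize() makes the optimizer increment it after every
# update, so the rate drops by a factor of 0.96 once every 100 steps
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(
    loss, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(300):
        sess.run(train_step)
    print(sess.run([global_step, learning_rate]))   # 300 and 0.1 * 0.96 ** 3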

over-fitting

definition: the model memorizes the random noise in the training data instead of learning the overall trend
way to avoid it: regularization,
a term added to the loss that penalizes the complexity of the model's coefficients

  • L1 (one-norm) regularization, which makes the parameters sparse (more zeros)
  • L2 (two-norm) regularization, which is differentiable everywhere and therefore the usual choice (see the sketch after this list)
  • ExponentialMovingAverage
# tf.train.ExponentialMovingAverage is an object
# constructor:
def __init__(self, decay, num_updates=None, zero_debias=False)
# args: decay controls how the shadow variable, i.e. object.average(variable), is computed;
# num_updates is used to adjust decay and is usually global_step,
# an auxiliary (non-trainable) variable incremented by 1 every training step
# member function apply() creates and maintains a shadow variable for each variable passed in:
# object.apply(self, var_list=None)
# effective decay: decay = min{DECAY, (1.0 + num_updates) / (10.0 + num_updates)}, where DECAY is the fixed constructor argument
# object.average(variable) returns the shadow value: shadow_variable = decay * shadow_variable + (1 - decay) * variable
# this keeps the value close to its recent history and slows down its change
# the trainable parameters themselves are not modified; the averaged (shadow) values are what get
# substituted into the forward pass, typically at evaluation time (see the usage sketch below)
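A usage sketch (TensorFlow 1.x; the model and the coefficients 0.01 and 0.99 are illustrative) tying the two ideas above together: an L2 penalty added to the loss, and shadow variables maintained with tf.train.ExponentialMovingAverage.

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 2])
y_ = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable(tf.random_normal([2, 1]))
y = tf.matmul(x, w)

# L2 regularization: add a penalty on large weights to the data loss
# (tf.reduce_sum(tf.abs(w)) would give the L1 penalty instead)
lambda_ = 0.01
loss = tf.reduce_mean(tf.square(y_ - y)) + lambda_ * tf.nn.l2_loss(w)

global_step = tf.Variable(0, trainable=False)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(
    loss, global_step=global_step)

# maintain shadow copies of the trainable variables, refreshed after every training step
ema = tf.train.ExponentialMovingAverage(decay=0.99, num_updates=global_step)
with tf.control_dependencies([train_step]):
    train_op = ema.apply(tf.trainable_variables())

# at evaluation time, ema.average(w) (the shadow value) can be read instead of w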
