
TensorFlow | A Record of Pitfalls

Author: shawn233 | Published 2018-07-03 10:26

1 What is the logits argument when computing a loss?

Computing the loss involves two commonly used TensorFlow functions:

tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits)
# At the time of writing, softmax_cross_entropy_with_logits (without _v2)
# is explicitly marked as deprecated.

When I first used TensorFlow, I naively assumed that logits meant the predicted values (which I usually call y_hat). Not so: the logits argument is the raw, unscaled output of the last layer, before the sigmoid/softmax is applied (I call it activation below), and it typically relates to the prediction as:

y_hat = tf.nn.sigmoid(activation)
y_hat = tf.nn.softmax(activation)
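
To make the distinction concrete, here is a minimal sketch (the tensor values are illustrative): the loss op receives the pre-activation values and applies the sigmoid internally, which is numerically more stable than applying the sigmoid yourself and computing the cross-entropy by hand.

import tensorflow as tf

# Raw output of the last layer (the "logits"), before any nonlinearity.
activation = tf.constant([[2.0, -1.0, 0.5]])
labels = tf.constant([[1.0, 0.0, 1.0]])

# The op applies the sigmoid internally, so pass the raw values here.
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels,
                                               logits=activation)

# The prediction itself is computed separately.
y_hat = tf.nn.sigmoid(activation)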

2 With sigmoid as the output activation, the loss never reaches 0

This is most likely because your labels are not exactly 0 or 1, but values strictly in between. By the definition of the cross-entropy loss, its value is 0 only when the prediction and the label are both 0 or both 1. So whenever a label lies strictly between 0 and 1, the loss can never reach 0; its minimum is the entropy of the label itself, which is positive.
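
A quick numeric check in plain NumPy (the label value 0.3 is illustrative): even the perfect prediction y_hat == y leaves a positive loss.

import numpy as np

y = 0.3        # a soft label strictly between 0 and 1
y_hat = 0.3    # the best possible prediction: y_hat == y

# Binary cross-entropy at its minimum equals the entropy of the label.
loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
print(loss)    # ~0.611 > 0, so the loss can never reach 0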


3 How to use MNIST
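
A minimal sketch of the usual way to load MNIST in TensorFlow 1.x at the time of writing, using the bundled tensorflow.examples.tutorials.mnist helper (the data directory name is illustrative):

from tensorflow.examples.tutorials.mnist import input_data

# Downloads (if needed) and parses the MNIST files into ./MNIST_data/.
# one_hot=True encodes each label as a length-10 one-hot vector.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Images come flattened to 784 floats in [0, 1].
batch_xs, batch_ys = mnist.train.next_batch(100)  # shapes: (100, 784), (100, 10)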


4 Which dimension does tf.nn.softmax_cross_entropy_with_logits_v2 treat as the class dimension?

In one MNIST experiment, I put the class dimension first (dim=0) and used tf.nn.softmax_cross_entropy_with_logits_v2 to compute the cross-entropy. The result:

Epoch    1: cost=1611.621314344
Epoch    6: cost=17532.561093204
Epoch   11: cost=33408.859425080

Clearly, the model did not converge. Several hours later I found the problem: tf.nn.softmax_cross_entropy_with_logits_v2 treats the last dimension (dim=-1) as the class dimension by default. From the documentation:

tf.nn.softmax_cross_entropy_with_logits_v2(
    _sentinel=None,
    labels=None,
    logits=None,
    dim=-1,
    name=None
)

Args:

  • _sentinel: Used to prevent positional parameters. Internal, do not use.
  • labels: Each vector along the class dimension should hold a valid probability distribution e.g. for the case in which labels are of shape [batch_size, num_classes], each row of labels[i] must be a valid probability distribution.
  • logits: Unscaled log probabilities.
  • dim: The class dimension. Defaulted to -1 which is the last dimension.
  • name: A name for the operation (optional).

The documentation tells us that the dim argument specifies the class dimension. So whenever your classes are not on the last axis, remember to set dim, as in the sketch below.
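
A minimal sketch of the corrected call, assuming class-first logits of shape [num_classes, batch_size] (the placeholder names are illustrative):

import tensorflow as tf

num_classes = 10
logits = tf.placeholder(tf.float32, [num_classes, None])  # classes on axis 0
labels = tf.placeholder(tf.float32, [num_classes, None])

# dim=0 tells the op that axis 0 holds the classes; the default dim=-1
# would silently softmax over the batch axis instead.
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(
        labels=labels, logits=logits, dim=0))

With dim=0 in place, the experiments converged: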

Single-layer softmax
Epoch    1: cost=1.747955866
Epoch    6: cost=0.334705674
Epoch   11: cost=0.291128567
Epoch   16: cost=0.271443650
Epoch   21: cost=0.266406643
Epoch   26: cost=0.259507637
Epoch   31: cost=0.255616793
Epoch   36: cost=0.252413219
Epoch   41: cost=0.254052425
Epoch   46: cost=0.254678538
Optimization Finished!

Three-layer softmax
Epoch    1: cost=9.467425473
Epoch    6: cost=2.299335675
Epoch   11: cost=1.818574688
Epoch   16: cost=0.992218607
Epoch   21: cost=0.570668126
Epoch   26: cost=0.229845069
Epoch   31: cost=0.150805521
Epoch   36: cost=0.119380749
Epoch   41: cost=0.101064101
Epoch   46: cost=0.082242706
Optimization Finished!

May your models always converge~


5 What does tf.argmax do?

From the official documentation:

tf.argmax(
    input,
    axis=None,
    name=None,
    dimension=None,
    output_type=tf.int64
)
Returns the index with the largest value across axes of a tensor. (deprecated arguments)

Args:

  • input: A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, uint16, complex128, half, uint32, uint64.
  • axis: A Tensor. Must be one of the following types: int32, int64. int32 or int64, must be in the range [-rank(input), rank(input)). Describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
  • output_type: An optional tf.DType from: tf.int32, tf.int64. Defaults to tf.int64.
  • name: A name for the operation (optional).

From the documentation we learn that tf.argmax returns the indices of the largest values along a given axis of a tensor, and the operation reduces the tensor's rank by one. The second argument, axis, specifies the dimension along which the reduction runs, i.e. the dimension that is eliminated by the operation.
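
A minimal sketch of the axis semantics (TensorFlow 1.x style, matching the rest of this post; the values are illustrative):

import tensorflow as tf

# A [2, 3] tensor: 2 samples (rows), 3 class scores per sample (columns).
scores = tf.constant([[0.1, 0.7, 0.2],
                      [0.8, 0.1, 0.1]])

# Reduce across axis 1 (the class axis): the result has shape [2],
# one predicted class index per sample.
pred = tf.argmax(scores, axis=1)

with tf.Session() as sess:
    print(sess.run(pred))  # [1 0]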
