So I set about modifying the model, changing the data type of every parameter in the network below from tf.float32 to tf.bfloat16:
def net(image, training):
    # Helper arguments appear to be (input, in_channels, out_channels, kernel, stride).
    # Encoder: three convolutions; the last two use stride 2 to downsample.
    conv1 = relu(instance_norm(conv2d(image, 3, 32, 9, 1)))
    conv2 = relu(instance_norm(conv2d(conv1, 32, 64, 3, 2)))
    conv3 = relu(instance_norm(conv2d(conv2, 64, 128, 3, 2)))
    # Five residual blocks at 128 channels.
    res1 = residual(conv3, 128, 3, 1)
    res2 = residual(res1, 128, 3, 1)
    res3 = residual(res2, 128, 3, 1)
    res4 = residual(res3, 128, 3, 1)
    res5 = residual(res4, 128, 3, 1)
    # Decoder: two resize-convolutions upsample, then a final convolution with tanh.
    deconv1 = relu(instance_norm(resize_conv2d(res5, 128, 64, 3, 2, training)))
    deconv2 = relu(instance_norm(resize_conv2d(deconv1, 64, 32, 3, 2, training)))
    deconv3 = tf.nn.tanh(instance_norm(conv2d(deconv2, 32, 3, 9, 1)))
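bfloat16 is a data type that originated with TensorFlow, known as a truncated 16-bit floating point: it is formed by keeping only the first (most significant) 16 bits of a float32. It differs from the float16 defined by IEEE: bfloat16 keeps float32's 8 exponent bits and only 7 mantissa bits, whereas float16 has 5 exponent bits and 10 mantissa bits. It is mainly meant to replace float32 for neural network training, and because its representable range matches that of float32, it avoids the overflow-to-NaN problems that float16 runs into.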
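The truncation described above can be sketched in plain Python. This is only an illustration of the bit layout, not TensorFlow's own conversion routine (real conversions may round rather than truncate); the helper names `float32_to_bfloat16_bits`, `bfloat16_bits_to_float` and `truncate_to_bfloat16` are hypothetical.

```python
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE float16 value


def float32_to_bfloat16_bits(value):
    """Return the 16-bit pattern left after keeping the top 16 bits of a float32."""
    (bits32,) = struct.unpack('<I', struct.pack('<f', value))
    return bits32 >> 16  # drop the low 16 mantissa bits (plain truncation, no rounding)


def bfloat16_bits_to_float(bits16):
    """Expand a bfloat16 bit pattern back to a float by zero-filling the low 16 bits."""
    (value,) = struct.unpack('<f', struct.pack('<I', bits16 << 16))
    return value


def truncate_to_bfloat16(value):
    """Round-trip a value through the truncated 16-bit representation."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(value))


# Precision drops sharply: only 7 mantissa bits survive.
pi_bf16 = truncate_to_bfloat16(3.14159265)

# Range is preserved: a value far beyond FLOAT16_MAX stays finite.
big_bf16 = truncate_to_bfloat16(1.0e30)

print(pi_bf16, big_bf16)
```

The trade-off is visible in the two values: precision falls to two or three significant decimal digits, while magnitudes that would overflow float16 remain representable.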
def resize_conv2d(x, input_filters, output_filters, kernel, strides, training):
    with tf.variable_scope('conv_transpose'):
        # Use the static shape while training; fall back to the dynamic shape otherwise.
        height = x.get_shape()[1].value if training else tf.shape(x)[1]
        width = x.get_shape()[2].value if training else tf.shape(x)[2]
        # Enlarge first, then let the strided convolution bring the size back down.
        new_height = height * strides * 2
        new_width = width * strides * 2
        x_resized = tf.image.resize_images(x, [new_height, new_width], tf.image.ResizeMethod.NEAREST_NEIGHBOR)
        return conv2d(x_resized, input_filters, output_filters, kernel, strides)
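The size arithmetic in resize_conv2d can be checked with a short sketch. It assumes the inner conv2d uses SAME padding, so that a strided convolution yields ceil(input / stride); the source does not show conv2d, so that is an assumption. The helper name `resize_conv_output_size` and the starting size of 64 are hypothetical.

```python
import math


def resize_conv_output_size(size, strides):
    """Spatial size after resize_conv2d, ASSUMING the inner conv2d uses SAME padding."""
    resized = size * strides * 2         # nearest-neighbor resize, as in resize_conv2d
    return math.ceil(resized / strides)  # SAME-padded strided convolution


# deconv1 and deconv2 in net() both pass strides=2.
# 64 is an illustrative feature-map size, not a value taken from the source.
after_deconv1 = resize_conv_output_size(64, 2)
after_deconv2 = resize_conv_output_size(after_deconv1, 2)

print(after_deconv1, after_deconv2)
```

Under that assumption each resize-convolution doubles the spatial size, so the two decoder layers undo the two stride-2 downsampling convolutions at the start of the network.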
InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'Floor' with these attrs. Registered devices: [CPU,GPU], Registered kernels:
device='CPU'; T in [DT_DOUBLE]
device='CPU'; T in [DT_HALF]
device='CPU'; T in [DT_FLOAT]
device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_HALF]
device='GPU'; T in [DT_FLOAT]
[[Node: model/model/dropout/Floor = Floor[T=DT_BFLOAT16, _device="/gpu:0"](model/model/dropout/add)]]
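I changed the code several more times without success. A Google search finally turned up the cause: bfloat16 is only fully supported on Google's own TPUs (issue 21317). With no way around it, I gave up.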
"[bfloat16] support isn't complete for GPUs, as it's not supported natively by the devices. For performance you'll want to use float32 or float16 for GPU execution (though float16 can be difficult to train models with). TPUs support bfloat16 for effectively all operations (but you currently have to migrate your model to work on the TPU)."
<tf.Variable 'conv1/conv/weight/Adam:0' shape=(9, 9, 3, 32) dtype=float32_ref>
<tf.Variable 'conv1/conv/weight/Adam_1:0' shape=(9, 9, 3, 32) dtype=float32_ref>
<tf.Variable 'conv2/conv/weight/Adam:0' shape=(3, 3, 32, 64) dtype=float32_ref>
<tf.Variable 'deconv2/conv_transpose/conv/weight/Adam_1:0' shape=(3, 3, 64, 32) dtype=float32_ref>