A Deep Dive into TensorFlow Variables

Author: 金色暗影 | Published: 2017-11-10 22:57

This article explains the differences between the following APIs:

tf.Variable
tf.get_variable
tf.variable_scope(<scope_name>)
tf.name_scope(<scope_name>)

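Creating variables

Variables are manipulated through the tf.Variable class, and you can create one by instantiating it directly, for example a = tf.Variable([1.0], name='a'). The officially recommended approach, however, is to create variables implicitly by calling tf.get_variable. This function requires you to specify the variable's name; other replicas use that name to access the same variable, and it also names the variable's value when checkpointing and exporting models. tf.get_variable additionally lets you reuse a previously created variable of the same name, which makes it easy to define models that reuse layers. Instantiating tf.Variable directly, by contrast, always creates a new variable.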
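The contrast between "look up by name" and "always create" can be sketched without TensorFlow at all. The snippet below is a toy pure-Python model, not TensorFlow's implementation: ToyVariable and ToyVariableStore are hypothetical names invented for illustration, and the reuse flag only imitates the rule that tf.get_variable follows inside a variable scope.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable: just a name and a shape."""

    def __init__(self, name, shape):
        self.name = name
        self.shape = shape


class ToyVariableStore:
    """Hypothetical name-keyed store imitating get_variable's lookup rule."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, shape, reuse=False):
        if name in self._vars:
            if not reuse:
                # Same name requested again without asking for reuse.
                raise ValueError("Variable %s already exists" % name)
            return self._vars[name]
        if reuse:
            # Reuse requested, but nothing was ever created under this name.
            raise ValueError("Variable %s does not exist" % name)
        var = ToyVariable(name, shape)
        self._vars[name] = var
        return var


store = ToyVariableStore()
w1 = store.get_variable("weights", [5, 5, 32, 32])
w2 = store.get_variable("weights", [5, 5, 32, 32], reuse=True)
print(w1 is w2)  # True: the second call returns the existing object

try:
    store.get_variable("weights", [5, 5, 32, 32])
except ValueError as err:
    print(err)  # Variable weights already exists

# Direct construction never consults the store, so it always yields a new object.
a1 = ToyVariable("a", [1])
a2 = ToyVariable("a", [1])
print(a1 is a2)  # False
```

The design point is that the name is the lookup key: a by-name getter can hand back an existing object, while a constructor cannot.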

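The following examples from the official documentation illustrate this.

Create a variable named my_variable with shape [1, 2, 3]. By default its dtype is tf.float32 and its initial value is randomized via tf.glorot_uniform_initializer.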

import tensorflow as tf  # TensorFlow 1.x API
my_variable = tf.get_variable("my_variable", [1, 2, 3])

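You can also specify the dtype and initializer yourself. TensorFlow provides many initializers (see the API documentation). You can also pass an initial value directly as a tf.Tensor, in which case the shape argument is omitted because the tensor's shape is used: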

my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32, 
  initializer=tf.zeros_initializer)
other_variable = tf.get_variable("other_variable", dtype=tf.int32, 
  initializer=tf.constant([23, 42]))

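Variable collections

By default, every tf.Variable is placed in the following two collections:

tf.GraphKeys.GLOBAL_VARIABLES -- variables that can be shared across multiple devices

tf.GraphKeys.TRAINABLE_VARIABLES -- variables for which TensorFlow computes gradients

If you don't want a variable to be trained, you can put it in this collection instead: tf.GraphKeys.LOCAL_VARIABLES

For example: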

my_local = tf.get_variable("my_local", shape=(), 
collections=[tf.GraphKeys.LOCAL_VARIABLES])

Alternatively, pass trainable=False:

my_non_trainable = tf.get_variable("my_non_trainable", shape=(), trainable=False)

You can of course define your own collections as well.

Add the my_local variable to a collection named my_collection_name:

tf.add_to_collection("my_collection_name", my_local)

Retrieve the list of variables in the collection:

tf.get_collection("my_collection_name")

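Sharing variables

When building networks, a layer may be applied several times, or several inputs may go through the same layer. In such cases the same set of weight variables has to be reused, otherwise the model will not behave as intended. This is done by combining variable scopes, tf.variable_scope(), with tf.get_variable. The examples below are again taken from the official documentation.

For example, let's write a function that creates a convolution + ReLU layer: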

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)

In a real model the same convolution needs to be applied more than once, so you might be tempted to write:

input1 = tf.random_normal([1,10,10,32])
input2 = tf.random_normal([1,20,20,32])
x = conv_relu(input1, kernel_shape=[5, 5, 32, 32], bias_shape=[32])
x = conv_relu(x, kernel_shape=[5, 5, 32, 32], bias_shape = [32])  # This fails.

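You will quickly find that this does not work. The first call already creates the weights and biases variables, so on the second call those names are taken and cannot be used again. To reuse the same variables as the first call, wrap the calls in a variable scope and explicitly request reuse by setting reuse=True (my_image_filter is defined further below):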

with tf.variable_scope("model"):
  output1 = my_image_filter(input1)
with tf.variable_scope("model", reuse=True):
  output2 = my_image_filter(input2)

Or call scope.reuse_variables():

with tf.variable_scope("model") as scope:
  output1 = my_image_filter(input1)
  scope.reuse_variables()
  output2 = my_image_filter(input2)

More commonly, a model contains several convolutions that each need their own variables. Giving each one its own variable scope achieves that:

def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])
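To make the naming and reuse rules concrete, here is a toy pure-Python model of nested variable scopes. It is a sketch under stated assumptions, not TensorFlow code: ToyScopedStore and toy_image_filter are hypothetical names, the "variables" are plain dicts, and the only behaviours modelled are the scope-name prefix and the fact that reuse, once switched on, carries over to nested scopes.

```python
import contextlib


class ToyScopedStore:
    """Hypothetical model of variable scopes: a name prefix plus a reuse flag."""

    def __init__(self):
        self._vars = {}
        self._prefix = []
        self._reuse = False

    @contextlib.contextmanager
    def variable_scope(self, name, reuse=None):
        saved_reuse = self._reuse
        self._prefix.append(name)
        if reuse:
            # Once enabled, reuse stays on for every scope nested inside.
            self._reuse = True
        try:
            yield
        finally:
            self._prefix.pop()
            self._reuse = saved_reuse

    def get_variable(self, name):
        full_name = "/".join(self._prefix + [name])
        if self._reuse:
            if full_name not in self._vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        if full_name in self._vars:
            raise ValueError("Variable %s already exists" % full_name)
        self._vars[full_name] = {"name": full_name}
        return self._vars[full_name]


store = ToyScopedStore()


def toy_image_filter():
    """Mimics two stacked layers, each asking for its own weights and biases."""
    created = []
    for layer in ("conv1", "conv2"):
        with store.variable_scope(layer):
            created.append(store.get_variable("weights"))
            created.append(store.get_variable("biases"))
    return created


with store.variable_scope("model"):
    first = toy_image_filter()
with store.variable_scope("model", reuse=True):
    second = toy_image_filter()

print([v["name"] for v in first])
# ['model/conv1/weights', 'model/conv1/biases', 'model/conv2/weights', 'model/conv2/biases']
print(all(a is b for a, b in zip(first, second)))  # True
```

The two layers get distinct variables because their full names differ, while the second pass under reuse=True gets back exactly the objects created by the first.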

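Alternatively, use tf.Variable, in which case every call to conv_relu creates distinct variables.

A name scope, tf.name_scope, has no effect on tf.get_variable; it only adds a name prefix to tf.Variable and other named operations. The demonstrations below show this (each snippet assumes a fresh default graph, for example after calling tf.reset_default_graph()):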

with tf.variable_scope("conv1"):
    a = tf.Variable([1.0], name='a')
with tf.variable_scope("conv2"):
    b = tf.Variable([1.0], name='a')
print(a)
print(b)
# Output:
# <tf.Variable 'conv1/a:0' shape=(1,) dtype=float32_ref>
# <tf.Variable 'conv2/a:0' shape=(1,) dtype=float32_ref>

with tf.variable_scope("conv1"):
    a = tf.get_variable('a', [1])
with tf.variable_scope("conv2"):
    b = tf.get_variable('a', [1])
print(a)
print(b)
# Output:
# <tf.Variable 'conv1/a:0' shape=(1,) dtype=float32_ref>
# <tf.Variable 'conv2/a:0' shape=(1,) dtype=float32_ref>

with tf.variable_scope("conv1"):
    a = tf.get_variable('a', [1])
with tf.variable_scope("conv1"):
    b = tf.get_variable('a', [1])
print(a)
print(b)
# Raises:
# ValueError: Variable conv1/a already exists, disallowed. Did you mean to set reuse=True in 
# VarScope? Originally defined at:

with tf.variable_scope("conv1"):
    a = tf.get_variable('a', [1])
with tf.variable_scope("conv1", reuse=True):
    b = tf.get_variable('a', [1])
print(a)
print(b)
# Output:
# <tf.Variable 'conv1/a:0' shape=(1,) dtype=float32_ref>
# <tf.Variable 'conv1/a:0' shape=(1,) dtype=float32_ref>

with tf.name_scope("conv1"):
    a = tf.get_variable('a', [1])
with tf.name_scope("conv2"):
    b = tf.get_variable('a', [1])
print(a)
print(b)
# Raises:
# ValueError: Variable a already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

with tf.name_scope("conv1"):
    a = tf.get_variable('a', [1])
print(a)
# Output (the name scope has no effect on it):
# <tf.Variable 'a:0' shape=(1,) dtype=float32_ref>
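The naming rule behind these outputs can be summarised in a few lines of plain Python. This is only a sketch of the rule as demonstrated above, not TensorFlow code: get_variable_name and variable_name are hypothetical helpers, scopes are represented as (kind, name) pairs, and the automatic uniquifying of clashing names is deliberately left out.

```python
def get_variable_name(scopes, name):
    """Name a tf.get_variable-style variable: only variable scopes count."""
    prefix = [scope for kind, scope in scopes if kind == "variable_scope"]
    return "/".join(prefix + [name]) + ":0"


def variable_name(scopes, name):
    """Name a tf.Variable-style variable: every enclosing scope counts."""
    prefix = [scope for _, scope in scopes]
    return "/".join(prefix + [name]) + ":0"


in_name_scope = [("name_scope", "conv1")]
in_variable_scope = [("variable_scope", "conv1")]
mixed = [("variable_scope", "model"), ("name_scope", "layer1")]

print(get_variable_name(in_name_scope, "a"))      # a:0
print(variable_name(in_name_scope, "a"))          # conv1/a:0
print(get_variable_name(in_variable_scope, "a"))  # conv1/a:0
print(variable_name(in_variable_scope, "a"))      # conv1/a:0
print(get_variable_name(mixed, "a"))              # model/a:0
print(variable_name(mixed, "a"))                  # model/layer1/a:0
```

Seen this way, the error in the name-scope example follows directly: both calls resolve to the same unprefixed name a, so the second one collides with the first.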

Article link: https://www.haomeiwen.com/subject/tkgimxtx.html
