
TensorFlow Advanced Operations: Tensor Clipping

Author: 酵母小木 | Published 2020-02-04 18:28

    【Rambling】I was supposed to watch eight online lectures today, but it's already 4 p.m. and I've only finished one; the rest of the time went to thinking about you. Fighting!!!

    1. 【clip_by_value】: clipping by value

    • tf.maximum(tensor, threshold): lower clipping; every value must be at least the threshold (if data < threshold, data = threshold)
    • tf.minimum(tensor, threshold): upper clipping; every value must be at most the threshold (if data > threshold, data = threshold)
    • tf.clip_by_value(tensor, lower_threshold, upper_threshold): every value must lie between the two thresholds; this is equivalent to stacking tf.maximum and tf.minimum (a quick check follows the session below)
    In [49]: a = tf.range(10)
    Out[50]: <tf.Tensor: id=81, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])>
    
    In [52]: tf.maximum(a, 2)
    Out[52]: <tf.Tensor: id=83, shape=(10,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 8, 9])>
    
    In [53]: tf.minimum(a, 8)
    Out[53]: <tf.Tensor: id=85, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 8])>
    
    In [54]: tf.clip_by_value(a, 2, 8)
    Out[54]: <tf.Tensor: id=89, shape=(10,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 8, 8])>
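
    As a quick sanity check (a small sketch added here, not part of the original session), tf.clip_by_value(a, 2, 8) behaves like tf.minimum stacked on tf.maximum, using the same a = tf.range(10) as above:

    # minimal sketch: clip_by_value(a, lo, hi) gives the same result as minimum(maximum(a, lo), hi)
    composed = tf.minimum(tf.maximum(a, 2), 8)
    print(composed.numpy())   # [2 2 2 3 4 5 6 7 8 8], identical to tf.clip_by_value(a, 2, 8)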
    

    Using the clipping functions to implement the ReLU function

    In [55]: a = tf.range(10)
    In [57]: a = a - 5
    Out[58]: <tf.Tensor: id=95, shape=(10,), dtype=int32, numpy=array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])>
    # using tf.nn.relu is more readable
    In [59]: tf.nn.relu(a)
    Out[59]: <tf.Tensor: id=96, shape=(10,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4])>
    
    In [60]: tf.maximum(a, 0)
    Out[60]: <tf.Tensor: id=98, shape=(10,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4])>
    

    2. 【clip_by_norm】: clipping by norm, which rescales the data proportionally without changing its direction

    In [61]: a = tf.random.normal([2, 2], mean=10)
    Out[62]: <tf.Tensor: id=104, shape=(2, 2), dtype=float32, numpy=
    array([[ 9.855437,  8.334126],
           [ 9.349087, 10.561935]], dtype=float32)>
    
    In [63]: tf.norm(a)
    Out[63]: <tf.Tensor: id=109, shape=(), dtype=float32, numpy=19.11929>
    
    In [64]: aa = tf.clip_by_norm(a, 15)
    Out[66]: <tf.Tensor: id=126, shape=(2, 2), dtype=float32, numpy=
    array([[7.7320633, 6.538522 ],
           [7.334807 , 8.2863455]], dtype=float32)>
    
    In [65]: tf.norm(aa)
    Out[65]: <tf.Tensor: id=131, shape=(), dtype=float32, numpy=14.999999>
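
    To confirm that only the magnitude changes, here is a small sketch (added, reusing the a and aa from the session above): since tf.norm(a) is about 19.1 and exceeds the clip value 15, clip_by_norm simply rescales a by 15 / tf.norm(a), so the normalized direction stays the same.

    # manual rescaling reproduces tf.clip_by_norm when the norm exceeds the clip value
    manual = a * (15 / tf.norm(a))
    print(float(tf.reduce_max(tf.abs(manual - aa))))            # ~0: same values as aa
    # direction check: the unit vectors of a and aa coincide
    print(float(tf.norm(a / tf.norm(a) - aa / tf.norm(aa))))    # ~0: direction unchanged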
    

    3. 【Gradient clipping】: clipping gradients in gradient descent

    Training with gradient descent runs into two major obstacles: exploding gradients and vanishing gradients. Exploding gradients arise when the gradients are too large, so each update step overshoots; vanishing gradients arise when the gradients are too small for training to make progress. Gradient clipping rescales the whole set of gradients proportionally while keeping their direction unchanged, which mainly helps to suppress exploding gradients (it does not directly fix vanishing gradients). tf.clip_by_global_norm implements this, as the sketch below and the full training script after it show.
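
    Before the full training script, here is a minimal standalone sketch (added for illustration, with made-up tensors g1 and g2) of how tf.clip_by_global_norm works: it treats the whole list of gradients as one long vector, computes the global norm as the square root of the sum of squared per-tensor norms, and scales every tensor by the same factor clip_norm / global_norm whenever that norm is exceeded.

    import tensorflow as tf

    g1 = tf.ones([3, 3]) * 4.0   # per-tensor norm: sqrt(9 * 16) = 12
    g2 = tf.ones([4]) * 8.0      # per-tensor norm: sqrt(4 * 64) = 16
    global_norm = tf.sqrt(tf.norm(g1) ** 2 + tf.norm(g2) ** 2)
    print(float(global_norm))    # 20.0

    clipped, gn = tf.clip_by_global_norm([g1, g2], clip_norm=15.0)
    print(float(gn))             # 20.0, the global norm before clipping
    # both tensors are scaled by the same factor 15 / 20 = 0.75, keeping their directions
    print(float(tf.norm(clipped[0])), float(tf.norm(clipped[1])))   # 9.0 12.0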

    # Tensor clipping in practice
    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
    
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import datasets, layers, optimizers
    
    print(tf.__version__)
    
    (x, y), _ = datasets.mnist.load_data()
    # note: dividing by 50 (rather than the usual 255) keeps the inputs fairly large,
    # presumably so the gradients grow big enough for clipping to have a visible effect
    x = tf.convert_to_tensor(x, dtype=tf.float32) / 50.
    y = tf.convert_to_tensor(y)
    
    y = tf.one_hot(y, depth=10)
    print('x:', x.shape, 'y:', y.shape)
    train_db = tf.data.Dataset.from_tensor_slices((x, y)).batch(128).repeat(30)
    sample = next(iter(train_db))
    print('sample:', sample[0].shape, sample[1].shape)
    
    
    # print(x[0], y[0])
    
    
    def main():
        # 784 => 512
        w1, b1 = tf.Variable(tf.random.truncated_normal([784, 512], stddev=0.1)), tf.Variable(tf.zeros([512]))
        # 512 => 256
        w2, b2 = tf.Variable(tf.random.truncated_normal([512, 256], stddev=0.1)), tf.Variable(tf.zeros([256]))
        # 256 => 10
        w3, b3 = tf.Variable(tf.random.truncated_normal([256, 10], stddev=0.1)), tf.Variable(tf.zeros([10]))
    
        optimizer = optimizers.SGD(learning_rate=0.01)
    
        for step, (x, y) in enumerate(train_db):
    
            # [b, 28, 28] => [b, 784]
            x = tf.reshape(x, [-1, 784])
    
            with tf.GradientTape() as tape:
    
                # layer1.
                h1 = x @ w1 + b1
                h1 = tf.nn.relu(h1)
                # layer2
                h2 = h1 @ w2 + b2
                h2 = tf.nn.relu(h2)
                # output
                out = h2 @ w3 + b3
    
                # compute loss
                # [b, 10] - [b, 10]
                loss = tf.square(y - out)
                # [b, 10] => [b]
                loss = tf.reduce_mean(loss, axis=1)
                # [b] => scalar
                loss = tf.reduce_mean(loss)
    
            # compute gradient
            grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
            # print('==before==')
            # for g in grads:
            #     print(tf.norm(g))
    
            grads, _ = tf.clip_by_global_norm(grads, 15)
    
            # print('==after==')
            # for g in grads:
            #     print(tf.norm(g))
            # update w' = w - lr*grad
            optimizer.apply_gradients(zip(grads, [w1, b1, w2, b2, w3, b3]))
    
            if step % 100 == 0:
                print(step, 'loss:', float(loss))
    
    
    if __name__ == '__main__':
        main()
    

    Note: when running the code above in PyCharm, remember to close the IPython session first, otherwise you may run out of GPU memory.
