TF2 Keras (3): Training and Evaluation with Keras Built-in Methods


Author: 数科每日 | Published 2021-01-07 08:29

    This article is a set of study notes on the official documentation.


    Tip: in the English documentation, the model prediction step is sometimes called "prediction" and sometimes "inference".

    This article focuses on how to use the Keras built-in functions for model training, evaluation, and prediction. For full details, or to build a training loop of your own, see:

    Also, distributed computation is not covered here; for distributed training, see:

    Overview

    For Keras, the input data is generally one of:

    • NumPy arrays
    • tf.data Datasets

    Using MNIST as an example, here is how to build, train, and evaluate a simple model end to end.
    inputs = keras.Input(shape=(784,), name="digits")
    x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
    
    model = keras.Model(inputs=inputs, outputs=outputs)
    
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    
    # Preprocess the data (these are NumPy arrays)
    x_train = x_train.reshape(60000, 784).astype("float32") / 255
    x_test = x_test.reshape(10000, 784).astype("float32") / 255
    
    y_train = y_train.astype("float32")
    y_test = y_test.astype("float32")
    
    # Reserve 10,000 samples for validation
    x_val = x_train[-10000:]
    y_val = y_train[-10000:]
    x_train = x_train[:-10000]
    y_train = y_train[:-10000]
    
    model.compile(
        optimizer=keras.optimizers.RMSprop(),  # Optimizer
        # Loss function to minimize
        loss=keras.losses.SparseCategoricalCrossentropy(),
        # List of metrics to monitor
        metrics=[keras.metrics.SparseCategoricalAccuracy()],
    )
    
    print("Fit model on training data")
    history = model.fit(
        x_train,
        y_train,
        batch_size=64,
        epochs=2,
        # We pass some validation for
        # monitoring validation loss and metrics
        # at the end of each epoch
        validation_data=(x_val, y_val),
    )
    
    # Evaluate the model on the test data using `evaluate`
    print("Evaluate on test data")
    results = model.evaluate(x_test, y_test, batch_size=128)
    print("test loss, test acc:", results)
    
    # Generate predictions (probabilities -- the output of the last layer)
    # on new data using `predict`
    print("Generate predictions for 3 samples")
    predictions = model.predict(x_test[:3])
    print("predictions shape:", predictions.shape)
    

    The compile() Function

    The compile() call needs three things specified:

    • optimizer: the optimizer; a commonly used parameter is learning_rate
    • loss: the loss function to minimize
    • metrics: the list of metrics to monitor
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
        loss=keras.losses.SparseCategoricalCrossentropy(),
        metrics=[keras.metrics.SparseCategoricalAccuracy()],
    )
    

    If you want the default parameters, you can specify all three directly by string name:

    model.compile(
        optimizer="rmsprop",
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],
    )
    

    Built-in optimizers, losses, and metrics in TF2

    • Optimizers:
      • SGD() (with or without momentum)
      • RMSprop()
      • Adam()
      • etc.
    • Losses:
      • MeanSquaredError()
      • KLDivergence()
      • CosineSimilarity()
      • etc.
    • Metrics:
      • AUC()
      • Precision()
      • Recall()
      • etc.

    Custom Losses

    The loss function is critical to training: it ultimately determines the direction in which the model is optimized. Keras lets you define custom loss functions for extra flexibility. Depending on which arguments the custom loss needs, there are two ways to define one:

    • the loss needs only the model's predictions and the labels
    • the loss needs additional parameters besides the predictions and labels
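
    The examples below call get_uncompiled_model() and get_compiled_model(). These helpers come from the official guide but are not defined in this article; a minimal sketch, consistent with the MNIST model built above:

    def get_uncompiled_model():
        # Same architecture as the MNIST model in the overview.
        inputs = keras.Input(shape=(784,), name="digits")
        x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
        x = layers.Dense(64, activation="relu", name="dense_2")(x)
        outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
        return keras.Model(inputs=inputs, outputs=outputs)


    def get_compiled_model():
        # Compile with the string shortcuts shown earlier.
        model = get_uncompiled_model()
        model.compile(
            optimizer="rmsprop",
            loss="sparse_categorical_crossentropy",
            metrics=["sparse_categorical_accuracy"],
        )
        return model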

    Only predictions and labels needed

    This is the simplest case; you only need to implement a function:

    def custom_mean_squared_error(y_true, y_pred):
        return tf.math.reduce_mean(tf.square(y_true - y_pred))
    
    
    model = get_uncompiled_model()
    model.compile(optimizer=keras.optimizers.Adam(), loss=custom_mean_squared_error)
    
    # We need to one-hot encode the labels to use MSE
    y_train_one_hot = tf.one_hot(y_train, depth=10)
    model.fit(x_train, y_train_one_hot, batch_size=64, epochs=1)
    

    Additional parameters besides predictions and labels
    This requires implementing your own Loss class: subclass tf.keras.losses.Loss and implement the following two methods:

    • __init__(self)
    • call(self, y_true, y_pred)

    Example:

    class CustomMSE(keras.losses.Loss):
        def __init__(self, regularization_factor=0.1, name="custom_mse"):
            super().__init__(name=name)
            self.regularization_factor = regularization_factor
    
        def call(self, y_true, y_pred):
            mse = tf.math.reduce_mean(tf.square(y_true - y_pred))
            reg = tf.math.reduce_mean(tf.square(0.5 - y_pred))
            return mse + reg * self.regularization_factor
    
    
    model = get_uncompiled_model()
    model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE())
    
    y_train_one_hot = tf.one_hot(y_train, depth=10)
    model.fit(x_train, y_train_one_hot, batch_size=64, epochs=1)
    

    Custom metrics

    Similarly, you can define your own custom metrics, but in this case you must subclass the tf.keras.metrics.Metric class.

    Example:

    class CategoricalTruePositives(keras.metrics.Metric):
        def __init__(self, name="categorical_true_positives", **kwargs):
            super(CategoricalTruePositives, self).__init__(name=name, **kwargs)
            self.true_positives = self.add_weight(name="ctp", initializer="zeros")
    
        def update_state(self, y_true, y_pred, sample_weight=None):
            y_pred = tf.reshape(tf.argmax(y_pred, axis=1), shape=(-1, 1))
            values = tf.cast(y_true, "int32") == tf.cast(y_pred, "int32")
            values = tf.cast(values, "float32")
            if sample_weight is not None:
                sample_weight = tf.cast(sample_weight, "float32")
                values = tf.multiply(values, sample_weight)
            self.true_positives.assign_add(tf.reduce_sum(values))
    
        def result(self):
            return self.true_positives
    
        def reset_states(self):
            # The state of the metric will be reset at the start of each epoch.
            self.true_positives.assign(0.0)
    
    
    model = get_uncompiled_model()
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
        loss=keras.losses.SparseCategoricalCrossentropy(),
        metrics=[CategoricalTruePositives()],
    )
    model.fit(x_train, y_train, batch_size=64, epochs=3)
    

    Adding a loss or metric at a specific layer

    The loss and metrics specified in compile() apply to the model's outputs, but Keras also lets you attach a loss or metric to a particular layer. This works whether you build the model by subclassing or with the Functional API.

    Model

    Add self.add_loss(loss_value) inside the layer's call() method:

    class ActivityRegularizationLayer(layers.Layer):
        def call(self, inputs):
            self.add_loss(tf.reduce_sum(inputs) * 0.1)
            return inputs  # Pass-through layer.
    
    
    inputs = keras.Input(shape=(784,), name="digits")
    x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    
    # Insert activity regularization as a layer
    x = ActivityRegularizationLayer()(x)
    
    x = layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = layers.Dense(10, name="predictions")(x)
    
    model = keras.Model(inputs=inputs, outputs=outputs)
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    
    # The displayed loss will be much higher than before
    # due to the regularization component.
    model.fit(x_train, y_train, batch_size=64, epochs=1)

    The same pattern works for metrics: call self.add_metric() inside call():

    class MetricLoggingLayer(layers.Layer):
        def call(self, inputs):
            # The `aggregation` argument defines
            # how to aggregate the per-batch values
            # over each epoch:
            # in this case we simply average them.
            self.add_metric(
                keras.backend.std(inputs), name="std_of_activation", aggregation="mean"
            )
            return inputs  # Pass-through layer.
    
    
    inputs = keras.Input(shape=(784,), name="digits")
    x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    
    # Insert std logging as a layer.
    x = MetricLoggingLayer()(x)
    
    x = layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = layers.Dense(10, name="predictions")(x)
    
    model = keras.Model(inputs=inputs, outputs=outputs)
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    model.fit(x_train, y_train, batch_size=64, epochs=1)
    

    Functional API

    inputs = keras.Input(shape=(784,), name="digits")
    x1 = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x2 = layers.Dense(64, activation="relu", name="dense_2")(x1)
    outputs = layers.Dense(10, name="predictions")(x2)
    model = keras.Model(inputs=inputs, outputs=outputs)
    
    model.add_loss(tf.reduce_sum(x1) * 0.1)
    
    model.add_metric(keras.backend.std(x1), name="std_of_activation", aggregation="mean")
    
    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    model.fit(x_train, y_train, batch_size=64, epochs=1)
    

    If a loss has already been attached inside the model via add_loss(), the loss argument to compile() can be omitted. For example, in the multi-input model below, the LogisticEndpoint layer adds its own loss, so compile() is called without a loss argument.
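
    LogisticEndpoint is defined in the official guide but not in this article; a minimal sketch of such a layer (the exact details here are illustrative):

    class LogisticEndpoint(keras.layers.Layer):
        def __init__(self, name=None):
            super().__init__(name=name)
            self.loss_fn = keras.losses.BinaryCrossentropy(from_logits=True)

        def call(self, logits, targets):
            # Compute the training-time loss and register it with add_loss().
            self.add_loss(self.loss_fn(targets, logits))
            # Return the inference-time prediction tensor.
            return tf.nn.softmax(logits)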

    import numpy as np
    
    inputs = keras.Input(shape=(3,), name="inputs")
    targets = keras.Input(shape=(10,), name="targets")
    logits = keras.layers.Dense(10)(inputs)
    predictions = LogisticEndpoint(name="predictions")(logits, targets)
    
    model = keras.Model(inputs=[inputs, targets], outputs=predictions)
    model.compile(optimizer="adam")  # No loss argument!
    
    data = {
        "inputs": np.random.random((3, 3)),
        "targets": np.random.random((3, 10)),
    }
    model.fit(data)
    

    Automatically splitting off validation data

    The fit() function can automatically split the data into a training set and a validation set; just pass validation_split. For example, validation_split=0.2 holds out the last 20% of the samples (taken before any shuffling) for validation.

    Note: to use this feature, the input data must be NumPy arrays.

    model = get_compiled_model()
    model.fit(x_train, y_train, batch_size=64, validation_split=0.2, epochs=1)
    

    Training and evaluation from tf.data Datasets

    A training-and-evaluation template:

    model = get_compiled_model()
    
    # First, let's create a training Dataset instance.
    # For the sake of our example, we'll use the same MNIST data as before.
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    # Shuffle and slice the dataset.
    train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
    
    # Now we get a test dataset.
    test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))
    test_dataset = test_dataset.batch(64)
    
    # Since the dataset already takes care of batching,
    # we don't pass a `batch_size` argument.
    model.fit(train_dataset, epochs=3)
    
    # You can also evaluate or predict on a dataset.
    print("Evaluate")
    result = model.evaluate(test_dataset)
    dict(zip(model.metrics_names, result))
    

    In this example, model.fit() does not specify a batch_size, because the Dataset already batches the data into batches of 64. epochs=3 means run three epochs: each epoch goes through all of the data once, 64 samples at a time. In other words, every epoch consumes the entire dataset, and the Dataset is automatically reset at the end of each epoch so it can be reused by the next one.

    steps_per_epoch

    steps_per_epoch is a parameter of fit() whose meaning is often asked about online; the API documentation does not explain it as clearly as the example here does.

    First, an example:

    model = get_compiled_model()
    
    # Prepare the training dataset
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
    
    # Only use the 100 batches per epoch (that's 64 * 100 samples)
    model.fit(train_dataset, epochs=3, steps_per_epoch=100)
    

    Here is what epochs combined with steps_per_epoch means:

    • steps_per_epoch specifies that each epoch runs 100 steps, where a step means one batch. With 64 samples per batch, each epoch trains on 6,400 samples.
    • epochs=3 means the whole run lasts three epochs, so the entire training uses only 3 × 100 × 64 samples, far fewer than in the previous example.
    • Note that when steps_per_epoch is set, the Dataset is not reset at the end of each epoch; training simply keeps drawing batches until the Dataset runs out of data.
    • In general, you do not need to set steps_per_epoch yourself.

    Using a validation Dataset

    The fit() calls above did not include validation data; a validation Dataset can also be passed to fit() so that validation runs automatically.

    model = get_compiled_model()
    
    # Prepare the training dataset
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
    
    # Prepare the validation dataset
    val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
    val_dataset = val_dataset.batch(64)
    
    model.fit(train_dataset, epochs=1, validation_data=val_dataset)
    

    validation_steps

    validation_steps controls how many batches of validation data are used in each validation run:

    model = get_compiled_model()
    
    # Prepare the training dataset
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
    
    # Prepare the validation dataset
    val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
    val_dataset = val_dataset.batch(64)
    
    model.fit(
        train_dataset,
        epochs=1,
        # Only run validation using the first 10 batches of the dataset
        # using the `validation_steps` argument
        validation_data=val_dataset,
        validation_steps=10,
    )
    

    Supported data formats

    TF2 Keras supports the following data formats:

    • NumPy arrays: for small datasets that fit in memory
    • tf.data Dataset: for large datasets; can also be used in distributed training
    • Sequence: for large datasets that need substantial Python-side processing (keras.utils.Sequence)

    keras.utils.Sequence

    A sequence subclasses keras.utils.Sequence and implements the required methods. It has two advantages:

    • it is safe for multiprocessing
    • it can be shuffled

    Two methods must be implemented:

    • __getitem__
    • __len__

    from tensorflow.keras.utils import Sequence  # base class for data sequences
    from skimage.io import imread
    from skimage.transform import resize
    import numpy as np
    
    # Here, `filenames` is list of path to the images
    # and `labels` are the associated labels.
    
    class CIFAR10Sequence(Sequence):
        def __init__(self, filenames, labels, batch_size):
            self.filenames, self.labels = filenames, labels
            self.batch_size = batch_size
    
        def __len__(self):
            return int(np.ceil(len(self.filenames) / float(self.batch_size)))
    
        def __getitem__(self, idx):
            batch_x = self.filenames[idx * self.batch_size:(idx + 1) * self.batch_size]
            batch_y = self.labels[idx * self.batch_size:(idx + 1) * self.batch_size]
            return np.array([
                resize(imread(filename), (200, 200))
                   for filename in batch_x]), np.array(batch_y)
    
    sequence = CIFAR10Sequence(filenames, labels, batch_size)
    model.fit(sequence, epochs=10)
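
    Since a Sequence is safe for multiprocessing, fit() can load batches in parallel worker processes. A sketch, assuming the TF 2.x fit() arguments workers and use_multiprocessing:

    model.fit(sequence, epochs=10, workers=4, use_multiprocessing=True)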
    

    Sample weights and class weights

    Class weights

    For classification tasks, you can give extra weight to the classes you care most about, or compensate for imbalanced data, which avoids resampling. Training will then pay more attention to the configured classes.

    Example: boosting recognition of the digit 5 in the MNIST data.

    import numpy as np
    
    class_weight = {
        0: 1.0,
        1: 1.0,
        2: 1.0,
        3: 1.0,
        4: 1.0,
        # Set weight "2" for class "5",
        # making this class 2x more important
        5: 2.0,
        6: 1.0,
        7: 1.0,
        8: 1.0,
        9: 1.0,
    }
    
    print("Fit with class weight")
    model = get_compiled_model()
    model.fit(x_train, y_train, class_weight=class_weight, batch_size=64, epochs=1)
    

    Sample weights

    As below, give some samples a weight of 2.0 (the default is 1.0).

    NumPy

    sample_weight = np.ones(shape=(len(y_train),))
    sample_weight[y_train == 5] = 2.0
    
    print("Fit with sample weight")
    model = get_compiled_model()
    model.fit(x_train, y_train, sample_weight=sample_weight, batch_size=64, epochs=1)
    

    Dataset

    sample_weight = np.ones(shape=(len(y_train),))
    sample_weight[y_train == 5] = 2.0
    
    # Create a Dataset that includes sample weights
    # (3rd element in the return tuple).
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train, sample_weight))
    
    # Shuffle and slice the dataset.
    train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
    
    model = get_compiled_model()
    model.fit(train_dataset, epochs=1)
    

    Passing data to multi-input, multi-output models

    image_input = keras.Input(shape=(32, 32, 3), name="img_input")
    timeseries_input = keras.Input(shape=(None, 10), name="ts_input")
    
    x1 = layers.Conv2D(3, 3)(image_input)
    x1 = layers.GlobalMaxPooling2D()(x1)
    
    x2 = layers.Conv1D(3, 3)(timeseries_input)
    x2 = layers.GlobalMaxPooling1D()(x2)
    
    x = layers.concatenate([x1, x2])
    
    score_output = layers.Dense(1, name="score_output")(x)
    class_output = layers.Dense(5, name="class_output")(x)
    
    model = keras.Model(
        inputs=[image_input, timeseries_input], outputs=[score_output, class_output]
    )
    
    keras.utils.plot_model(model, "multi_input_and_output_model.png", show_shapes=True)
    
    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss=[keras.losses.MeanSquaredError(), keras.losses.CategoricalCrossentropy()],
    )
    
    (Figure: the model graph saved to multi_input_and_output_model.png by plot_model.)

    Multiple metrics

    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss=[keras.losses.MeanSquaredError(), keras.losses.CategoricalCrossentropy()],
        metrics=[
            [
                keras.metrics.MeanAbsolutePercentageError(),
                keras.metrics.MeanAbsoluteError(),
            ],
            [keras.metrics.CategoricalAccuracy()],
        ],
    )
    

    Since the output layers have names, you can also use dicts:

    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss={
            "score_output": keras.losses.MeanSquaredError(),
            "class_output": keras.losses.CategoricalCrossentropy(),
        },
        metrics={
            "score_output": [
                keras.metrics.MeanAbsolutePercentageError(),
                keras.metrics.MeanAbsoluteError(),
            ],
            "class_output": [keras.metrics.CategoricalAccuracy()],
        },
    )
    

    Giving the losses different weights:

    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss={
            "score_output": keras.losses.MeanSquaredError(),
            "class_output": keras.losses.CategoricalCrossentropy(),
        },
        metrics={
            "score_output": [
                keras.metrics.MeanAbsolutePercentageError(),
                keras.metrics.MeanAbsoluteError(),
            ],
            "class_output": [keras.metrics.CategoricalAccuracy()],
        },
        loss_weights={"score_output": 2.0, "class_output": 1.0},
    )
    

    If you do not want to compute a loss for a particular output, you can omit its loss; that output will then be used for prediction only, not for training:

    # List loss version
    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss=[None, keras.losses.CategoricalCrossentropy()],
    )
    
    # Or dict loss version
    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss={"class_output": keras.losses.CategoricalCrossentropy()},
    )
    

    Passing the data works in a similar way: as lists of NumPy arrays, or as dicts keyed by input/output name:

    model.compile(
        optimizer=keras.optimizers.RMSprop(1e-3),
        loss=[keras.losses.MeanSquaredError(), keras.losses.CategoricalCrossentropy()],
    )
    
    # Generate dummy NumPy data
    img_data = np.random.random_sample(size=(100, 32, 32, 3))
    ts_data = np.random.random_sample(size=(100, 20, 10))
    score_targets = np.random.random_sample(size=(100, 1))
    class_targets = np.random.random_sample(size=(100, 5))
    
    # Fit on lists
    model.fit([img_data, ts_data], [score_targets, class_targets], batch_size=32, epochs=1)
    
    # Alternatively, fit on dicts
    model.fit(
        {"img_input": img_data, "ts_input": ts_data},
        {"score_output": score_targets, "class_output": class_targets},
        batch_size=32,
        epochs=1,
    )
    

    Dataset

    train_dataset = tf.data.Dataset.from_tensor_slices(
        (
            {"img_input": img_data, "ts_input": ts_data},
            {"score_output": score_targets, "class_output": class_targets},
        )
    )
    train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
    
    model.fit(train_dataset, epochs=1)
    

    Using callbacks

    Keras lets you hook callbacks into different stages of training to accomplish various tasks:

    • run validation at various points during training
    • create a checkpoint every n epochs, or once some condition is met (e.g., accuracy passes a threshold)
    • lower the learning rate when training plateaus
    • fine-tune parameters when training plateaus
    • send a notification (SNS, email) when training finishes or some event occurs

    For example, using the built-in EarlyStopping callback:
    model = get_compiled_model()
    
    callbacks = [
        keras.callbacks.EarlyStopping(
            # Stop training when `val_loss` is no longer improving
            monitor="val_loss",
            # "no longer improving" being defined as "no better than 1e-2 less"
            min_delta=1e-2,
            # "no longer improving" being further defined as "for at least 2 epochs"
            patience=2,
            verbose=1,
        )
    ]
    model.fit(
        x_train,
        y_train,
        epochs=20,
        batch_size=64,
        callbacks=callbacks,
        validation_split=0.2,
    )
    

    Built-in callbacks

    • ModelCheckpoint: periodically save checkpoints
    • EarlyStopping: stop training early
    • TensorBoard: write TensorBoard logs (more details in the "Visualizing training" section below)
    • CSVLogger: stream the training loss and metric values to a CSV file
    • etc.
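
    For example, CSVLogger only needs a file path (the filename here is illustrative):

    callbacks = [keras.callbacks.CSVLogger("training_log.csv", append=False)]
    model.fit(x_train, y_train, batch_size=64, epochs=2, callbacks=callbacks)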

    Custom callbacks

    You can create your own callback by subclassing keras.callbacks.Callback; see the official guide on writing custom callbacks for details.

    class LossHistory(keras.callbacks.Callback):
        def on_train_begin(self, logs):
            self.per_batch_losses = []
    
        def on_batch_end(self, batch, logs):
            self.per_batch_losses.append(logs.get("loss"))
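
    To use it, instantiate the callback, pass it to fit(), and read the recorded values afterwards (history_cb is an illustrative name):

    history_cb = LossHistory()
    model.fit(x_train, y_train, batch_size=64, epochs=1, callbacks=[history_cb])
    print(history_cb.per_batch_losses[:5])  # first few per-batch loss values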
    

    Creating checkpoints

    When training on a relatively large dataset, it is important to save checkpoints periodically so that, if something goes wrong, you can resume from a checkpoint instead of losing all the time invested.

    model = get_compiled_model()
    
    callbacks = [
        keras.callbacks.ModelCheckpoint(
            # Path where to save the model
            # The two parameters below mean that we will overwrite
            # the current checkpoint if and only if
            # the `val_loss` score has improved.
            # The saved model name will include the current epoch.
            filepath="mymodel_{epoch}",
            save_best_only=True,  # Only save a model if `val_loss` has improved.
            monitor="val_loss",
            verbose=1,
        )
    ]
    model.fit(
        x_train, y_train, epochs=2, batch_size=64, callbacks=callbacks, validation_split=0.2
    )
    

    Using ModelCheckpoint for fault tolerance: when training crashes, restore from the most recent checkpoint:

    import os
    
    # Prepare a directory to store all the checkpoints.
    checkpoint_dir = "./ckpt"
    if not os.path.exists(checkpoint_dir):
        os.makedirs(checkpoint_dir)
    
    
    def make_or_restore_model():
        # Either restore the latest model, or create a fresh one
        # if there is no checkpoint available.
        checkpoints = [checkpoint_dir + "/" + name for name in os.listdir(checkpoint_dir)]
        if checkpoints:
            latest_checkpoint = max(checkpoints, key=os.path.getctime)
            print("Restoring from", latest_checkpoint)
            return keras.models.load_model(latest_checkpoint)
        print("Creating a new model")
        return get_compiled_model()
    
    
    model = make_or_restore_model()
    callbacks = [
        # This callback saves a SavedModel every 100 batches.
        # We include the training loss in the saved model name.
        keras.callbacks.ModelCheckpoint(
            filepath=checkpoint_dir + "/ckpt-loss={loss:.2f}", save_freq=100
        )
    ]
    model.fit(x_train, y_train, epochs=1, callbacks=callbacks)
    

    Adjusting the learning rate

    Generally, as training reaches its later stages the learning rate should be lowered (learning rate decay). There are two ways to adjust it: a static schedule or a dynamic schedule.

    static schedule

    Pass the schedule as the optimizer's learning_rate parameter:

    initial_learning_rate = 0.1
    lr_schedule = keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate, decay_steps=100000, decay_rate=0.96, staircase=True
    )
    
    optimizer = keras.optimizers.RMSprop(learning_rate=lr_schedule)
    

    Available built-in schedules:

    • ExponentialDecay
    • PiecewiseConstantDecay
    • PolynomialDecay
    • InverseTimeDecay

    dynamic schedule

    Dynamic adjustment requires a callback; a good starting point is the built-in ReduceLROnPlateau callback.
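
    A minimal sketch of using it (the parameter values here are illustrative, not recommendations):

    callbacks = [
        keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss",  # quantity to watch
            factor=0.5,  # multiply the learning rate by this factor when triggered
            patience=3,  # epochs with no improvement before reducing
            min_lr=1e-6,  # lower bound on the learning rate
        )
    ]
    model.fit(x_train, y_train, epochs=20, callbacks=callbacks, validation_split=0.2)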

    Visualizing training

    Using TensorBoard

    To use TensorBoard, first install it with pip (if it is not already installed); you can then launch it like this:

    tensorboard --logdir=/full_path_to_your_logs
    

    The TensorBoard callback

    The easiest way to use TensorBoard is through the callback. A simple example:

    keras.callbacks.TensorBoard(
        log_dir="/full_path_to_your_logs",
        histogram_freq=0,  # How often to log histogram visualizations
        embeddings_freq=0,  # How often to log embedding visualizations
        update_freq="epoch",  # How often to write logs (default: once per epoch)
    )
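
    Creating the callback alone does nothing; it takes effect once passed to fit() (the variable name tensorboard_cb is illustrative):

    tensorboard_cb = keras.callbacks.TensorBoard(log_dir="/full_path_to_your_logs")
    model.fit(x_train, y_train, batch_size=64, epochs=2, callbacks=[tensorboard_cb])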
    
