TF2 Keras (2): Building Models with the Functional API

Author: 数科每日 | Published 2021-01-06 11:30

This post is a set of study notes on the official documentation.


Compared with the Sequential API, the Functional API is a more involved but more powerful way to build models. It supports everything Sequential can do, and in addition shared layers, non-linear topologies, multiple inputs, multiple outputs, and complex architectures such as ResNet.

Keras's design philosophy for deep learning is that a model is a directed acyclic graph (DAG) of layers; building a model means constructing this graph.

A simple Functional-API example

inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10)(x)
model = keras.Model(inputs, outputs)

Two ways to inspect a model

summary
model.summary()
plot_model
keras.utils.plot_model(model, "my_first_model_with_shape_info.png", show_shapes=True)

Saving and loading a model

A model saved as follows contains:

  • the model architecture
  • the weight values
  • the arguments passed to compile
  • the optimizer and its state
model.save("path_to_my_model")
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model("path_to_my_model")

Composing models from simpler models

In Keras, a model is much like a function: it has inputs and outputs. And since it can be called like a function, several models can be composed together. Below is a combination of an encoder and a decoder.

encoder_input = keras.Input(shape=(28, 28, 1), name="original_img")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name="encoder")
encoder.summary()

decoder_input = keras.Input(shape=(16,), name="encoded_img")
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)

decoder = keras.Model(decoder_input, decoder_output, name="decoder")
decoder.summary()

autoencoder_input = keras.Input(shape=(28, 28, 1), name="img")
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name="autoencoder")
autoencoder.summary()

Since a model can be treated as a function, a model can also contain other models. This makes ensemble learning straightforward to express:

def get_model():
    inputs = keras.Input(shape=(128,))
    outputs = layers.Dense(1)(inputs)
    return keras.Model(inputs, outputs)


model1 = get_model()
model2 = get_model()
model3 = get_model()

inputs = keras.Input(shape=(128,))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
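Why averaging helps: if the three models make roughly independent errors, the mean prediction has a lower error than any single model. A NumPy sketch with synthetic data (not part of the original example):

```python
import numpy as np

rng = np.random.default_rng(0)
truth = 1.0
# three noisy "models": each prediction is the truth plus independent noise
preds = truth + rng.normal(scale=1.0, size=(3, 10000))
ensemble = preds.mean(axis=0)

individual_mse = ((preds - truth) ** 2).mean(axis=1)
ensemble_mse = ((ensemble - truth) ** 2).mean()
# averaging independent errors shrinks the MSE by roughly the ensemble size
print(individual_mse.mean() / ensemble_mse)
```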

Building complex models

The Functional API can also build complex models. Consider the following scenario:

Score customer complaints by severity and route each complaint to the appropriate department. The model has 3 inputs:

  • the complaint title
  • the complaint body text
  • tags

and 2 outputs:

  • the handling priority
  • the department the complaint should be routed to
num_tags = 12  # Number of unique issue tags
num_words = 10000  # Size of vocabulary obtained when preprocessing text data
num_departments = 4  # Number of departments for predictions

title_input = keras.Input(
    shape=(None,), name="title"
)  # Variable-length sequence of ints
body_input = keras.Input(shape=(None,), name="body")  # Variable-length sequence of ints
tags_input = keras.Input(
    shape=(num_tags,), name="tags"
)  # Binary vectors of size `num_tags`

# Embed each word in the title into a 64-dimensional vector
title_features = layers.Embedding(num_words, 64)(title_input)
# Embed each word in the text into a 64-dimensional vector
body_features = layers.Embedding(num_words, 64)(body_input)

# Reduce sequence of embedded words in the title into a single 128-dimensional vector
title_features = layers.LSTM(128)(title_features)
# Reduce sequence of embedded words in the body into a single 32-dimensional vector
body_features = layers.LSTM(32)(body_features)

# Merge all available features into a single large vector via concatenation
x = layers.concatenate([title_features, body_features, tags_input])

# Stick a logistic regression for priority prediction on top of the features
priority_pred = layers.Dense(1, name="priority")(x)
# Stick a department classifier on top of the features
department_pred = layers.Dense(num_departments, name="department")(x)

# Instantiate an end-to-end model predicting both priority and department
model = keras.Model(
    inputs=[title_input, body_input, tags_input],
    outputs=[priority_pred, department_pred],
)

keras.utils.plot_model(model, "multi_input_and_output_model.png", show_shapes=True)

Because there are multiple outputs, there are also multiple loss functions. A loss can be assigned to each output by layer name, and different outputs can even be given different loss weights.

model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss={
        "priority": keras.losses.BinaryCrossentropy(from_logits=True),
        "department": keras.losses.CategoricalCrossentropy(from_logits=True),
    },
    loss_weights=[1.0, 0.2],
)
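What loss_weights does: the total training loss is the weighted sum of the per-output losses, here 1.0 × priority loss + 0.2 × department loss. A NumPy sketch of the two from_logits losses (a simplified re-derivation for illustration, not the exact Keras reduction code):

```python
import numpy as np

def bce_from_logits(y_true, logits):
    # numerically stable binary cross-entropy on raw logits
    return np.mean(np.maximum(logits, 0) - logits * y_true
                   + np.log1p(np.exp(-np.abs(logits))))

def cce_from_logits(y_true, logits):
    # softmax cross-entropy on raw logits, averaged over the batch
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean((y_true * log_probs).sum(axis=1))

priority_loss = bce_from_logits(np.array([1.0, 0.0]), np.array([2.0, -1.0]))
dept_loss = cce_from_logits(np.eye(4)[[0, 2]], np.zeros((2, 4)))
total_loss = 1.0 * priority_loss + 0.2 * dept_loss  # mirrors loss_weights=[1.0, 0.2]
```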

On the input side, the data likewise needs to be supplied per named input:

# Dummy input data
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype("float32")

# Dummy target data
priority_targets = np.random.random(size=(1280, 1))
dept_targets = np.random.randint(2, size=(1280, num_departments))

model.fit(
    {"title": title_data, "body": body_data, "tags": tags_data},
    {"priority": priority_targets, "department": dept_targets},
    epochs=2,
    batch_size=32,
)

A simple ResNet model

Unlike multi-input/multi-output models, ResNet differs from a plain Sequential model in its network topology. Here is a toy ResNet implemented with Keras.

inputs = keras.Input(shape=(32, 32, 3), name="img")
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.Conv2D(64, 3, activation="relu")(x)
block_1_output = layers.MaxPooling2D(3)(x)

x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_1_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_2_output = layers.add([x, block_1_output])

x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_2_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_3_output = layers.add([x, block_2_output])

x = layers.Conv2D(64, 3, activation="relu")(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10)(x)

model = keras.Model(inputs, outputs, name="toy_resnet")
 
keras.utils.plot_model(model, "mini_resnet.png", show_shapes=True)
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss=keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["acc"],
)
# We restrict the data to the first 1000 samples so as to limit execution time
# on Colab. Try to train on the entire dataset until convergence!
model.fit(x_train[:1000], y_train[:1000], batch_size=64, epochs=1, validation_split=0.2)
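The layers.add calls above work only because the convolutions inside each residual block use padding="same", which preserves the spatial shape. In miniature, with a stand-in shape-preserving function rather than a real convolution:

```python
import numpy as np

def conv_block(x):
    # stand-in for two padding="same" convolutions: any shape-preserving map
    return np.tanh(x)

x = np.ones((2, 9, 9, 64))          # (batch, height, width, channels)
block_output = conv_block(x) + x    # what layers.add([...]) computes
print(block_output.shape)
```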

Shared layers

Inside a complex model, if several paths use the same layer, they can share it. A text embedding is a typical example:

# Embedding for 1000 unique words mapped to 128-dimensional vectors
shared_embedding = layers.Embedding(1000, 128)

# Variable-length sequence of integers
text_input_a = keras.Input(shape=(None,), dtype="int32")

# Variable-length sequence of integers
text_input_b = keras.Input(shape=(None,), dtype="int32")

# Reuse the same layer to encode both inputs
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)
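Sharing a layer means sharing its weights: both inputs are looked up in the same embedding table, so the same token id maps to the same vector on both paths. A NumPy sketch of the lookup:

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(1000, 128))  # the single shared weight matrix

def embed(token_ids):
    return embedding_table[token_ids]  # both paths index the same table

a = embed(np.array([3, 7, 7]))
b = embed(np.array([7]))
print(np.array_equal(a[1], b[0]))  # same token id, same vector
```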

Reusing parts of a model

A model is essentially a static data structure, so its components can be extracted and reused on their own. For example, pulling out the intermediate layers of VGG19 gives a feature extractor.

vgg19 = tf.keras.applications.VGG19()

features_list = [layer.output for layer in vgg19.layers]

feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=features_list)

img = np.random.random((1, 224, 224, 3)).astype("float32")
extracted_features = feat_extraction_model(img)
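The reason outputs=features_list works is that the graph is static: every intermediate tensor remains addressable after the model is built. The same idea in miniature, with plain functions standing in for layers (a hypothetical toy, not the VGG19 API):

```python
import numpy as np

# a toy "model": a chain of layers, each a pure function
layer_fns = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

def extract_all_features(x):
    feats = []
    for fn in layer_fns:
        x = fn(x)
        feats.append(x)         # keep every intermediate output
    return feats                # analogous to outputs=features_list

feats = extract_all_features(np.array([1.0]))
```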

Custom layers

Keras in TF 2.4 ships with layers such as:

  • Convolutional layers: Conv1D, Conv2D, Conv3D, Conv2DTranspose
  • Pooling layers: MaxPooling1D, MaxPooling2D, MaxPooling3D, AveragePooling1D
  • RNN layers: GRU, LSTM, ConvLSTM2D
  • BatchNormalization, Dropout, Embedding, etc.

But developers can also define their own layers; see custom layers and models for details.
Here is an example:

class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)
For serialization support, the custom layer should also implement get_config:

class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        return {"units": self.units}


inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)
config = model.get_config()

new_model = keras.Model.from_config(config, custom_objects={"CustomDense": CustomDense})

Optionally, from_config can be overridden as well; its default implementation is:

def from_config(cls, config):
  return cls(**config)
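Stripped of the Keras machinery, the get_config/from_config round trip amounts to the following contract (a plain-Python sketch; the real base class does more):

```python
class CustomDenseConfigDemo:
    def __init__(self, units=32):
        self.units = units

    def get_config(self):
        # everything needed to rebuild an equivalent instance
        return {"units": self.units}

    @classmethod
    def from_config(cls, config):
        # the default implementation: pass the config back to __init__
        return cls(**config)

layer = CustomDenseConfigDemo(10)
clone = CustomDenseConfigDemo.from_config(layer.get_config())
```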

When to use the Functional API

The Functional API is built on a DAG, so it can only express static models. Dynamic models, such as Tree-RNNs, require subclassing keras.Model instead.

Advantages of the Functional API over subclassing keras.Model

The following advantages also apply to the Sequential API:

  • Less verbose
  • The model is validated as it is built, so errors are raised early with friendly messages
  • A built model can be plotted and summarized (plot_model, summary)
  • Intermediate layer activations can be extracted statically
  • Models can be serialized and cloned


Disadvantages of the Functional API compared with subclassing keras.Model

  • No support for dynamic architectures

**Note:** The difference between the Functional API and subclassing keras.Model comes down to static versus dynamic models.

Mixing the styles

Subclassed models, the Functional API, and the Sequential API are not mutually exclusive; they can be mixed and matched.

For example:

units = 32
timesteps = 10
input_dim = 5

# Define a Functional model
inputs = keras.Input((None, units))
x = layers.GlobalAveragePooling1D()(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)


class CustomRNN(layers.Layer):
    def __init__(self):
        super(CustomRNN, self).__init__()
        self.units = units
        self.projection_1 = layers.Dense(units=units, activation="tanh")
        self.projection_2 = layers.Dense(units=units, activation="tanh")
        # Our previously-defined Functional model
        self.classifier = model

    def call(self, inputs):
        outputs = []
        state = tf.zeros(shape=(inputs.shape[0], self.units))
        for t in range(inputs.shape[1]):
            x = inputs[:, t, :]
            h = self.projection_1(x)
            y = h + self.projection_2(state)
            state = y
            outputs.append(y)
        features = tf.stack(outputs, axis=1)
        print(features.shape)
        return self.classifier(features)


rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, timesteps, input_dim)))
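Ignoring biases, the recurrence inside CustomRNN.call can be written out in NumPy (the weights below are random stand-ins for the two Dense projections):

```python
import numpy as np

rng = np.random.default_rng(0)
units, timesteps, input_dim = 4, 3, 5
W1 = rng.normal(size=(input_dim, units))  # stands in for projection_1
W2 = rng.normal(size=(units, units))      # stands in for projection_2

x = rng.normal(size=(1, timesteps, input_dim))
state = np.zeros((1, units))
outputs = []
for t in range(timesteps):
    h = np.tanh(x[:, t, :] @ W1)       # projection_1(x)
    y = h + np.tanh(state @ W2)        # h + projection_2(state)
    state = y
    outputs.append(y)
features = np.stack(outputs, axis=1)   # tf.stack(outputs, axis=1)
```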
