本文是对官方文档 的学习笔记。
相比较 Sequence, Functional API是一种比较复杂,但是功能强大的模型构建方式。 它不但可以支持 Sequence 所有的功能,而且它还可以支持share layer, 非线性拓扑, 多输入, 多输入以及类似ResNet 那种复杂的网络模型。
Kears 对Deep Learning 的设计理念是,整个Model 就是一个 有向无环图 DAG (directed acyclic graph), 而建模的目的就是构建出这个图。
一个简单的用函数构建模型的例子
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)
两种查看模型的方法
summary
model.summary()
plot_model
keras.utils.plot_model(model, "my_first_model_with_shape_info.png", show_shapes=True)
Save/Load 模型
按如下方法保持的模型会包含:
- 模型结构
- 权重
- 传给 complie 函数参数中包括的内容
- 优化、以及状态
model.save("path_to_my_model")
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model("path_to_my_model")
用简单模型组成复合模型
Kears 中模型就类似一个函数, 有输出,输出。 既然是函数, 那么就可以吧几个函数组合起来使用。 下面就是一个encoder 和 decoder 的组合。
encoder_input = keras.Input(shape=(28, 28, 1), name="original_img")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
encoder_output = layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(encoder_input, encoder_output, name="encoder")
encoder.summary()
decoder_input = keras.Input(shape=(16,), name="encoded_img")
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)
decoder = keras.Model(decoder_input, decoder_output, name="decoder")
decoder.summary()
autoencoder_input = keras.Input(shape=(28, 28, 1), name="img")
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name="autoencoder")
autoencoder.summary()
既然模型可以当成函数, 那么就以出现一个模型内部有多个模型的情况。 这就可以实现集成学习的模型,如下:
def get_model():
inputs = keras.Input(shape=(128,))
outputs = layers.Dense(1)(inputs)
return keras.Model(inputs, outputs)
model1 = get_model()
model2 = get_model()
model3 = get_model()
inputs = keras.Input(shape=(128,))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
构建复杂模型
利用 Funcational API 可以构建复杂的模型,比如这样一个模型场景:
将客户提交的投诉按照严重程度打分,并将投诉分配到相应的部门。 这个模型包括3个输入:
- 投诉标题
- 投诉正文文本
- 标签
而这个模型会有2个输出
- 处理优先级
- 应该分派到哪个部门
num_tags = 12 # Number of unique issue tags
num_words = 10000 # Size of vocabulary obtained when preprocessing text data
num_departments = 4 # Number of departments for predictions
title_input = keras.Input(
shape=(None,), name="title"
) # Variable-length sequence of ints
body_input = keras.Input(shape=(None,), name="body") # Variable-length sequence of ints
tags_input = keras.Input(
shape=(num_tags,), name="tags"
) # Binary vectors of size `num_tags`
# Embed each word in the title into a 64-dimensional vector
title_features = layers.Embedding(num_words, 64)(title_input)
# Embed each word in the text into a 64-dimensional vector
body_features = layers.Embedding(num_words, 64)(body_input)
# Reduce sequence of embedded words in the title into a single 128-dimensional vector
title_features = layers.LSTM(128)(title_features)
# Reduce sequence of embedded words in the body into a single 32-dimensional vector
body_features = layers.LSTM(32)(body_features)
# Merge all available features into a single large vector via concatenation
x = layers.concatenate([title_features, body_features, tags_input])
# Stick a logistic regression for priority prediction on top of the features
priority_pred = layers.Dense(1, name="priority")(x)
# Stick a department classifier on top of the features
department_pred = layers.Dense(num_departments, name="department")(x)
# Instantiate an end-to-end model predicting both priority and department
model = keras.Model(
inputs=[title_input, body_input, tags_input],
outputs=[priority_pred, department_pred],
)
keras.utils.plot_model(model, "multi_input_and_output_model.png", show_shapes=True)
image.png
因为有多个输出, 所以也会有多个损失函数。 可以根据Layer 那么分派loss 函数, 另外甚至可以对不同的输出配上不同的权重。
model.compile(
optimizer=keras.optimizers.RMSprop(1e-3),
loss={
"priority": keras.losses.BinaryCrossentropy(from_logits=True),
"department": keras.losses.CategoricalCrossentropy(from_logits=True),
},
loss_weights=[1.0, 0.2],
)
输入端也需要对多输入进行配置
# Dummy input data
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype("float32")
# Dummy target data
priority_targets = np.random.random(size=(1280, 1))
dept_targets = np.random.randint(2, size=(1280, num_departments))
model.fit(
{"title": title_data, "body": body_data, "tags": tags_data},
{"priority": priority_targets, "department": dept_targets},
epochs=2,
batch_size=32,
)
简单的 ResNet 模型
不同与多输入输出, ResNet 在网络的拓扑结构上与简单Sequence 模型不同, 这里给出了一个 用Keras 实现的 ResNet 例子。
inputs = keras.Input(shape=(32, 32, 3), name="img")
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.Conv2D(64, 3, activation="relu")(x)
block_1_output = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_1_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_2_output = layers.add([x, block_1_output])
x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_2_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_3_output = layers.add([x, block_2_output])
x = layers.Conv2D(64, 3, activation="relu")(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10)(x)
model = keras.Model(inputs, outputs, name="toy_resnet")
keras.utils.plot_model(model, "mini_resnet.png", show_shapes=True)
image.png
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
model.compile(
optimizer=keras.optimizers.RMSprop(1e-3),
loss=keras.losses.CategoricalCrossentropy(from_logits=True),
metrics=["acc"],
)
# We restrict the data to the first 1000 samples so as to limit execution time
# on Colab. Try to train on the entire dataset until convergence!
model.fit(x_train[:1000], y_train[:1000], batch_size=64, epochs=1, validation_split=0.2)
Shared layers
在一个复杂模型内部, 如果几个通路会用到相同的层, 那么他们就可以共享一个layer。例如对text 的 embedding。
# Embedding for 1000 unique words mapped to 128-dimensional vectors
shared_embedding = layers.Embedding(1000, 128)
# Variable-length sequence of integers
text_input_a = keras.Input(shape=(None,), dtype="int32")
# Variable-length sequence of integers
text_input_b = keras.Input(shape=(None,), dtype="int32")
# Reuse the same layer to encode both inputs
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)
对模型组件的复用
一个模型其实就是一个静态的数据结构, 所以可以对其中的组件单独拿出来复用。 比如,拿出来 VGG19 中的中间层, 就可以做特征提取。
vgg19 = tf.keras.applications.VGG19()
features_list = [layer.output for layer in vgg19.layers]
feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=features_list)
img = np.random.random((1, 224, 224, 3)).astype("float32")
extracted_features = feat_extraction_model(img)
定制化 Layer
TF 2.4 中 Keras 包含如下 layer
- Convolutional layers:
Conv1D
,Conv2D
,Conv3D
,Conv2DTranspose
- Pooling layers:
MaxPooling1D
,MaxPooling2D
,MaxPooling3D
,AveragePooling1D
- RNN layers:
GRU
,LSTM
,ConvLSTM2D
-
BatchNormalization
,Dropout
,Embedding
, etc.
但是,如果开发者想定制自己的layer 也可以, 详情可参考 custom layers and models。
这里给了一个例子:
class CustomDense(layers.Layer):
def __init__(self, units=32):
super(CustomDense, self).__init__()
self.units = units
def build(self, input_shape):
self.w = self.add_weight(
shape=(input_shape[-1], self.units),
initializer="random_normal",
trainable=True,
)
self.b = self.add_weight(
shape=(self.units,), initializer="random_normal", trainable=True
)
def call(self, inputs):
return tf.matmul(inputs, self.w) + self.b
inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)
model = keras.Model(inputs, outputs)
class CustomDense(layers.Layer):
def __init__(self, units=32):
super(CustomDense, self).__init__()
self.units = units
def build(self, input_shape):
self.w = self.add_weight(
shape=(input_shape[-1], self.units),
initializer="random_normal",
trainable=True,
)
self.b = self.add_weight(
shape=(self.units,), initializer="random_normal", trainable=True
)
def call(self, inputs):
return tf.matmul(inputs, self.w) + self.b
def get_config(self):
return {"units": self.units}
inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config, custom_objects={"CustomDense": CustomDense})
def from_config(cls, config):
return cls(**config)
什么时候使用 Functional API
Functional API 是建立在DAG 基础上了, 所以它只能够建立静态的模型。 对于一些动态模型, 比如 Tree-RNN , 对于这种动态模型, 就需要使用继承 Keras Model, 实现一个子类来完成。
Functional API 对比 继承 Keras Model类 的优势
以下优势同样适用于 Sequence 方式
- 语法简洁
- 对输入检查严格,报错友好,及时
- 构建完成后,就可以plot, 或者summary 一个model
- 可以静态的提取中间层
- 可序列化,可克隆
# Functional API 对比 继承 Keras Model类 的劣势
- 不支持动态结构
**Note: ** Functional API 对比 继承 Keras Model类, 2种实现的的差别就在于一个是静态Model, 一个是动态 Model。
混合模式
Keras Model类, Functional API , Sequence 并不是水火不容, 而是可以互相搭配使用。
例如: [更多内容]
units = 32
timesteps = 10
input_dim = 5
# Define a Functional model
inputs = keras.Input((None, units))
x = layers.GlobalAveragePooling1D()(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
class CustomRNN(layers.Layer):
def __init__(self):
super(CustomRNN, self).__init__()
self.units = units
self.projection_1 = layers.Dense(units=units, activation="tanh")
self.projection_2 = layers.Dense(units=units, activation="tanh")
# Our previously-defined Functional model
self.classifier = model
def call(self, inputs):
outputs = []
state = tf.zeros(shape=(inputs.shape[0], self.units))
for t in range(inputs.shape[1]):
x = inputs[:, t, :]
h = self.projection_1(x)
y = h + self.projection_2(state)
state = y
outputs.append(y)
features = tf.stack(outputs, axis=1)
print(features.shape)
return self.classifier(features)
rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, timesteps, input_dim)))
网友评论