Notes from the TensorFlow course on China University MOOC.
The Structure of Recurrent Neural Networks (RNNs)
A neural network is a special kind of model: it infers rules from data and labels, much like fitting a function.
A plain feed-forward network, however, does not consider the relationships between its inputs.
So when a word is split into subwords, the meaning of an individual subword is hard to grasp out of context; the order in which the subwords appear is therefore essential to understanding the word's meaning.
The Fibonacci sequence as an analogy for an RNN:
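Just as each Fibonacci term is computed from the terms before it, $F_n = F_{n-1} + F_{n-2}$, a simple RNN cell feeds its previous hidden state back into the current step (the standard formulation, added here for reference):

$$h_t = \tanh(W_x x_t + W_h h_{t-1} + b)$$

where $x_t$ is the current input and $h_{t-1}$ carries information from all earlier steps.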
A single recurrent-network neuron:
A combination of multiple neurons:
More RNN resources:
But this model has a problem: it cannot capture longer-range context. For example:
A more advanced recurrent architecture, the LSTM, was therefore proposed to analyze the contextual meaning of text:
Besides the standard sequential information of the RNN, the LSTM adds a cell state to provide long-term memory, which addresses this problem:
The cell-state memory can be bidirectional, because later content may influence the interpretation of earlier content.
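For reference, here is the standard LSTM update (textbook notation, not course-specific): three gates decide what the cell state $c_t$ forgets, absorbs, and exposes:

$$
\begin{aligned}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{c}_t &= \tanh(W_c\,[h_{t-1}, x_t] + b_c) && \text{candidate state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{long-term memory} \\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$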
More resources:
1. LSTM Networks
1.1 Single-Layer LSTM Network
try:
# %tensorflow_version only exists in Colab.
%tensorflow_version 2.x
except Exception:
pass
import tensorflow as tf
import tensorflow_datasets as tfds
# Get the data
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
# You can use a smaller subset of the dataset to speed things up:
# here we take the first 4000 training examples and the first 1000 test examples.
# With a reduced subset like this, training on a CPU runs at roughly 65 seconds per epoch.
train_dataset, test_dataset = dataset['train'].take(4000), dataset['test'].take(1000)
Get the pre-built subword tokenizer from the dataset info:
tokenizer = info.features['text'].encoder
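As a quick sanity check (my own snippet, not from the course), the subword tokenizer can round-trip a string; decoding the ids one at a time shows how a word gets split into subwords:

sample = 'TensorFlow is fun.'
ids = tokenizer.encode(sample)   # list of subword ids
print(ids)
print(tokenizer.decode(ids))     # reconstructs the original string
for i in ids:
    print(i, '->', tokenizer.decode([i]))  # one subword per id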
# You can also explore different buffer and batch sizes to speed up training
BUFFER_SIZE = 1000
BATCH_SIZE = 64
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE)
test_dataset = test_dataset.padded_batch(BATCH_SIZE)
Build the single-layer LSTM network:
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 64),  # embedding dimension is 64
    # Bidirectional captures context from both directions
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # 64 is each direction's output size, so the actual output is 128-dimensional
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 64) 523840
_________________________________________________________________
bidirectional (Bidirectional (None, 128) 66048
_________________________________________________________________
dense (Dense) (None, 64) 8256
_________________________________________________________________
dense_1 (Dense) (None, 1) 65
=================================================================
Total params: 598,209
Trainable params: 598,209
Non-trainable params: 0
_________________________________________________________________
Train the network:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Can change number of epochs to make training faster
NUM_EPOCHS = 50
history = model.fit(train_dataset, epochs=NUM_EPOCHS, validation_data=test_dataset)
Epoch 1/50
63/63 [==============================] - 22s 249ms/step - loss: 0.6926 - accuracy: 0.5080 - val_loss: 0.6739 - val_accuracy: 0.5970
Epoch 2/50
...
Epoch 50/50
63/63 [==============================] - 14s 230ms/step - loss: 1.7393e-06 - accuracy: 1.0000 - val_loss: 2.8910 - val_accuracy: 0.7360
Examine the network's performance:
import matplotlib.pyplot as plt

def plot_graphs(history, string):
    plt.plot(history.history[string])
    plt.plot(history.history['val_'+string])
    plt.xlabel("Epochs")
    plt.ylabel(string)
    plt.legend([string, 'val_'+string])
    plt.show()
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
The single-layer LSTM converges quickly on the training data, but validation accuracy barely improves while validation loss keeps rising: the model is overfitting.
Release resources:
import os, signal
os.kill(os.getpid(), signal.SIGINT)
Next, let's see how a multi-layer LSTM performs.
1.2 Multi-Layer LSTM Network
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow_datasets as tfds
import tensorflow as tf
print(tf.__version__)
2.4.0
Load the data:
# Get the data
# dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
# train_dataset, test_dataset = dataset['train'], dataset['test']
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
train_dataset, test_dataset = dataset['train'].take(4000), dataset['test'].take(1000)
Export the tokenizer from the dataset:
tokenizer = info.features['text'].encoder
print(info)
tfds.core.DatasetInfo(
name='imdb_reviews',
full_name='imdb_reviews/subwords8k/1.0.0',
description="""
Large Movie Review Dataset.
This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
""",
config_description="""
Uses `tfds.deprecated.text.SubwordTextEncoder` with 8k vocab size
""",
homepage='http://ai.stanford.edu/~amaas/data/sentiment/',
data_path='C:\\Users\\Robin\\tensorflow_datasets\\imdb_reviews\\subwords8k\\1.0.0',
download_size=80.23 MiB,
dataset_size=54.72 MiB,
features=FeaturesDict({
'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8185>),
}),
supervised_keys=('text', 'label'),
splits={
'test': <SplitInfo num_examples=25000, num_shards=1>,
'train': <SplitInfo num_examples=25000, num_shards=1>,
'unsupervised': <SplitInfo num_examples=50000, num_shards=1>,
},
citation="""@InProceedings{maas-EtAl:2011:ACL-HLT2011,
author = {Maas, Andrew L. and Daly, Raymond E. and Pham, Peter T. and Huang, Dan and Ng, Andrew Y. and Potts, Christopher},
title = {Learning Word Vectors for Sentiment Analysis},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2011},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {142--150},
url = {http://www.aclweb.org/anthology/P11-1015}
}""",
)
Select the training and test datasets:
BUFFER_SIZE = 100
BATCH_SIZE = 100
train_dataset = train_dataset.shuffle(BUFFER_SIZE).take(1000)
train_dataset = train_dataset.padded_batch(BATCH_SIZE)
test_dataset = test_dataset.padded_batch(BATCH_SIZE).take(1000)
Tokenizer vocabulary size:
tokenizer.vocab_size
8185
Build the network:
vocab_size = 1000  # this assignment appears to be unused
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 8),
    # When one LSTM layer feeds into another, set return_sequences=True on the earlier layer
    # so that it outputs the full sequence the next LSTM layer expects as input.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8, return_sequences=True)),  # stacked (multi-layer) LSTM
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 8) 65480
_________________________________________________________________
bidirectional (Bidirectional (None, None, 16) 1088
_________________________________________________________________
bidirectional_1 (Bidirection (None, 16) 1600
_________________________________________________________________
dense (Dense) (None, 16) 272
_________________________________________________________________
dense_1 (Dense) (None, 1) 17
=================================================================
Total params: 68,457
Trainable params: 68,457
Non-trainable params: 0
_________________________________________________________________
Compile and train the model:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
NUM_EPOCHS = 50
history = model.fit(train_dataset, epochs=NUM_EPOCHS, validation_data=test_dataset)
Epoch 1/50
10/10 [==============================] - 16s 667ms/step - loss: 0.6932 - accuracy: 0.5180 - val_loss: 0.6932 - val_accuracy: 0.4970
...
Epoch 50/50
10/10 [==============================] - 4s 407ms/step - loss: 0.0011 - accuracy: 1.0000 - val_loss: 1.6064 - val_accuracy: 0.7030
Examine the network's performance:
import matplotlib.pyplot as plt

def plot_graphs(history, string):
    plt.plot(history.history[string])
    plt.plot(history.history['val_'+string])
    plt.xlabel("Epochs")
    plt.ylabel(string)
    plt.legend([string, 'val_'+string])
    plt.show()
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
According to the course:
- The two-layer LSTM's training-accuracy curve is smoother, and its validation-accuracy curve is better.
- The single-layer LSTM's training accuracy rises overall but drops sharply in places, which suggests the algorithm is not very robust.
- The two-layer LSTM's training-accuracy curve is very smooth, indicating a more stable training process.
In my run, however, the results came out the other way around, which may be due to the reduced dataset.
There are also other kinds of recurrent networks:
- RNNs with convolutional layers
- Gated Recurrent Units (GRU)
Why text classification is harder:
- Because the network structure here is very simple, it overfits easily; adjusting the network's structure and parameters can improve performance (see the sketch below).
- Compared with image processing, overfitting happens more readily in text processing, because the validation set always contains out-of-vocabulary words, and those words are hard to classify, which drives the overfitting.
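As one illustration of such an adjustment (my own sketch, not from the course), a Dropout layer between the LSTM stack and the dense head can curb overfitting:

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 8),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8)),
    tf.keras.layers.Dropout(0.5),  # randomly zero half the activations, during training only
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])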
import os, signal
os.kill(os.getpid(), signal.SIGINT)
2. Convolutional Networks
import json
import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
Download the data:
# !wget --no-check-certificate \
# https://storage.googleapis.com/laurencemoroney-blog.appspot.com/sarcasm.json \
# -O /tmp/sarcasm.json
Set the hyperparameters:
vocab_size = 1000
embedding_dim = 16  # embedding dimension is 16
max_length = 120
trunc_type='post'
padding_type='post'
oov_tok = "<OOV>"
training_size = 20000
Preprocess the data and build the training and test sets:
with open("sarcasm.json", 'r') as f:
    datastore = json.load(f)

sentences = []
labels = []
urls = []
for item in datastore:
    sentences.append(item['headline'])
    labels.append(item['is_sarcastic'])

training_sentences = sentences[0:training_size]
testing_sentences = sentences[training_size:]
training_labels = labels[0:training_size]
testing_labels = labels[training_size:]
Build the tokenizer and serialize the text:
tokenizer = Tokenizer(num_words=vocab_size, oov_token=oov_tok)
tokenizer.fit_on_texts(training_sentences)
word_index = tokenizer.word_index
training_sequences = tokenizer.texts_to_sequences(training_sentences)
training_padded = pad_sequences(training_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type)
testing_sequences = tokenizer.texts_to_sequences(testing_sentences)
testing_padded = pad_sequences(testing_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type)
Build a convolutional network over text embeddings:
In the convolutional layer, the input text vectors pass through 128 kernels of size 5 to extract features; the kernel parameters are adjusted through training to obtain the desired result.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    # 128 convolution kernels of size 5, ReLU activation
    tf.keras.layers.Conv1D(128, 5, activation='relu'),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
Because the input is a 120-word sequence and each kernel spans 5 words, the (unpadded) convolution trims 2 words from each end of the sequence, leaving 116 positions:
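In general, for an unpadded 1-D convolution:

$$L_{\text{out}} = L_{\text{in}} - k + 1 = 120 - 5 + 1 = 116$$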
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 120, 16) 16000
_________________________________________________________________
conv1d (Conv1D) (None, 116, 128) 10368
_________________________________________________________________
global_max_pooling1d (Global (None, 128) 0
_________________________________________________________________
dense (Dense) (None, 24) 3096
_________________________________________________________________
dense_1 (Dense) (None, 1) 25
=================================================================
Total params: 29,489
Trainable params: 29,489
Non-trainable params: 0
_________________________________________________________________
Train the model for 50 epochs:
num_epochs = 50
training_padded = np.array(training_padded)
training_labels = np.array(training_labels)
testing_padded = np.array(testing_padded)
testing_labels = np.array(testing_labels)
history = model.fit(training_padded, training_labels, epochs=num_epochs, validation_data=(testing_padded, testing_labels), verbose=1)
Epoch 1/50
625/625 [==============================] - 10s 10ms/step - loss: 0.5596 - accuracy: 0.6902 - val_loss: 0.4028 - val_accuracy: 0.8137
...
Epoch 50/50
625/625 [==============================] - 6s 9ms/step - loss: 0.0217 - accuracy: 0.9907 - val_loss: 2.6126 - val_accuracy: 0.7757
Examine the model's performance:
import matplotlib.pyplot as plt

def plot_graphs(history, string):
    plt.plot(history.history[string])
    plt.plot(history.history['val_'+string])
    plt.xlabel("Epochs")
    plt.ylabel(string)
    plt.legend([string, 'val_'+string])
    plt.show()
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
Performance still does not look great.
Save the model for later use:
model.save("test.h5")
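To verify the save worked, the model can later be reloaded with the standard Keras API (a quick check I added; not in the original notes):

restored = tf.keras.models.load_model("test.h5")
restored.summary()  # same architecture and weights as the model saved above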
Release resources:
import os, signal
os.kill(os.getpid(), signal.SIGINT)
3. GRU Networks
# Multiple Layer GRU
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow_datasets as tfds
import tensorflow as tf
print(tf.__version__)
2.4.0
The following is not needed on TF 2.x:
# If the tf.__version__ is 1.x, please run this cell
# !pip install tensorflow==2.0.0-beta0
Get the data:
# Get the data
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']
Export the pre-built tokenizer from the dataset:
tokenizer = info.features['text'].encoder
Set up the training and test datasets:
BUFFER_SIZE = 10000
BATCH_SIZE = 64
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(train_dataset))
test_dataset = test_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(test_dataset))
Build the GRU model:
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 64),  # embedding dimension is 64
    tf.keras.layers.Bidirectional(tf.keras.layers.GRU(128)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
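For reference (standard GRU equations, added by me): the GRU merges the LSTM's forget and input gates into a single update gate $z_t$ and drops the separate cell state, which is why it has fewer parameters:

$$
\begin{aligned}
z_t &= \sigma(W_z\,[h_{t-1}, x_t]) && \text{update gate} \\
r_t &= \sigma(W_r\,[h_{t-1}, x_t]) && \text{reset gate} \\
\tilde{h}_t &= \tanh(W\,[r_t \odot h_{t-1}, x_t]) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
$$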
Inspect the network structure:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 64) 523840
_________________________________________________________________
bidirectional (Bidirectional (None, 256) 148992
_________________________________________________________________
dense (Dense) (None, 6) 1542
_________________________________________________________________
dense_1 (Dense) (None, 1) 7
=================================================================
Total params: 674,381
Trainable params: 674,381
Non-trainable params: 0
_________________________________________________________________
Compile and train the model:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
NUM_EPOCHS = 50
history = model.fit(train_dataset, epochs=NUM_EPOCHS, validation_data=test_dataset)
Epoch 1/50
391/391 [==============================] - 166s 392ms/step - loss: 0.6686 - accuracy: 0.5515 - val_loss: 0.7990 - val_accuracy: 0.5909
...
Epoch 50/50
391/391 [==============================] - 145s 371ms/step - loss: 1.2944e-06 - accuracy: 1.0000 - val_loss: 1.5420 - val_accuracy: 0.8560
Examine the model's performance:
import matplotlib.pyplot as plt

def plot_graphs(history, string):
    plt.plot(history.history[string])
    plt.plot(history.history['val_'+string])
    plt.xlabel("Epochs")
    plt.ylabel(string)
    plt.legend([string, 'val_'+string])
    plt.show()
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
Release resources:
import os, signal
os.kill(os.getpid(), signal.SIGINT)
4. Combining Multiple Models
import json
import tensorflow as tf
import csv
import random
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import regularizers
embedding_dim = 100
max_length = 16
trunc_type='post'
padding_type='post'
oov_tok = "<OOV>"
training_size=160000
test_portion=.1
corpus = []
(The data-loading and preprocessing code is omitted here.)
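The omitted block builds vocab_size, word_index, the padded sequences, and the pretrained embeddings_matrix used below. As a rough sketch of the usual pattern (my assumption: 100-dimensional GloVe vectors in a hypothetical file glove.6B.100d.txt; this is not the course's exact code):

# Hypothetical reconstruction: map each word in the tokenizer's word_index
# to its pretrained GloVe vector; words without a vector keep all zeros.
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:  # assumed file path
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

embeddings_matrix = np.zeros((vocab_size + 1, embedding_dim))
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embeddings_matrix[i] = vector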
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size + 1, embedding_dim, input_length=max_length, weights=[embeddings_matrix], trainable=False),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv1D(64, 5, activation='relu'),
    tf.keras.layers.MaxPooling1D(pool_size=4),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
model.summary()
num_epochs = 50
training_sequences = np.array(training_sequences)
training_labels = np.array(training_labels)
test_sequences = np.array(test_sequences)
test_labels = np.array(test_labels)
history = model.fit(training_sequences, training_labels, epochs=num_epochs, validation_data=(test_sequences, test_labels), verbose=2)
print("Training Complete")
5. Performance Comparison of Multiple Models
# NOTE: PLEASE MAKE SURE YOU ARE RUNNING THIS IN A PYTHON3 ENVIRONMENT
import tensorflow as tf
print(tf.__version__)
# This is needed for the iterator over the data
# But not necessary if you have TF 2.0 installed
#!pip install tensorflow==2.0.0-beta0
# tf.enable_eager_execution()
# !pip install -q tensorflow-datasets
2.4.0
Load the dataset:
import tensorflow_datasets as tfds
imdb, info = tfds.load("imdb_reviews", with_info=True, as_supervised=True)
Prepare the training and test data:
import numpy as np
# train_data, test_data = imdb['train'], imdb['test']
train_data, test_data = imdb['train'].take(4000), imdb['test'].take(1000)
training_sentences = []
training_labels = []
testing_sentences = []
testing_labels = []
# str(s.numpy()) is needed in Python 3 instead of just s.numpy()
for s, l in train_data:
    training_sentences.append(str(s.numpy()))
    training_labels.append(l.numpy())

for s, l in test_data:
    testing_sentences.append(str(s.numpy()))
    testing_labels.append(l.numpy())
training_labels_final = np.array(training_labels)
testing_labels_final = np.array(testing_labels)
Set hyperparameters and serialize the text:
vocab_size = 10000
embedding_dim = 16
max_length = 120
trunc_type='post'
oov_tok = "<OOV>"
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
tokenizer = Tokenizer(num_words = vocab_size, oov_token=oov_tok)
tokenizer.fit_on_texts(training_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(training_sentences)
padded = pad_sequences(sequences,maxlen=max_length, truncating=trunc_type)
testing_sequences = tokenizer.texts_to_sequences(testing_sentences)
testing_padded = pad_sequences(testing_sequences,maxlen=max_length)
Build a decoder to turn encoded output back into text:
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])

def decode_review(text):
    return ' '.join([reverse_word_index.get(i, '?') for i in text])
print(decode_review(padded[1]))
print(training_sentences[1])
? ? ? ? ? ? ? b'i have been known to fall asleep during films but this is usually due to a combination of things including really tired being warm and comfortable on the <OOV> and having just eaten a lot however on this occasion i fell asleep because the film was rubbish the plot development was constant constantly slow and boring things seemed to happen but with no explanation of what was causing them or why i admit i may have missed part of the film but i watched the majority of it and everything just seemed to happen of its own <OOV> without any real concern for anything else i cant recommend this film at all '
b'I have been known to fall asleep during films, but this is usually due to a combination of things including, really tired, being warm and comfortable on the sette and having just eaten a lot. However on this occasion I fell asleep because the film was rubbish. The plot development was constant. Constantly slow and boring. Things seemed to happen, but with no explanation of what was causing them or why. I admit, I may have missed part of the film, but i watched the majority of it and everything just seemed to happen of its own accord without any real concern for anything else. I cant recommend this film at all.'
Build a single-layer GRU model and inspect its structure:
# Model Definition with GRU
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 120, 16) 160000
_________________________________________________________________
bidirectional (Bidirectional (None, 64) 9600
_________________________________________________________________
dense (Dense) (None, 6) 390
_________________________________________________________________
dense_1 (Dense) (None, 1) 7
=================================================================
Total params: 169,997
Trainable params: 169,997
Non-trainable params: 0
_________________________________________________________________
Train the model:
num_epochs = 50
history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
Epoch 1/50
125/125 [==============================] - 16s 53ms/step - loss: 0.6933 - accuracy: 0.4849 - val_loss: 0.6931 - val_accuracy: 0.4970
...
Epoch 50/50
125/125 [==============================] - 4s 28ms/step - loss: 3.5880e-06 - accuracy: 1.0000 - val_loss: 1.9096 - val_accuracy: 0.7550
Training performance:
import matplotlib.pyplot as plt

def plot_graphs(history, string):
    plt.plot(history.history[string])
    plt.plot(history.history['val_'+string])
    plt.xlabel("Epochs")
    plt.ylabel(string)
    plt.legend([string, 'val_'+string])
    plt.show()
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
The model converges quickly, test accuracy stops improving, and the loss grows with the number of epochs: the model overfits.
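A common remedy (my own addition, not from the course) is to stop training once validation loss stops improving, instead of running a fixed 50 epochs:

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',           # watch validation loss
    patience=3,                   # tolerate 3 epochs without improvement
    restore_best_weights=True)    # roll back to the best epoch

history = model.fit(padded, training_labels_final, epochs=num_epochs,
                    validation_data=(testing_padded, testing_labels_final),
                    callbacks=[early_stop])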
Build a single-layer LSTM model:
# Model Definition with single LSTM
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 120, 16) 160000
_________________________________________________________________
bidirectional_1 (Bidirection (None, 64) 12544
_________________________________________________________________
dense_2 (Dense) (None, 6) 390
_________________________________________________________________
dense_3 (Dense) (None, 1) 7
=================================================================
Total params: 172,941
Trainable params: 172,941
Non-trainable params: 0
_________________________________________________________________
Train the model:
num_epochs = 50
history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
Epoch 1/50
125/125 [==============================] - 14s 49ms/step - loss: 0.6927 - accuracy: 0.5054 - val_loss: 0.6128 - val_accuracy: 0.7090
...
Epoch 50/50
125/125 [==============================] - 3s 27ms/step - loss: 4.3843e-05 - accuracy: 1.0000 - val_loss: 1.8278 - val_accuracy: 0.7590
Model performance:
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
The model seems even less robust here.
Build a multi-layer LSTM model:
# Model Definition with multiple LSTM
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_7 (Embedding) (None, 120, 16) 160000
_________________________________________________________________
bidirectional_2 (Bidirection (None, 120, 64) 12544
_________________________________________________________________
bidirectional_3 (Bidirection (None, 64) 24832
_________________________________________________________________
dense_8 (Dense) (None, 6) 390
_________________________________________________________________
dense_9 (Dense) (None, 1) 7
=================================================================
Total params: 197,773
Trainable params: 197,773
Non-trainable params: 0
_________________________________________________________________
Train the model:
num_epochs = 50
history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
Epoch 1/50
125/125 [==============================] - 24s 81ms/step - loss: 0.6917 - accuracy: 0.5152 - val_loss: 0.6868 - val_accuracy: 0.5030
...
Epoch 50/50
125/125 [==============================] - 5s 43ms/step - loss: 3.4054e-06 - accuracy: 1.0000 - val_loss: 2.3198 - val_accuracy: 0.7740
Model performance:
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
Build a single-layer convolutional model:
# Model Definition with Conv1D
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Conv1D(128, 5, activation='relu'),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_3 (Embedding) (None, 120, 16) 160000
_________________________________________________________________
conv1d_1 (Conv1D) (None, 116, 128) 10368
_________________________________________________________________
global_average_pooling1d_1 ( (None, 128) 0
_________________________________________________________________
dense_6 (Dense) (None, 6) 774
_________________________________________________________________
dense_7 (Dense) (None, 1) 7
=================================================================
Total params: 171,149
Trainable params: 171,149
Non-trainable params: 0
_________________________________________________________________
Train the model:
num_epochs = 50
history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
Epoch 1/50
125/125 [==============================] - 6s 19ms/step - loss: 0.6895 - accuracy: 0.5145 - val_loss: 0.6109 - val_accuracy: 0.7300
...
Epoch 50/50
125/125 [==============================] - 1s 11ms/step - loss: 1.5977e-05 - accuracy: 1.0000 - val_loss: 1.4861 - val_accuracy: 0.7840
plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')
Release resources:
import os, signal
os.kill(os.getpid(), signal.SIGINT)