
[Deep Learning with TensorFlow (12)] LSTM, Convolution, GRU

Author: Geekero | Published 2021-02-28 11:47

    Notes from the TensorFlow course on China University MOOC.

    The Structure of Recurrent Neural Networks (RNN)

    A neural network is a special kind of model:


    it infers rules from data and labels, much like a function:


    but it does not account for the relationships between successive inputs.

    So when a word is split into subwords, the meaning of a single subword is hard to grasp without context; the order in which the subwords appear is therefore essential to understanding the word's meaning.

    The Fibonacci sequence:

    The corresponding RNN structure:
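
    As a toy illustration (my own sketch, not from the course), the Fibonacci recurrence and an RNN cell share the same shape: the next value is computed from the current input together with state carried over from the previous step:

    # Hypothetical toy code illustrating the recurrence analogy
    def fibonacci(n):
        a, b = 0, 1          # carried state, like an RNN's hidden state
        for _ in range(n):
            a, b = b, a + b  # the new state is a function of the previous state
        return a

    def tiny_rnn(inputs, w_x=0.5, w_h=0.5):
        h = 0.0                        # hidden state
        for x in inputs:
            h = w_x * x + w_h * h      # the output feeds back into the next step
        return h

    print(fibonacci(10))               # 55
    print(tiny_rnn([1.0, 2.0, 3.0]))   # 2.125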


    A single recurrent neural network neuron:


    A combination of multiple neurons:


    More resources on RNNs:


    But this model has a problem: it cannot grasp contextual meaning over longer spans. For example:


    Hence a more advanced recurrent architecture, the LSTM, was proposed to analyze the contextual meaning of text:


    In addition to the standard sequence signal of an RNN, it adds a cell state to carry long-term memory, which addresses this problem:

    The cell-state memory can run in both directions, because later content may influence the interpretation of earlier content.


    More resources:


    I. LSTM Networks

    1.1 Single-Layer LSTM Network

    Single Layer LSTM

    try:
      # %tensorflow_version only exists in Colab.
      %tensorflow_version 2.x
    except Exception:
      pass
    
    import tensorflow as tf
    import tensorflow_datasets as tfds
    
    # Get the data.
    # You can use a smaller slice of the dataset to speed things up;
    # here we take the first 4000 training and 1000 test examples.
    # (With 10% of the data, training on a CPU ran at about 65 seconds per epoch.)
    dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
    train_dataset, test_dataset = dataset['train'].take(4000), dataset['test'].take(1000)
    

    Build the tokenizer:

    tokenizer = info.features['text'].encoder
    
    # Can explore different buffer and batch sizes to make training
    # faster also
    BUFFER_SIZE = 1000
    BATCH_SIZE = 64
    
    train_dataset = train_dataset.shuffle(BUFFER_SIZE)
    train_dataset = train_dataset.padded_batch(BATCH_SIZE)
    test_dataset = test_dataset.padded_batch(BATCH_SIZE)
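
    As a quick sanity check (a hedged sketch with a made-up sample string), the subword tokenizer loaded above can encode text to ids and decode it back:

    sample = 'TensorFlow is fun.'   # hypothetical example string
    ids = tokenizer.encode(sample)  # list of subword ids
    print(ids)
    print(tokenizer.decode(ids))    # round-trips back to the sample
    for i in ids:
        print(i, '->', tokenizer.decode([i]))  # inspect each subword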
    

    Build the single-layer LSTM network:

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(tokenizer.vocab_size, 64), # embedding dimension of 64
        # Bidirectional captures context from both directions
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # LSTM layer; 64 is the per-direction output size, so the actual output is 128-dimensional
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    
    model.summary()
    
    
        Model: "sequential"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding (Embedding)        (None, None, 64)          523840    
        _________________________________________________________________
        bidirectional (Bidirectional (None, 128)               66048     
        _________________________________________________________________
        dense (Dense)                (None, 64)                8256      
        _________________________________________________________________
        dense_1 (Dense)              (None, 1)                 65        
        =================================================================
        Total params: 598,209
        Trainable params: 598,209
        Non-trainable params: 0
        _________________________________________________________________
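
    The 66,048 parameters in the bidirectional row can be verified by hand: an LSTM has four gates, each with (input_dim + units) weights plus one bias per unit, and the bidirectional wrapper doubles the count. A quick check, assuming TF's standard LSTM parameterization:

    units, input_dim = 64, 64
    per_direction = 4 * ((input_dim + units) * units + units)
    print(per_direction)      # 33024
    print(2 * per_direction)  # 66048, matching the summary above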
    

    Train the network:

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    # Can change number of epochs to make training faster
    NUM_EPOCHS = 50
    history = model.fit(train_dataset, epochs=NUM_EPOCHS, validation_data=test_dataset)
    
        Epoch 1/50
        63/63 [==============================] - 22s 249ms/step - loss: 0.6926 - accuracy: 0.5080 - val_loss: 0.6739 - val_accuracy: 0.5970
        Epoch 2/50
         ...
        Epoch 50/50
        63/63 [==============================] - 14s 230ms/step - loss: 1.7393e-06 - accuracy: 1.0000 - val_loss: 2.8910 - val_accuracy: 0.7360
    

    Plot the network's performance:

    import matplotlib.pyplot as plt
    
    def plot_graphs(history, string):
      plt.plot(history.history[string])
      plt.plot(history.history['val_'+string])
      plt.xlabel("Epochs")
      plt.ylabel(string)
      plt.legend([string, 'val_'+string])
      plt.show()
    
    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')

    [accuracy and loss curves]
    

    The single-layer LSTM converges quickly, but test accuracy barely improves while test loss keeps rising.

    Release resources (restart the runtime):

    import os, signal

    # Send SIGINT to the current process to stop the kernel and free its memory
    os.kill(os.getpid(), signal.SIGINT)
    

    Now let's see how a multi-layer LSTM performs:

    1.2 Multi-Layer LSTM Network

    Multiple Layer LSTM

    from __future__ import absolute_import, division, print_function, unicode_literals
    
    
    import tensorflow_datasets as tfds
    import tensorflow as tf
    print(tf.__version__)
    
        2.4.0
    

    Load the data:

    # Get the data
    # dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
    # train_dataset, test_dataset = dataset['train'], dataset['test']
    dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
    train_dataset, test_dataset = dataset['train'].take(4000), dataset['test'].take(1000)
    

    Retrieve the tokenizer from the dataset info:

    tokenizer = info.features['text'].encoder
    
    print(info)
    
        tfds.core.DatasetInfo(
            name='imdb_reviews',
            full_name='imdb_reviews/subwords8k/1.0.0',
            description="""
            Large Movie Review Dataset.
            This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
            """,
            config_description="""
            Uses `tfds.deprecated.text.SubwordTextEncoder` with 8k vocab size
            """,
            homepage='http://ai.stanford.edu/~amaas/data/sentiment/',
            data_path='C:\\Users\\Robin\\tensorflow_datasets\\imdb_reviews\\subwords8k\\1.0.0',
            download_size=80.23 MiB,
            dataset_size=54.72 MiB,
            features=FeaturesDict({
                'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
                'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8185>),
            }),
            supervised_keys=('text', 'label'),
            splits={
                'test': <SplitInfo num_examples=25000, num_shards=1>,
                'train': <SplitInfo num_examples=25000, num_shards=1>,
                'unsupervised': <SplitInfo num_examples=50000, num_shards=1>,
            },
            citation="""@InProceedings{maas-EtAl:2011:ACL-HLT2011,
              author    = {Maas, Andrew L.  and  Daly, Raymond E.  and  Pham, Peter T.  and  Huang, Dan  and  Ng, Andrew Y.  and  Potts, Christopher},
              title     = {Learning Word Vectors for Sentiment Analysis},
              booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
              month     = {June},
              year      = {2011},
              address   = {Portland, Oregon, USA},
              publisher = {Association for Computational Linguistics},
              pages     = {142--150},
              url       = {http://www.aclweb.org/anthology/P11-1015}
            }""",
        )
    

    Select the training and test datasets

    BUFFER_SIZE = 100
    BATCH_SIZE = 100
    
    train_dataset = train_dataset.shuffle(BUFFER_SIZE).take(1000)
    train_dataset = train_dataset.padded_batch(BATCH_SIZE)
    test_dataset = test_dataset.padded_batch(BATCH_SIZE).take(1000)
    

    Tokenizer vocabulary size:

    tokenizer.vocab_size
        8185
    

    Build the network:

    vocab_size = 1000 # this assignment appears to be unused
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(tokenizer.vocab_size, 8),
        # When one LSTM layer feeds into another, set return_sequences=True so that
        # the first layer's output shape matches the next layer's expected input
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8, return_sequences=True)),  # stacked (multi-layer) LSTM
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    
    
    model.summary()
    
        Model: "sequential"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding (Embedding)        (None, None, 8)           65480     
        _________________________________________________________________
        bidirectional (Bidirectional (None, None, 16)          1088      
        _________________________________________________________________
        bidirectional_1 (Bidirection (None, 16)                1600      
        _________________________________________________________________
        dense (Dense)                (None, 16)                272       
        _________________________________________________________________
        dense_1 (Dense)              (None, 1)                 17        
        =================================================================
        Total params: 68,457
        Trainable params: 68,457
        Non-trainable params: 0
        _________________________________________________________________  
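
    To see what return_sequences=True does to the shapes (a small sketch with random data, not from the course):

    x = tf.random.normal((1, 10, 8))  # (batch, time steps, features)
    print(tf.keras.layers.LSTM(8, return_sequences=True)(x).shape)  # (1, 10, 8): keeps the time axis
    print(tf.keras.layers.LSTM(8)(x).shape)                         # (1, 8): only the final output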
    

    Compile and train the model

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    NUM_EPOCHS = 50
    history = model.fit(train_dataset, epochs=NUM_EPOCHS, validation_data=test_dataset)
    
        Epoch 1/50
        10/10 [==============================] - 16s 667ms/step - loss: 0.6932 - accuracy: 0.5180 - val_loss: 0.6932 - val_accuracy: 0.4970
        ...
        Epoch 50/50
        10/10 [==============================] - 4s 407ms/step - loss: 0.0011 - accuracy: 1.0000 - val_loss: 1.6064 - val_accuracy: 0.7030
    

    Plot the network's performance:

    import matplotlib.pyplot as plt
    
    
    def plot_graphs(history, string):
      plt.plot(history.history[string])
      plt.plot(history.history['val_'+string])
      plt.xlabel("Epochs")
      plt.ylabel(string)
      plt.legend([string, 'val_'+string])
      plt.show()
    
    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')

    [accuracy and loss curves]

    The course notes say:

    • The two-layer LSTM's training-accuracy curve is smoother, and its validation-accuracy curve is better

    • The single-layer LSTM's training accuracy rises overall but drops sharply in places, suggesting the algorithm is not very robust

    • The two-layer LSTM's training-accuracy curve is very smooth, indicating a more stable training process

    In my run, however, the results were the opposite... possibly an effect of the data subset used...

    There are other recurrent network variants as well:

    • RNNs with convolutional layers
    • Gated recurrent units (GRU)

    Why text classification is harder:

    1. The network structure here is quite simple and overfits easily, so performance can be improved by adjusting the architecture and its parameters (see the sketch below)

    2. Overfitting happens more readily in text processing than in image processing, because the validation set always contains out-of-vocabulary words that are hard to classify, which widens the gap between training and validation performance
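
    One way to act on point 1 (my own hedged sketch, not part of the course code) is to add dropout, both inside the recurrent layers and between the dense layers:

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(tokenizer.vocab_size, 8),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8, return_sequences=True, dropout=0.2)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8, dropout=0.2)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dropout(0.2),  # randomly zero activations to curb overfitting
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])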

    import os, signal
    
    os.kill(os.getpid(), signal.SIGINT)
    

    II. Convolutional Networks

    import json
    import tensorflow as tf
    import numpy as np
    
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    

    Download the data

    # !wget --no-check-certificate \
    #     https://storage.googleapis.com/laurencemoroney-blog.appspot.com/sarcasm.json \
    #     -O /tmp/sarcasm.json
    

    Set the hyperparameters

    vocab_size = 1000
    embedding_dim = 16 # embedding dimension of 16
    max_length = 120
    trunc_type='post'
    padding_type='post'
    oov_tok = "<OOV>"
    training_size = 20000
    

    Preprocess the data and build the training and test sets:

    with open("sarcasm.json", 'r') as f:
        datastore = json.load(f)
    
    
    sentences = []
    labels = []
    urls = []
    for item in datastore:
        sentences.append(item['headline'])
        labels.append(item['is_sarcastic'])
    
    training_sentences = sentences[0:training_size]
    testing_sentences = sentences[training_size:]
    training_labels = labels[0:training_size]
    testing_labels = labels[training_size:]
    

    Build the tokenizer and turn the text into sequences:

    tokenizer = Tokenizer(num_words=vocab_size, oov_token=oov_tok)
    tokenizer.fit_on_texts(training_sentences)
    
    word_index = tokenizer.word_index
    
    training_sequences = tokenizer.texts_to_sequences(training_sentences)
    training_padded = pad_sequences(training_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type)
    
    testing_sequences = tokenizer.texts_to_sequences(testing_sentences)
    testing_padded = pad_sequences(testing_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type)
    

    Build the text-embedding convolutional network:

    In the convolutional layer, the input text vectors pass through 128 convolution kernels of size 5 to extract features, and the kernel parameters are adjusted through learning to achieve the desired result

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
        # The convolutional layer extracts features with size-5 kernels whose
        # parameters are learned during training
        tf.keras.layers.Conv1D(128, 5, activation='relu'),  # 128 kernels of size 5, relu activation
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(24, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    

    Because the input is a sequence of 120 words and the convolution kernel spans 5 words, 2 words are trimmed from each end of the sequence, leaving 116 positions:

    # 120 input positions - (5 - 1) trimmed = 116 output positions
    model.summary()
    
        Model: "sequential"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding (Embedding)        (None, 120, 16)           16000     
        _________________________________________________________________
        conv1d (Conv1D)              (None, 116, 128)          10368     
        _________________________________________________________________
        global_max_pooling1d (Global (None, 128)               0         
        _________________________________________________________________
        dense (Dense)                (None, 24)                3096      
        _________________________________________________________________
        dense_1 (Dense)              (None, 1)                 25        
        =================================================================
        Total params: 29,489
        Trainable params: 29,489
        Non-trainable params: 0
        _________________________________________________________________
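
    Both numbers in the Conv1D row can be checked by hand (a quick check, assuming Keras's default 'valid' padding):

    kernel_size, in_channels, filters = 5, 16, 128
    print((kernel_size * in_channels + 1) * filters)  # 10368 parameters (weights + biases)
    print(120 - kernel_size + 1)                      # 116 output positions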
    

    Train the model for 50 epochs:

    num_epochs = 50
    training_padded = np.array(training_padded)
    training_labels = np.array(training_labels)
    testing_padded = np.array(testing_padded)
    testing_labels = np.array(testing_labels)
    
    history = model.fit(training_padded, training_labels, epochs=num_epochs, validation_data=(testing_padded, testing_labels), verbose=1)
    
        Epoch 1/50
        625/625 [==============================] - 10s 10ms/step - loss: 0.5596 - accuracy: 0.6902 - val_loss: 0.4028 - val_accuracy: 0.8137
        ...
        Epoch 50/50
        625/625 [==============================] - 6s 9ms/step - loss: 0.0217 - accuracy: 0.9907 - val_loss: 2.6126 - val_accuracy: 0.7757
    

    Plot the model's performance

    import matplotlib.pyplot as plt
    
    
    def plot_graphs(history, string):
      plt.plot(history.history[string])
      plt.plot(history.history['val_'+string])
      plt.xlabel("Epochs")
      plt.ylabel(string)
      plt.legend([string, 'val_'+string])
      plt.show()
    
    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')
    
    [accuracy and loss curves]

    The performance doesn't look great either

    Save the model for later use:

    model.save("test.h5")
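
    The saved model can be reloaded later without retraining (a minimal sketch; the headline string is a made-up example):

    reloaded = tf.keras.models.load_model("test.h5")
    sample = pad_sequences(
        tokenizer.texts_to_sequences(["scientists discover water is wet"]),  # hypothetical headline
        maxlen=max_length, padding=padding_type, truncating=trunc_type)
    print(reloaded.predict(sample))  # predicted probability that the headline is sarcastic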
    

    Release resources:

    import os, signal
    
    os.kill(os.getpid(), signal.SIGINT)
    

    III. GRU Networks

    # Multiple Layer GRU
    from __future__ import absolute_import, division, print_function, unicode_literals
    
    
    import tensorflow_datasets as tfds
    import tensorflow as tf
    print(tf.__version__)
    
        2.4.0
    

    On TF 2.x there is no need to run the following:

    # If the tf.__version__ is 1.x, please run this cell
    # !pip install tensorflow==2.0.0-beta0
    

    Get the data:

    # Get the data
    dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
    train_dataset, test_dataset = dataset['train'], dataset['test']
    

    Retrieve the pre-built tokenizer from the dataset:

    tokenizer = info.features['text'].encoder
    

    Set up the training and test datasets

    BUFFER_SIZE = 10000
    BATCH_SIZE = 64
    
    train_dataset = train_dataset.shuffle(BUFFER_SIZE)
    train_dataset = train_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(train_dataset))
    test_dataset = test_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(test_dataset))
    

    Build the GRU model

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(tokenizer.vocab_size, 64), # embedding dimension of 64
        tf.keras.layers.Bidirectional(tf.keras.layers.GRU(128)),
        tf.keras.layers.Dense(6, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    

    Inspect the network structure

    model.summary()
    
        Model: "sequential"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding (Embedding)        (None, None, 64)          523840    
        _________________________________________________________________
        bidirectional (Bidirectional (None, 256)               148992    
        _________________________________________________________________
        dense (Dense)                (None, 6)                 1542      
        _________________________________________________________________
        dense_1 (Dense)              (None, 1)                 7         
        =================================================================
        Total params: 674,381
        Trainable params: 674,381
        Non-trainable params: 0
        _________________________________________________________________
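
    The 148,992 parameters in the bidirectional GRU row can likewise be verified (a quick check, assuming TF 2's default reset_after=True, which gives each of the three gates two bias vectors):

    units, input_dim = 128, 64
    per_direction = 3 * ((input_dim + units) * units + 2 * units)
    print(per_direction)      # 74496
    print(2 * per_direction)  # 148992, matching the summary above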
    

    Compile and train the model:

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    NUM_EPOCHS = 50
    history = model.fit(train_dataset, epochs=NUM_EPOCHS, validation_data=test_dataset)
    
        Epoch 1/50
        391/391 [==============================] - 166s 392ms/step - loss: 0.6686 - accuracy: 0.5515 - val_loss: 0.7990 - val_accuracy: 0.5909
         ...
        Epoch 50/50
        391/391 [==============================] - 145s 371ms/step - loss: 1.2944e-06 - accuracy: 1.0000 - val_loss: 1.5420 - val_accuracy: 0.8560
    

    Plot the model's performance:

    import matplotlib.pyplot as plt
    
    
    def plot_graphs(history, string):
      plt.plot(history.history[string])
      plt.plot(history.history['val_'+string])
      plt.xlabel("Epochs")
      plt.ylabel(string)
      plt.legend([string, 'val_'+string])
      plt.show()
    
    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')
    
    [accuracy and loss curves]

    Release resources

    import os, signal
    
    os.kill(os.getpid(), signal.SIGINT)
    

    IV. Combining Multiple Model Types

    import json
    import tensorflow as tf
    import csv
    import random
    import numpy as np
    
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.utils import to_categorical
    from tensorflow.keras import regularizers
    
    
    embedding_dim = 100
    max_length = 16
    trunc_type='post'
    padding_type='post'
    oov_tok = "<OOV>"
    training_size=160000
    test_portion=.1
    
    corpus = []
    
    
    (omitted)
    
    
    
    model = tf.keras.Sequential([
        # Pretrained embedding matrix (built in the omitted preprocessing), frozen during training
        tf.keras.layers.Embedding(vocab_size+1, embedding_dim, input_length=max_length, weights=[embeddings_matrix], trainable=False),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Conv1D(64, 5, activation='relu'),  # convolution extracts local features
        tf.keras.layers.MaxPooling1D(pool_size=4),
        tf.keras.layers.LSTM(64),                          # recurrent layer on top of the pooled conv features
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    model.summary()
    
    num_epochs = 50
    training_sequences = np.array(training_sequences)
    training_labels = np.array(training_labels)
    test_sequences = np.array(test_sequences)
    test_labels = np.array(test_labels)
    history = model.fit(training_sequences, training_labels, epochs=num_epochs, validation_data=(test_sequences, test_labels), verbose=2)
    
    print("Training Complete")
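
    To see how the convolutional and recurrent pieces fit together, here is a quick shape walk-through with random data (a sketch assuming max_length=16 and embedding_dim=100 as set above):

    x = tf.random.normal((1, 16, 100))                      # one embedded 16-token sentence
    x = tf.keras.layers.Conv1D(64, 5, activation='relu')(x)
    print(x.shape)                                          # (1, 12, 64): 16 - 5 + 1 positions
    x = tf.keras.layers.MaxPooling1D(pool_size=4)(x)
    print(x.shape)                                          # (1, 3, 64): pooled by a factor of 4
    x = tf.keras.layers.LSTM(64)(x)
    print(x.shape)                                          # (1, 64): final LSTM state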
    

    V. Performance Comparison of Multiple Models

    # NOTE: PLEASE MAKE SURE YOU ARE RUNNING THIS IN A PYTHON3 ENVIRONMENT
    
    import tensorflow as tf
    print(tf.__version__)
    
    # This is needed for the iterator over the data
    # But not necessary if you have TF 2.0 installed
    #!pip install tensorflow==2.0.0-beta0
    
    
    # tf.enable_eager_execution()
    
    # !pip install -q tensorflow-datasets
    
        2.4.0
    

    Load the dataset

    import tensorflow_datasets as tfds
    imdb, info = tfds.load("imdb_reviews", with_info=True, as_supervised=True)
    

    Prepare the training and test data

    import numpy as np
    
    # train_data, test_data = imdb['train'], imdb['test']
    train_data, test_data = imdb['train'].take(4000), imdb['test'].take(1000)
    
    training_sentences = []
    training_labels = []
    
    testing_sentences = []
    testing_labels = []
    
    # str(s.numpy()) is needed in Python 3 instead of just s.numpy()
    for s,l in train_data:
      training_sentences.append(str(s.numpy()))
      training_labels.append(l.numpy())
      
    for s,l in test_data:
      testing_sentences.append(str(s.numpy()))
      testing_labels.append(l.numpy())
      
    training_labels_final = np.array(training_labels)
    testing_labels_final = np.array(testing_labels)
    

    Set the hyperparameters and serialize the text

    vocab_size = 10000
    embedding_dim = 16
    max_length = 120
    trunc_type='post'
    oov_tok = "<OOV>"
    
    
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    
    tokenizer = Tokenizer(num_words = vocab_size, oov_token=oov_tok)
    tokenizer.fit_on_texts(training_sentences)
    word_index = tokenizer.word_index
    sequences = tokenizer.texts_to_sequences(training_sentences)
    padded = pad_sequences(sequences,maxlen=max_length, truncating=trunc_type)
    
    testing_sequences = tokenizer.texts_to_sequences(testing_sentences)
    testing_padded = pad_sequences(testing_sequences,maxlen=max_length)
    

    Build a decoder to turn encoded sequences back into text

    reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
    
    def decode_review(text):
        return ' '.join([reverse_word_index.get(i, '?') for i in text])
    
    print(decode_review(padded[1]))
    print(training_sentences[1])
    
        ? ? ? ? ? ? ? b'i have been known to fall asleep during films but this is usually due to a combination of things including really tired being warm and comfortable on the <OOV> and having just eaten a lot however on this occasion i fell asleep because the film was rubbish the plot development was constant constantly slow and boring things seemed to happen but with no explanation of what was causing them or why i admit i may have missed part of the film but i watched the majority of it and everything just seemed to happen of its own <OOV> without any real concern for anything else i cant recommend this film at all '
        b'I have been known to fall asleep during films, but this is usually due to a combination of things including, really tired, being warm and comfortable on the sette and having just eaten a lot. However on this occasion I fell asleep because the film was rubbish. The plot development was constant. Constantly slow and boring. Things seemed to happen, but with no explanation of what was causing them or why. I admit, I may have missed part of the film, but i watched the majority of it and everything just seemed to happen of its own accord without any real concern for anything else. I cant recommend this film at all.'
        
    

    Build a single-layer GRU model and inspect its structure

    # Model Definition with GRU
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
        tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32)),
        tf.keras.layers.Dense(6, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    model.summary()
    
        Model: "sequential"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding (Embedding)        (None, 120, 16)           160000    
        _________________________________________________________________
        bidirectional (Bidirectional (None, 64)                9600      
        _________________________________________________________________
        dense (Dense)                (None, 6)                 390       
        _________________________________________________________________
        dense_1 (Dense)              (None, 1)                 7         
        =================================================================
        Total params: 169,997
        Trainable params: 169,997
        Non-trainable params: 0
        _________________________________________________________________
    

    Train the model:

    num_epochs = 50
    history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
    
    
        Epoch 1/50
        125/125 [==============================] - 16s 53ms/step - loss: 0.6933 - accuracy: 0.4849 - val_loss: 0.6931 - val_accuracy: 0.4970
       ...
        Epoch 50/50
        125/125 [==============================] - 4s 28ms/step - loss: 3.5880e-06 - accuracy: 1.0000 - val_loss: 1.9096 - val_accuracy: 0.7550
    

    Training performance

    import matplotlib.pyplot as plt
    
    
    def plot_graphs(history, string):
      plt.plot(history.history[string])
      plt.plot(history.history['val_'+string])
      plt.xlabel("Epochs")
      plt.ylabel(string)
      plt.legend([string, 'val_'+string])
      plt.show()
    
    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')
    
    [accuracy and loss curves]

    The model converges quickly, but test accuracy plateaus and test loss climbs with each epoch: the model is overfitting.
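
    A common remedy for this pattern (my own hedged sketch, not from the course) is early stopping, which halts training once validation loss stops improving:

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss', patience=3, restore_best_weights=True)
    history = model.fit(padded, training_labels_final, epochs=num_epochs,
                        validation_data=(testing_padded, testing_labels_final),
                        callbacks=[early_stop])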

    Build a single-layer LSTM model

    # Model Definition with single LSTM
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        tf.keras.layers.Dense(6, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    model.summary()
    
        Model: "sequential_1"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding_1 (Embedding)      (None, 120, 16)           160000    
        _________________________________________________________________
        bidirectional_1 (Bidirection (None, 64)                12544     
        _________________________________________________________________
        dense_2 (Dense)              (None, 6)                 390       
        _________________________________________________________________
        dense_3 (Dense)              (None, 1)                 7         
        =================================================================
        Total params: 172,941
        Trainable params: 172,941
        Non-trainable params: 0
        _________________________________________________________________ 
    

    Train the model

    num_epochs = 50
    history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
    
        Epoch 1/50
        125/125 [==============================] - 14s 49ms/step - loss: 0.6927 - accuracy: 0.5054 - val_loss: 0.6128 - val_accuracy: 0.7090
       ...
        Epoch 50/50
        125/125 [==============================] - 3s 27ms/step - loss: 4.3843e-05 - accuracy: 1.0000 - val_loss: 1.8278 - val_accuracy: 0.7590
    

    Model performance

    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')
    
    [accuracy and loss curves]

    The model seems even less robust here...

    Build a multi-layer LSTM model

    # Model Definition with multiple LSTM
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        tf.keras.layers.Dense(6, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    model.summary()
    
        Model: "sequential_4"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding_7 (Embedding)      (None, 120, 16)           160000    
        _________________________________________________________________
        bidirectional_2 (Bidirection (None, 120, 64)           12544     
        _________________________________________________________________
        bidirectional_3 (Bidirection (None, 64)                24832     
        _________________________________________________________________
        dense_8 (Dense)              (None, 6)                 390       
        _________________________________________________________________
        dense_9 (Dense)              (None, 1)                 7         
        =================================================================
        Total params: 197,773
        Trainable params: 197,773
        Non-trainable params: 0
        _________________________________________________________________
    

    Train the model

    num_epochs = 50
    history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
    
        Epoch 1/50
        125/125 [==============================] - 24s 81ms/step - loss: 0.6917 - accuracy: 0.5152 - val_loss: 0.6868 - val_accuracy: 0.5030
        ...
        Epoch 50/50
        125/125 [==============================] - 5s 43ms/step - loss: 3.4054e-06 - accuracy: 1.0000 - val_loss: 2.3198 - val_accuracy: 0.7740


    Model performance:

    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')

    [accuracy and loss curves]

    Build a single-layer convolutional model

    # Model Definition with Conv1D
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
        tf.keras.layers.Conv1D(128, 5, activation='relu'),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(6, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    model.summary()
    
    
        Model: "sequential_3"
        _________________________________________________________________
        Layer (type)                 Output Shape              Param #   
        =================================================================
        embedding_3 (Embedding)      (None, 120, 16)           160000    
        _________________________________________________________________
        conv1d_1 (Conv1D)            (None, 116, 128)          10368     
        _________________________________________________________________
        global_average_pooling1d_1 ( (None, 128)               0         
        _________________________________________________________________
        dense_6 (Dense)              (None, 6)                 774       
        _________________________________________________________________
        dense_7 (Dense)              (None, 1)                 7         
        =================================================================
        Total params: 171,149
        Trainable params: 171,149
        Non-trainable params: 0
        _________________________________________________________________
    

    Train the model

    num_epochs = 50
    history = model.fit(padded, training_labels_final, epochs=num_epochs, validation_data=(testing_padded, testing_labels_final))
    
        Epoch 1/50
        125/125 [==============================] - 6s 19ms/step - loss: 0.6895 - accuracy: 0.5145 - val_loss: 0.6109 - val_accuracy: 0.7300
       ...
        Epoch 50/50
        125/125 [==============================] - 1s 11ms/step - loss: 1.5977e-05 - accuracy: 1.0000 - val_loss: 1.4861 - val_accuracy: 0.7840
    
    plot_graphs(history, 'accuracy')
    plot_graphs(history, 'loss')

    [accuracy and loss curves]

    Release resources:

    import os, signal
    
    os.kill(os.getpid(), signal.SIGINT)
    
