TensorFlow 2.0 Recurrent Neural Network (RNN) Usage Example

Author: FredricZhu | Published: 2024-04-18 20:30

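This example uses a very simple dataset to predict a cow's milk production for the 12 months of the final year. The training set is the monthly milk production of the preceding 13 years.
The aim is to show one way of building the training and test sets for an RNN (LSTM-style) network; the model in the code below uses a GRU layer.
The dataset can be downloaded from:
https://gitlab.com/zhuge20100104/cpp_practice/-/blob/master/simple_learn/deep_learning/13_use_case_implementation_of_rnn/monthly-milk-production-pounds-p.csv?ref_type=heads
You can download it directly from that page, so there is no need to hunt around for it; I found it myself through Google.
The complete Jupyter notebook is at: https://gitlab.com/zhuge20100104/cpp_practice/-/blob/master/simple_learn/deep_learning/13_use_case_implementation_of_rnn/13.%20Use%20Case%20Implementation%20of%20RNN.ipynb?ref_type=heads

The code is as follows: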

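# In effect, each 12-month input window is used to predict a 12-month window shifted forward by one month,
# so 11 of the months overlap and only the final month is genuinely new, i.e. actually predicted.
# That predicted value is then fed back into the input data for the next prediction.
# TF v1.0-style approach

# 1. Predict a cow's monthly milk production from time-series data

# Import the libraries and read the data
# index_col = 'Month'
# uses the Month column as the index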
# import the necessary libraries
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

%matplotlib inline

# Read the dataset and print the head of it
milk = pd.read_csv('./monthly-milk-production-pounds-p.csv', index_col='Month')
milk.head()

# Visualize the data

# 3. Convert the index to time series
milk.index = pd.to_datetime(milk.index)

# 4. Plot the time series data
milk.plot()

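# train_test_split
# Use the first 13 years (156 months) of data as the training set
# and the last year as the test set

# The last year is 12 months, predicted by the recurrent network one month at a time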

# 5. Perform the train test split on the data

milk.info()


# We take the 13 years data for training
train_set = milk.head(156)

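# remaining 1 year data for testing
# (.copy() so that a 'Generated' column can be added later without a SettingWithCopyWarning)

test_set = milk.tail(12).copy()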

# Scale the data (min-max scaling, fitted on the training set only)
# 6. Scale the data using standard machine learning process
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train_set)
test_scaled = scaler.transform(test_set)
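
To make the scaling step more concrete, here is a minimal NumPy sketch of what MinMaxScaler does with its default feature_range of (0, 1): the minimum and maximum are learned from the training data only, then reused for the test data and for the inverse transform. The numbers and the helper names scale and unscale are made up for illustration and are not taken from the milk dataset or the notebook.

```python
import numpy as np

# Made-up values for illustration; not taken from the milk dataset.
train_demo = np.array([[10.0], [20.0], [30.0]])
test_demo = np.array([[25.0], [40.0]])

# "fit": learn the per-column min and max from the training data only.
data_min = train_demo.min(axis=0)
data_max = train_demo.max(axis=0)

def scale(x):
    # Hypothetical helper: map values so that the training min -> 0 and max -> 1.
    return (x - data_min) / (data_max - data_min)

def unscale(x_scaled):
    # Hypothetical helper: invert scale() to get back to the original units.
    return x_scaled * (data_max - data_min) + data_min

train_scaled_demo = scale(train_demo)  # values 0.0, 0.5, 1.0
test_scaled_demo = scale(test_demo)    # values 0.75, 1.5
print(train_scaled_demo.ravel())
print(test_scaled_demo.ravel())
print(unscale(test_scaled_demo).ravel())  # back to 25.0, 40.0
```

Because the parameters come from the training set alone, scaled test values may fall outside the range 0 to 1, as the 1.5 above shows.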


# 7. Define your customized data generator
# Note: batch_size is accepted but not used; each yielded batch holds exactly one window.
def next_batch(training_data, batch_size, steps):
    while True:
        # Grab a random starting point for each batch
        rand_start = np.random.randint(0, len(training_data) - steps)
        # Take steps + 1 consecutive values of the time series as one row
        y_batch = np.array(training_data[rand_start: rand_start + steps + 1]).reshape(1, steps + 1)
        # Input is the first `steps` values and target is the last `steps` values; they overlap in steps - 1 values
        yield y_batch[:,:-1].reshape(-1, steps, 1), y_batch[:, 1:].reshape(-1, steps, 1)
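
The windowing done by the generator can be checked in isolation. The sketch below applies the same slicing to a small made-up series with a fixed start index instead of a random one; the names series, x_demo and y_demo are illustrative only and do not appear in the notebook.

```python
import numpy as np

steps = 4  # the article uses 12
series = np.arange(10, dtype=float).reshape(-1, 1)  # stand-in for train_scaled

start = 3  # fixed here; next_batch picks this at random
window = series[start: start + steps + 1].reshape(1, steps + 1)

# Input: first `steps` values. Target: the same window shifted by one step.
x_demo = window[:, :-1].reshape(-1, steps, 1)
y_demo = window[:, 1:].reshape(-1, steps, 1)

print(x_demo.ravel())  # values 3, 4, 5, 6
print(y_demo.ravel())  # values 4, 5, 6, 7
print(x_demo.shape, y_demo.shape)  # (1, 4, 1) (1, 4, 1)
```

Only the last element of the target (7 here) is not already present in the input, which is why the prediction loop later keeps just the final time step of each output.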

# 8. Setting up the RNN model
# import tensorflow
import tensorflow as tf

num_inputs = 1
# Num of timesteps in each batch
num_time_steps = 12
# 100 neuron layer, play with this
num_neurons = 100
# Just one output, predicted time series
num_outputs = 1

# You can also try increasing iterations, but decreasing learning rate
# learning_rate you can play with this
learning_rate = 0.03
# how many iterations to go through(training steps), you can play with this
num_train_iterations = 4000
# size of the batch of data
batch_size = 1


# Define your RNN model
class MyRNN(tf.keras.Model):
    def __init__(self, hidden_size, num_outputs):
        super(MyRNN, self).__init__()
        self.rnn_cell = tf.keras.layers.GRU(units=hidden_size, return_sequences=True)
        self.projection_layer = tf.keras.layers.Dense(units=num_outputs)
    
    def call(self, inputs):
        rnn_output = self.rnn_cell(inputs)
        output = self.projection_layer(rnn_output)
        return output

model = MyRNN(num_neurons, num_outputs)
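# Note: the string 'adam' uses the Keras default learning rate; the learning_rate defined above is not applied.
# To apply it, pass tf.keras.optimizers.Adam(learning_rate=learning_rate) as the optimizer instead.
model.compile(optimizer='adam', loss='mse')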
model.fit(next_batch(train_scaled, batch_size, num_time_steps), steps_per_epoch=num_train_iterations)


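# Seed with the last 12 scaled training values as plain floats
# (flattening keeps the list homogeneous once scalar predictions are appended)
train_seed = list(train_scaled[-12:].flatten())
# Each iteration predicts from the latest 12 values: the newest prediction plus the 11 values before it
for iteration in range(12):
    x_batch = np.array(train_seed[-num_time_steps:]).reshape(1, num_time_steps, 1)
    y_pred = model.predict(x_batch)
    # The last value of the predicted sequence
    print(y_pred[0, -1, 0])
    # Append it to train_seed so that it takes part in the next prediction
    train_seed.append(y_pred[0, -1, 0])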
    

train_seed[-12:], train_seed[12:]
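
The feedback loop can also be sketched without a trained model. In the example below, fake_predict is a hypothetical stand-in for model.predict that simply adds 1 to every time step; it is an assumption made for illustration, chosen only so that the rolling-window mechanics are easy to follow by hand.

```python
import numpy as np

window_size = 3  # the article uses 12

def fake_predict(x_batch):
    # Hypothetical stand-in for model.predict: "predicts" input + 1 at every time step.
    # The output shape equals the input shape, (1, window_size, 1), which matches
    # a recurrent layer with return_sequences=True followed by Dense(1).
    return x_batch + 1.0

history = [1.0, 2.0, 3.0]  # seed values, like the last window of the training set
for _ in range(4):
    # Always predict from the most recent window_size values.
    x_batch = np.array(history[-window_size:]).reshape(1, window_size, 1)
    y_pred = fake_predict(x_batch)
    # Keep only the last time step and feed it back as new input.
    history.append(float(y_pred[0, -1, 0]))

forecast = history[window_size:]
print(forecast)  # [4.0, 5.0, 6.0, 7.0]
```

After the seed values are sliced off, only predictions remain, which is what train_seed[12:] does in the article's code.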

# 17. Reshape the results and map them back to the original scale
results = scaler.inverse_transform(np.array(train_seed[12:]).reshape(12, 1))

test_set['Generated'] = results


# Inspect the final test_set DataFrame
test_set

# Plot the predicted result and actual result
test_set.plot()

The final prediction is shown below; the overall trend is still correct.

[Figure: image.png]