Reproducing the Time-Series Prediction Models from Two Papers

Author: Cingti | Published: 2019-10-08 20:06

Tuesday, 08 October 2019, 03:27 PM
Note: the implementation code and the paper PDFs are all under my git account; discussion is welcome.

Paper titles:

Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks
Machine health monitoring using local feature-based gated recurrent unit networks

1. Basic framework of the first paper

fig_1.png
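Main content of the paper: the proposed model is used for fault diagnosis (it can handle both classification and regression).
The framework has four main parts:
1. Raw data (possibly from several sensors) is passed through a sliding window to extract time- and frequency-domain features. With m sensors, k windows, and n features extracted per sensor, the input after feature extraction has shape [-1, k, m*n], where -1 stands for the batch size.
Note: this part does not appear in the block diagram or in the code, but it is evident from the paper. Readers who want to reuse this model need to implement this step themselves. If feeding raw data directly (without feature extraction) already gives good results, this step can be omitted.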
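Since this step is left to the reader, the following is a minimal sketch of what sliding-window feature extraction could look like. It is not the papers' feature set: the four time-domain statistics, the non-overlapping windows, and the names `window_features` and `extract_features` are assumptions made for illustration, and frequency-domain features are left out entirely.

```python
import math


def window_features(segment):
    """Return a few time-domain statistics for one window of one sensor."""
    count = len(segment)
    mean = sum(segment) / count
    std = math.sqrt(sum((x - mean) ** 2 for x in segment) / count)
    rms = math.sqrt(sum(x * x for x in segment) / count)
    peak = max(abs(x) for x in segment)
    return [mean, std, rms, peak]  # n = 4 features per sensor in this sketch


def extract_features(signals, window_len):
    """signals: list of m equal-length sensor sequences.

    Returns a nested list with k rows (one per non-overlapping window)
    and m*n columns (the per-sensor features laid side by side).
    """
    num_windows = len(signals[0]) // window_len
    rows = []
    for w in range(num_windows):
        start = w * window_len
        row = []
        for sensor in signals:
            row.extend(window_features(sensor[start:start + window_len]))
        rows.append(row)
    return rows


# Two hypothetical sensors with 12 samples each, windows of 4 samples: k = 3, m*n = 8
signals = [[float(i) for i in range(12)], [1.0] * 12]
features = extract_features(signals, window_len=4)
```

Stacking one such k x (m*n) block per sample gives the [-1, k, m*n] input described above.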
2. The convolutional part extracts spatial features while preserving the temporal information. The code is as follows:

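    @staticmethod
    def cnn_layer(cnn_input=None, k=None, m=None, s=None, d=None):
        # Here k is the number of filters, m the filter height (time steps),
        # d the filter width and s the pooling size; these letters are not the
        # k, m used for windows and sensors in part 1.
        # The 4-element ksize passed to max_pool below means cnn_input must be 4-D.
        cnn1 = tf.contrib.layers.conv2d(cnn_input,
                                        num_outputs=k,
                                        kernel_size=[m, d],
                                        stride=[1, d],
                                        padding='VALID')
        # pool along the time axis only
        cnn1_pool = tf.nn.max_pool(cnn1,
                                   ksize=[1, s, 1, 1],
                                   strides=[1, s, 1, 1],
                                   padding='SAME',
                                   name='cnn1_max_pool')
        cnn1_shape = cnn1_pool.get_shape()
        # reshape to [batch, time, channels]; the width axis is 1 when the
        # filter spans the full input width
        cnn_out = tf.reshape(cnn1_pool, shape=[-1, cnn1_shape[1], cnn1_shape[-1]])
        return cnn_out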

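Figure 3 can be read alongside the code. Pay attention to the kernel_size of the convolutional layer and the ksize of the pooling layer; d is the length of the data, i.e. the value of m*n from part 1: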

2.png
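To make the kernel_size and ksize settings concrete, the shape arithmetic of the convolution and pooling steps can be sketched in plain Python. This is a sketch under assumptions: the input is taken to be [batch, time_steps, d, 1] so that the filter spans the whole width, and `cnn_output_shape` is a hypothetical helper, not part of the original code.

```python
import math


def cnn_output_shape(time_steps, d, m, k, s):
    """Shape after conv ('VALID') and max-pool ('SAME'), batch dimension omitted.

    Assumes the input width equals d, so a [m, d] kernel with stride [1, d]
    leaves a width of 1.
    """
    # 'VALID' convolution with stride 1 along time: no padding is added
    conv_h = time_steps - m + 1
    # 'SAME' pooling with window s and stride s along time rounds up
    pool_h = math.ceil(conv_h / s)
    # the width-1 axis disappears in the final reshape, leaving [time, channels]
    return (pool_h, k)


shape = cnn_output_shape(time_steps=20, d=12, m=3, k=8, s=2)
```

With these hypothetical values, 20 time steps shrink to 18 after the convolution and to 9 after pooling, so each sample reaches the next layer as a 9 x 8 sequence.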

3. Two stacked bidirectional LSTM layers, used mainly to extract temporal information from the CNN output. The code is as follows:

    @staticmethod
    def bilstm_layer(bilstm_input=None, num_units=None):
        # first bi-lstm cell
        with tf.variable_scope('1st-bi-lstm-layer', reuse=tf.AUTO_REUSE):
            cell_fw_1 = tf.nn.rnn_cell.LSTMCell(num_units=num_units[0], state_is_tuple=True)
            cell_bw_1 = tf.nn.rnn_cell.LSTMCell(num_units=num_units[0], state_is_tuple=True)
            outputs_1, states_1 = tf.nn.bidirectional_dynamic_rnn(cell_fw_1, cell_bw_1, inputs=bilstm_input,
                                                                  dtype=tf.float32)

        # second bi-lstm cell
        with tf.variable_scope('2nd-bi-lstm-layer', reuse=tf.AUTO_REUSE):
            # input_2 = tf.add(outputs_1[0], outputs_1[1])
            input_2 = tf.concat([outputs_1[0], outputs_1[1]], axis=2)
            cell_fw_2 = tf.nn.rnn_cell.LSTMCell(num_units=num_units[1], state_is_tuple=True)
            cell_bw_2 = tf.nn.rnn_cell.LSTMCell(num_units=num_units[1], state_is_tuple=True)
            outputs_2, states_2 = tf.nn.bidirectional_dynamic_rnn(cell_fw_2, cell_bw_2, inputs=input_2,
                                                                  dtype=tf.float32)

        # bilstm output
        with tf.variable_scope('bi-lstm-layer-output', reuse=tf.AUTO_REUSE):
            bilstm_out = tf.concat([states_2[0].h, states_2[1].h], axis=1)
        return bilstm_out

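Figure c can be read alongside the code. Note how the two bidirectional LSTM layers are joined (this part was also implemented following the original paper; if you spot a problem, please get in touch to discuss):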

3.png
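The way the two layers are joined is easiest to follow through the tensor shapes. The helper below is a hypothetical bookkeeping sketch (it runs no LSTM); it only mirrors the two `tf.concat` calls in `bilstm_layer`, assuming an input of shape [batch, time_steps, features].

```python
def bilstm_stack_shapes(batch, time_steps, num_units):
    """Trace the shapes produced by the two concatenations in bilstm_layer."""
    u1, u2 = num_units
    # layer 1: forward and backward outputs, each [batch, time, u1],
    # are concatenated on axis 2 and fed to layer 2
    layer2_input = (batch, time_steps, 2 * u1)
    # layer 2: only the final hidden states h of both directions are kept,
    # concatenated on axis 1
    bilstm_out = (batch, 2 * u2)
    return layer2_input, bilstm_out


layer2_input, bilstm_out = bilstm_stack_shapes(32, 9, (64, 32))
```

The time axis is therefore kept between the two layers and dropped only at the output, which is why the fully connected part receives a single vector per sample.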

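4. Fully connected layers produce the final output. This part is relatively simple: the hidden features that the previous layer outputs at the last time step are taken as input to obtain the final result. The code is as follows: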

    @staticmethod
    def fc_layer(fc_input=None, num_units=None, keep_prob=None):
        fc_input_ = tf.nn.dropout(fc_input, keep_prob=keep_prob)
        fc1 = tf.layers.dense(fc_input_, num_units[0], activation=tf.nn.relu,
                              kernel_initializer=tf.glorot_uniform_initializer())
        fc1_ = tf.nn.dropout(fc1, keep_prob=keep_prob)
        fc_out = tf.layers.dense(fc1_, num_units[1], activation=tf.nn.relu,
                                 kernel_initializer=tf.glorot_uniform_initializer())
        # fc_out = tf.layers.dense(fc_out, 1, activation=None, use_bias=False,
        #                          kernel_initializer=tf.glorot_normal_initializer())
        return fc_out

For the full implementation, please refer to my git.

2. Basic framework of the second paper

4.png

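Main content of the paper: the proposed model is used for fault diagnosis (it can handle both classification and regression).
The framework has four main parts:
1. Raw data (possibly from several sensors) is passed through a sliding window to extract time- and frequency-domain features. With m sensors, k windows, and n features extracted per sensor, the input after feature extraction has shape [-1, k, m*n], where -1 stands for the batch size. (This is the same as described for paper 1, and it can also be seen from the figure.) Readers likewise need to implement this part themselves.
2. A bidirectional GRU extracts the temporal features. This part is also relatively simple; the code is as follows: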

    @staticmethod
    def bigru_layer(bilstm_input=None, num_units=None):
        cell_fw = tf.nn.rnn_cell.GRUCell(num_units=num_units, name='fw')
        cell_bw = tf.nn.rnn_cell.GRUCell(num_units=num_units, name='bw')
        outputs, states = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, inputs=bilstm_input, dtype=tf.float32)
        bigru_out = tf.concat([states[0], states[1]], axis=1)
        return bigru_out

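3. The weighted-averaging part. This part is implemented mainly through an FC layer, but the data fed to the GRU has to be processed first; see the formula in the original paper and the code. Because the formula is long, only the code is pasted here. The main idea is to average the different windows (time steps) with exponential-like weights. Reference code: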

    @staticmethod
    def get_weight_average_layer(weight_average_input=None):
        # weight_average_input: array-like of shape [batch, T, features];
        # requires `import numpy as np` at module level
        _arr_weight_average_input = np.array(weight_average_input)
        _, T, _ = _arr_weight_average_input.shape
        # q_k = exp(min(k - 1, T - k)) for k = 1..T depends only on T,
        # so it is computed once instead of once per sample
        qk = np.array([np.exp(np.min([k - 1, T - k])) for k in range(1, T+1)])
        # normalize so that the weights sum to 1
        wk = qk / np.sum(qk, dtype=np.float32)
        _arr = []
        for ck in _arr_weight_average_input:    # every sample in the batch
            c = np.array([wk[k]*ck[k] for k in range(T)]).sum(axis=0)
            _arr.append(c)
        return np.array(_arr)
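Read off from the code above, the weights are q_k = exp(min(k - 1, T - k)) for k = 1..T, normalized to w_k = q_k / sum_j q_j, and the output is c = sum_k w_k * c_k. The sketch below reproduces this for a single sample in plain Python so the weights can be inspected; `position_weights` and `weighted_average` are hypothetical names, and the formula is taken from the code rather than checked against the paper.

```python
import math


def position_weights(T):
    """Normalized weights w_k for k = 1..T; the middle windows weigh the most."""
    q = [math.exp(min(k - 1, T - k)) for k in range(1, T + 1)]
    total = sum(q)
    return [qk / total for qk in q]


def weighted_average(windows):
    """windows: T feature vectors of equal length belonging to one sample."""
    w = position_weights(len(windows))
    dim = len(windows[0])
    return [sum(w[t] * windows[t][j] for t in range(len(windows)))
            for j in range(dim)]


weights = position_weights(5)
avg = weighted_average([[1.0, 0.0]] * 5)
```

For T = 5 the unnormalized values are 1, e, e^2, e, 1, so the middle window carries roughly half of the total weight and the two ends carry the least.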

4. This part is relatively simple: the results of parts 2 and 3 are concatenated and then passed through one FC layer. The code is as follows:

    @staticmethod
    def fc_layer_2(fc_input=None, num_units=None, keep_prob=None):
        fc_input_ = tf.nn.dropout(fc_input, keep_prob=keep_prob)
        fc_out = tf.layers.dense(fc_input_, num_units, activation=tf.nn.relu,
                                 kernel_initializer=tf.glorot_uniform_initializer())
        # fc_out = tf.nn.dropout(fc, keep_prob=keep_prob)
        return fc_out
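The concatenation itself is not shown in the snippet above. As a rough illustration, joining the two per-sample vectors along the feature axis could look like this in plain Python; `concat_features` and the toy values are assumptions, and the actual code on git may do this differently.

```python
def concat_features(gru_out, avg_out):
    """Join the BiGRU output and the weighted average along the feature axis."""
    if len(gru_out) != len(avg_out):
        raise ValueError("both inputs must have the same batch size")
    return [list(g) + list(a) for g, a in zip(gru_out, avg_out)]


# batch of 2: BiGRU output has 2 * num_units = 4 values, the average has 3 features
gru_out = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
avg_out = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
fc_input = concat_features(gru_out, avg_out)
```

Each row of `fc_input` then has 7 values, which is what the FC layer would receive per sample in this toy setting.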

For the full implementation, please refer to my git.

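Notes:

If you find any problem in the code, feel free to contact me to discuss it;
The code requires tensorflow-gpu (>=1.10.0) to run;
git contains the reference papers and part of the data used in the experiments; readers can run the main_test file to check the models;
The data in the data folder on git is a regression problem; as the program runs, the MSE settles at around 100;
The meaning of every parameter in the code is documented in the class.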

Article link: https://www.haomeiwen.com/subject/mojipctx.html
