[tf]LSTM

Author: VanJordan | Published 2018-12-16 21:59

Creating a simple LSTM

    In TensorFlow, a complete LSTM cell can be created with a single line:

    lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
    

    Use the zero_state function to initialize the LSTM's state to all zeros:

    state = lstm.zero_state(batch_size, tf.float32)
    for i in range(num_steps):
        # calling the cell directly advances it by exactly one time step
        output, state = lstm(current_input, state)
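
    For context, a minimal end-to-end sketch under TensorFlow 1.x might look as follows; names such as lstm_hidden_size, batch_size, num_steps, and input_dim are illustrative assumptions, not values from the original snippet.

    import tensorflow as tf

    lstm_hidden_size = 128   # assumed hidden size
    batch_size = 32          # assumed batch size
    num_steps = 10           # assumed number of unrolled time steps
    input_dim = 50           # assumed per-step feature size

    # one placeholder holding the whole sequence: [batch_size, num_steps, input_dim]
    inputs = tf.placeholder(tf.float32, [batch_size, num_steps, input_dim])

    lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
    state = lstm.zero_state(batch_size, tf.float32)

    outputs = []
    for i in range(num_steps):
        # each call consumes the features of time step i and returns the new state
        output, state = lstm(inputs[:, i, :], state)
        outputs.append(output)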
    

    Creating a multi-layer LSTM

    To build a deep recurrent network, stack cells with MultiRNNCell; the stacked cell can likewise be initialized with zero_state.

    # build a fresh BasicLSTMCell per layer (cells must not be shared across layers)
    def lstm_cell(lstm_size):
        return tf.nn.rnn_cell.BasicLSTMCell(lstm_size)
    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
        [lstm_cell(lstm_size) for _ in range(number_of_layers)])
    state = stacked_lstm.zero_state(batch_size, tf.float32)
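
    The stacked cell is driven exactly like a single cell. Note that state is now a tuple holding one LSTMStateTuple per layer. A brief sketch, reusing the assumed names from the earlier example:

    state = stacked_lstm.zero_state(batch_size, tf.float32)
    for i in range(num_steps):
        # output comes from the top layer; per-layer states are kept inside `state`
        output, state = stacked_lstm(inputs[:, i, :], state)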
    

    Using Dropout in an LSTM

    tf.nn.rnn_cell.DropoutWrapper wraps a cell and applies dropout to its inputs, outputs, and/or state:

    tf.nn.rnn_cell.DropoutWrapper(
        cell,
        input_keep_prob=1.0,
        output_keep_prob=1.0,
        state_keep_prob=1.0,
        variational_recurrent=False,
        input_size=None,
        dtype=None,
        seed=None,
        dropout_state_filter_visitor=None
    )
    
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell
    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
        [tf.nn.rnn_cell.DropoutWrapper(lstm_cell(lstm_size))
         for _ in range(number_of_layers)])
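
    In practice the keep probabilities are usually fed so that dropout is active only during training; a hedged sketch (keep_prob and dropped_cell are assumed names, not from the original):

    # defaults to 1.0 (no dropout) unless a smaller value is fed during training
    keep_prob = tf.placeholder_with_default(1.0, shape=[])

    def dropped_cell(lstm_size):
        cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_size)
        # dropout on each layer's output; input/state dropout are set the same way
        return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
        [dropped_cell(lstm_size) for _ in range(number_of_layers)])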
    

    BiLSTM

    tf.nn.bidirectional_dynamic_rnn runs a forward and a backward dynamic RNN over the same inputs:

    tf.nn.bidirectional_dynamic_rnn(
        cell_fw,
        cell_bw,
        inputs,
        sequence_length=None,
        initial_state_fw=None,
        initial_state_bw=None,
        dtype=None,
        parallel_iterations=None,
        swap_memory=False,
        time_major=False,
        scope=None
    )
    

    Output (outputs, output_states):

    • outputs: all of the outputs across the time_steps. It is a tuple (output_fw, output_bw) containing the forward and backward results, each with shape [batch_size, max_time, cell_fw.output_size]. It returns a tuple instead of a single concatenated Tensor; if the concatenated form is preferred, the forward and backward outputs can be joined with tf.concat(outputs, 2), as in the sketch after this list.
    • output_states: a tuple (output_state_fw, output_state_bw) containing the forward and backward final states.
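
    A usage sketch under the same assumptions as above, where inputs has shape [batch_size, max_time, depth] and seq_len is an assumed [batch_size] tensor of true sequence lengths:

    cell_fw = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
    cell_bw = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)

    (output_fw, output_bw), (state_fw, state_bw) = tf.nn.bidirectional_dynamic_rnn(
        cell_fw, cell_bw, inputs,
        sequence_length=seq_len,
        dtype=tf.float32)

    # concatenated result: [batch_size, max_time, 2 * lstm_hidden_size]
    bi_outputs = tf.concat([output_fw, output_bw], 2)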

    dynamic_rnn

    • With dynamic_rnn, the maximum sequence length does not have to be the same across batches: the first batch can be 2 * 4 while the second is 2 * 7, and at training time dynamic_rnn unrolls dynamically to the number of steps each batch needs, which is why it is called "dynamic".
    • Note that although the LSTM above can take a whole batch as input, each call to the cell only advances one time step, like LSTMCell in PyTorch. The equivalent of PyTorch's LSTM is tf.nn.dynamic_rnn:
    tf.nn.dynamic_rnn(
        cell,
        inputs,
        sequence_length=None,
        initial_state=None,
        dtype=None,
        parallel_iterations=None,
        swap_memory=False,
        time_major=False,
        scope=None
    )
    

    Input arguments

    • cell: an RNNCell instance
    • inputs: the RNN input sequence
    • initial_state: the RNN's initial state. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
    • sequence_length: shape [batch_size], where each value is that example's sequence length (i.e. its number of time steps), e.g. sequence_length=tf.fill([batch_size], time_steps)
    • time_major: defaults to False, in which case the input and output tensors have shape [batch_size, max_time, depth]; when True, it avoids transposes at the beginning and end of the RNN calculation, and the input and output tensors have shape [max_time, batch_size, depth]
    • scope: VariableScope for the created subgraph; defaults to "rnn".

    Outputs:

    • outputs: all of the outputs across the time_steps, with shape [batch_size, max_time, cell.output_size]
    • state: the final step's hidden state, with shape [batch_size, cell.state_size]; see the sketch below
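
    A minimal sketch of the call, with inputs of shape [batch_size, max_time, depth] and an assumed seq_len tensor of per-example lengths:

    cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)

    outputs, state = tf.nn.dynamic_rnn(
        cell, inputs,
        sequence_length=seq_len,
        dtype=tf.float32)

    # outputs: [batch_size, max_time, lstm_hidden_size]
    # state: for an LSTM cell this is an LSTMStateTuple(c, h),
    #        each part of shape [batch_size, lstm_hidden_size]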
