BERT

Author: JasonWayne | Published: 2019-06-13 21:00

    Structure
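
    BERT is a stack of Transformer encoder layers (modeling.py in the
    reference repo). For concreteness, here are the key BERT-Base
    hyperparameters as they appear in the repo's bert_config.json; the dict
    below is an illustrative sketch, not the file verbatim:

        # BERT-Base settings (uncased English model); illustrative sketch of
        # the fields in the repo's bert_config.json
        bert_base_config = {
            "hidden_size": 768,              # H: width of every encoder layer
            "num_hidden_layers": 12,         # L: number of Transformer blocks
            "num_attention_heads": 12,       # A: heads per self-attention layer
            "intermediate_size": 3072,       # feed-forward inner width (4 * H)
            "vocab_size": 30522,             # WordPiece vocabulary size
            "max_position_embeddings": 512,  # longest supported sequence
        }

    BERT-Large scales these to 24 layers, hidden size 1024, and 16 attention
    heads.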

    Loss

    Pretraining is actually multi-task training: the masked-LM loss and the next-sentence-prediction loss are computed separately and summed, as the snippet below shows.

        # run_pretraining.py:114
        (masked_lm_loss,
         masked_lm_example_loss, masked_lm_log_probs) = get_masked_lm_output(
             bert_config, model.get_sequence_output(), model.get_embedding_table(),
             masked_lm_positions, masked_lm_ids, masked_lm_weights)
    
        (next_sentence_loss, next_sentence_example_loss,
         next_sentence_log_probs) = get_next_sentence_output(
             bert_config, model.get_pooled_output(), next_sentence_labels)
    
        total_loss = masked_lm_loss + next_sentence_loss
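
    The two terms come from the repo's get_masked_lm_output and
    get_next_sentence_output. Below is a minimal sketch of what each head
    computes, in the TF1 style the repo uses; function names and shapes here
    are illustrative, not verbatim. The masked-LM head scores only the masked
    positions against the whole vocabulary, weighting by masked_lm_weights so
    padded prediction slots contribute nothing; the next-sentence head is a
    plain two-way classifier on the pooled [CLS] vector.

        import tensorflow as tf  # TF1-style API, as in the original repo

        def masked_lm_loss_sketch(log_probs, label_ids, label_weights, vocab_size):
            # log_probs: [batch * max_predictions, vocab_size], log-softmax scores
            # label_weights: 1.0 for real masked positions, 0.0 for padding slots
            one_hot_labels = tf.one_hot(
                tf.reshape(label_ids, [-1]), depth=vocab_size, dtype=tf.float32)
            per_example_loss = -tf.reduce_sum(log_probs * one_hot_labels, axis=-1)
            weights = tf.reshape(label_weights, [-1])
            numerator = tf.reduce_sum(weights * per_example_loss)
            denominator = tf.reduce_sum(weights) + 1e-5  # guard against 0/0
            return numerator / denominator

        def next_sentence_loss_sketch(pooled_output, labels, hidden_size):
            # pooled_output: [batch, hidden_size], the transformed [CLS] vector
            # labels: [batch]; 0 = actual next sentence, 1 = random sentence
            output_weights = tf.get_variable(
                "nsp_weights", shape=[2, hidden_size],
                initializer=tf.truncated_normal_initializer(stddev=0.02))
            output_bias = tf.get_variable(
                "nsp_bias", shape=[2], initializer=tf.zeros_initializer())
            logits = tf.nn.bias_add(
                tf.matmul(pooled_output, output_weights, transpose_b=True),
                output_bias)
            log_probs = tf.nn.log_softmax(logits, axis=-1)
            one_hot_labels = tf.one_hot(
                tf.reshape(labels, [-1]), depth=2, dtype=tf.float32)
            per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
            return tf.reduce_mean(per_example_loss)

    Note the asymmetry: the MLM loss averages only over real masked positions
    (hence the weight-based denominator), while the NSP loss averages over the
    whole batch.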
    

    Optimizer
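
    The reference implementation (optimization.py) trains with Adam plus
    decoupled L2 weight decay (AdamWeightDecayOptimizer), and a learning rate
    that warms up linearly and then decays linearly to zero. A minimal sketch
    of that schedule; the defaults mirror the paper's pretraining setup and
    are illustrative:

        def bert_learning_rate(step, init_lr=1e-4,
                               num_train_steps=1000000, num_warmup_steps=10000):
            """Linear warmup followed by linear decay to zero."""
            if step < num_warmup_steps:
                # warmup: scale the peak LR by the fraction of warmup completed
                return init_lr * step / num_warmup_steps
            # after warmup: equivalent to tf.train.polynomial_decay with
            # power=1.0 and end_learning_rate=0.0, a straight line down to zero
            step = min(step, num_train_steps)
            return init_lr * (1.0 - step / num_train_steps)

    In the repo the weight decay is skipped for LayerNorm parameters and
    biases (exclude_from_weight_decay).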

    Ideas

    Tricks

    Miscellany

    Related Articles
